[Intel-xe] [PATCH v3 0/6] Some small-bar prep patches

Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed

* [Intel-xe] [PATCH v3 0/6] Some small-bar prep patches
@ 2023-03-08 12:30 Matthew Auld
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 1/6] drm/xe: add xe_ttm_stolen_cpu_access_needs_ggtt() Matthew Auld
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Matthew Auld @ 2023-03-08 12:30 UTC (permalink / raw)
  To: intel-xe

Split from the small-bar series. Plus one extra patch to fix the overloaded
meaning behind xe_ttm_stolen_cpu_inaccessible(), as pointed out by Maarten.

v2: Apply s/lmem/vram/ for the entire driver
v3: Try to simplfy the stolen stuff by removing cpu_inaccessible()

-- 
2.39.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Intel-xe] [PATCH v3 1/6] drm/xe: add xe_ttm_stolen_cpu_access_needs_ggtt()
  2023-03-08 12:30 [Intel-xe] [PATCH v3 0/6] Some small-bar prep patches Matthew Auld
@ 2023-03-08 12:30 ` Matthew Auld
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 2/6] drm/xe: s/lmem/vram/ Matthew Auld
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Matthew Auld @ 2023-03-08 12:30 UTC (permalink / raw)
  To: intel-xe

xe_ttm_stolen_cpu_inaccessible() was originally meant to just cover the
case where stolen is not directly CPU accessible on some older
integrated platforms, and as such a GGTT mapping was also required for
CPU access (as per the check in xe_bo_create_pin_map_at()).

However with small-bar systems on dgfx we have one more case where
stolen is also inaccessible, however here we don't have any fallback
GGTT mode for CPU access. Fix the check in xe_bo_create_pin_map_at() to
make this distinction clear. In such a case the later vmap() will fail
anyway.

v2: fix kernel-doc warning
v3: Simplify further and remove cpu_inaccessible()

Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c             |  2 +-
 drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c | 39 ++++++++------------------
 drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h |  2 +-
 3 files changed, 14 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 2bfd3f6f2e9a..876f77669104 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1151,7 +1151,7 @@ struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_gt *gt,
 	u64 end = offset == ~0ull ? offset : start + size;
 
 	if (flags & XE_BO_CREATE_STOLEN_BIT &&
-	    xe_ttm_stolen_cpu_inaccessible(xe))
+	    xe_ttm_stolen_cpu_access_needs_ggtt(xe))
 		flags |= XE_BO_CREATE_GGTT_BIT;
 
 	bo = xe_bo_create_locked_range(xe, gt, vm, size, start, end, type, flags);
diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
index 2e8d07ad42ae..0ad0f1820f4b 100644
--- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
@@ -38,32 +38,17 @@ to_stolen_mgr(struct ttm_resource_manager *man)
 }
 
 /**
- * xe_ttm_stolen_cpu_inaccessible - Can we directly CPU access stolen memory for
- * this device.
+ * xe_ttm_stolen_cpu_access_needs_ggtt - If we can't directly CPU access stolen,
+ * can we then fallback to mapping through the GGTT.
  * @xe: xe device
  *
- * On some integrated platforms we can't directly access stolen via the CPU
- * (like some normal system memory).  Also on small-bar systems for discrete,
- * since stolen is always as the end of normal VRAM, and the BAR likely doesn't
- * stretch that far. However CPU access of stolen is generally rare, and at
- * least on discrete should not be needed.
- *
- * If this is indeed inaccessible then we fallback to using the GGTT mappable
- * aperture for CPU access. On discrete platforms we have no such thing, so when
- * later attempting to CPU map the memory an error is instead thrown.
+ * Some older integrated platforms don't support reliable CPU access for stolen,
+ * however on such hardware we can always use the mappable part of the GGTT for
+ * CPU access. Check if that's the case for this device.
  */
-bool xe_ttm_stolen_cpu_inaccessible(struct xe_device *xe)
+bool xe_ttm_stolen_cpu_access_needs_ggtt(struct xe_device *xe)
 {
-	struct ttm_resource_manager *ttm_mgr =
-		ttm_manager_type(&xe->ttm, XE_PL_STOLEN);
-	struct xe_ttm_stolen_mgr *mgr;
-
-	if (!ttm_mgr)
-		return true;
-
-	mgr = to_stolen_mgr(ttm_mgr);
-
-	return !mgr->io_base || GRAPHICS_VERx100(xe) < 1270;
+	return GRAPHICS_VERx100(xe) < 1270 && !IS_DGFX(xe);
 }
 
 static s64 detect_bar2_dgfx(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr)
@@ -172,7 +157,7 @@ void xe_ttm_stolen_mgr_init(struct xe_device *xe)
 	drm_dbg_kms(&xe->drm, "Initialized stolen memory support with %llu bytes\n",
 		    stolen_size);
 
-	if (!xe_ttm_stolen_cpu_inaccessible(xe))
+	if (mgr->io_base && !xe_ttm_stolen_cpu_access_needs_ggtt(xe))
 		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, stolen_size);
 }
 
@@ -185,7 +170,7 @@ u64 xe_ttm_stolen_io_offset(struct xe_bo *bo, u32 offset)
 
 	XE_BUG_ON(!mgr->io_base);
 
-	if (!IS_DGFX(xe) && xe_ttm_stolen_cpu_inaccessible(xe))
+	if (xe_ttm_stolen_cpu_access_needs_ggtt(xe))
 		return mgr->io_base + xe_bo_ggtt_addr(bo) + offset;
 
 	xe_res_first(bo->ttm.resource, offset, 4096, &cur);
@@ -251,10 +236,10 @@ int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct ttm_resource *mem)
 	if (!mgr || !mgr->io_base)
 		return -EIO;
 
-	if (!xe_ttm_stolen_cpu_inaccessible(xe))
-		return __xe_ttm_stolen_io_mem_reserve_bar2(xe, mgr, mem);
-	else
+	if (xe_ttm_stolen_cpu_access_needs_ggtt(xe))
 		return __xe_ttm_stolen_io_mem_reserve_stolen(xe, mgr, mem);
+	else
+		return __xe_ttm_stolen_io_mem_reserve_bar2(xe, mgr, mem);
 }
 
 u64 xe_ttm_stolen_gpu_offset(struct xe_device *xe)
diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
index 2fda97b97a05..1777245ff810 100644
--- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
+++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
@@ -14,7 +14,7 @@ struct xe_device;
 
 void xe_ttm_stolen_mgr_init(struct xe_device *xe);
 int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct ttm_resource *mem);
-bool xe_ttm_stolen_cpu_inaccessible(struct xe_device *xe);
+bool xe_ttm_stolen_cpu_access_needs_ggtt(struct xe_device *xe);
 u64 xe_ttm_stolen_io_offset(struct xe_bo *bo, u32 offset);
 u64 xe_ttm_stolen_gpu_offset(struct xe_device *xe);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Intel-xe] [PATCH v3 2/6] drm/xe: s/lmem/vram/
  2023-03-08 12:30 [Intel-xe] [PATCH v3 0/6] Some small-bar prep patches Matthew Auld
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 1/6] drm/xe: add xe_ttm_stolen_cpu_access_needs_ggtt() Matthew Auld
@ 2023-03-08 12:30 ` Matthew Auld
  2023-03-08 13:37   ` Rodrigo Vivi
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 3/6] drm/xe/vram: start tracking the io_size Matthew Auld
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Matthew Auld @ 2023-03-08 12:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi

This seems to be the preferred nomenclature in xe. Currently we are
intermixing vram and lmem, which is confusing.

v2 (Gwan-gyeong Mun & Lucas):
  - Rather apply to the entire driver

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/Kconfig.debug      |  2 +-
 drivers/gpu/drm/xe/tests/xe_migrate.c |  2 +-
 drivers/gpu/drm/xe/xe_bo.c            | 10 +++----
 drivers/gpu/drm/xe/xe_bo.h            |  6 ++--
 drivers/gpu/drm/xe/xe_ggtt.c          |  6 ++--
 drivers/gpu/drm/xe/xe_migrate.c       | 12 ++++----
 drivers/gpu/drm/xe/xe_mmio.c          | 40 +++++++++++++--------------
 drivers/gpu/drm/xe/xe_module.c        |  6 ++--
 drivers/gpu/drm/xe/xe_module.h        |  2 +-
 drivers/gpu/drm/xe/xe_pt.c            | 12 ++++----
 drivers/gpu/drm/xe/xe_vm.c            | 10 +++----
 11 files changed, 54 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/xe/Kconfig.debug b/drivers/gpu/drm/xe/Kconfig.debug
index 565be3f6b9b9..93b284cdd0a2 100644
--- a/drivers/gpu/drm/xe/Kconfig.debug
+++ b/drivers/gpu/drm/xe/Kconfig.debug
@@ -41,7 +41,7 @@ config DRM_XE_DEBUG_VM
 	  If in doubt, say "N".
 
 config DRM_XE_DEBUG_MEM
-	bool "Enable passing SYS/LMEM addresses to user space"
+	bool "Enable passing SYS/VRAM addresses to user space"
 	default n
 	help
 	  Pass object location trough uapi. Intended for extended
diff --git a/drivers/gpu/drm/xe/tests/xe_migrate.c b/drivers/gpu/drm/xe/tests/xe_migrate.c
index b7e4a126e8b7..ac659b94e7f5 100644
--- a/drivers/gpu/drm/xe/tests/xe_migrate.c
+++ b/drivers/gpu/drm/xe/tests/xe_migrate.c
@@ -129,7 +129,7 @@ static void test_copy(struct xe_migrate *m, struct xe_bo *bo,
 	}
 	dma_fence_put(fence);
 
-	/* Try to copy 0xc0 from sysmem to lmem with 2MB or 64KiB/4KiB pages */
+	/* Try to copy 0xc0 from sysmem to vram with 2MB or 64KiB/4KiB pages */
 	xe_map_memset(xe, &sysmem->vmap, 0, 0xc0, sysmem->size);
 	xe_map_memset(xe, &bo->vmap, 0, 0xd0, bo->size);
 
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 876f77669104..73a7f2cd4ad8 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1290,12 +1290,12 @@ int xe_bo_pin(struct xe_bo *bo)
 	if (IS_DGFX(xe) && !(IS_ENABLED(CONFIG_DRM_XE_DEBUG) &&
 	    bo->flags & XE_BO_INTERNAL_TEST)) {
 		struct ttm_place *place = &(bo->placements[0]);
-		bool lmem;
+		bool vram;
 
 		if (mem_type_is_vram(place->mem_type)) {
 			XE_BUG_ON(!(place->flags & TTM_PL_FLAG_CONTIGUOUS));
 
-			place->fpfn = (xe_bo_addr(bo, 0, PAGE_SIZE, &lmem) -
+			place->fpfn = (xe_bo_addr(bo, 0, PAGE_SIZE, &vram) -
 				       vram_region_io_offset(bo)) >> PAGE_SHIFT;
 			place->lpfn = place->fpfn + (bo->size >> PAGE_SHIFT);
 
@@ -1415,7 +1415,7 @@ bool xe_bo_is_xe_bo(struct ttm_buffer_object *bo)
 }
 
 dma_addr_t xe_bo_addr(struct xe_bo *bo, u64 offset,
-		      size_t page_size, bool *is_lmem)
+		      size_t page_size, bool *is_vram)
 {
 	struct xe_res_cursor cur;
 	u64 page;
@@ -1427,9 +1427,9 @@ dma_addr_t xe_bo_addr(struct xe_bo *bo, u64 offset,
 	page = offset >> PAGE_SHIFT;
 	offset &= (PAGE_SIZE - 1);
 
-	*is_lmem = xe_bo_is_vram(bo);
+	*is_vram = xe_bo_is_vram(bo);
 
-	if (!*is_lmem && !xe_bo_is_stolen(bo)) {
+	if (!*is_vram && !xe_bo_is_stolen(bo)) {
 		XE_BUG_ON(!bo->ttm.ttm);
 
 		xe_res_first_sg(xe_bo_get_sg(bo), page << PAGE_SHIFT,
diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index 0699b2b4c5ca..f841e74cd417 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -196,14 +196,14 @@ static inline void xe_bo_unpin_map_no_vm(struct xe_bo *bo)
 
 bool xe_bo_is_xe_bo(struct ttm_buffer_object *bo);
 dma_addr_t xe_bo_addr(struct xe_bo *bo, u64 offset,
-		      size_t page_size, bool *is_lmem);
+		      size_t page_size, bool *is_vram);
 
 static inline dma_addr_t
 xe_bo_main_addr(struct xe_bo *bo, size_t page_size)
 {
-	bool is_lmem;
+	bool is_vram;
 
-	return xe_bo_addr(bo, 0, page_size, &is_lmem);
+	return xe_bo_addr(bo, 0, page_size, &is_vram);
 }
 
 static inline u32
diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index a0d672b6afb2..a430d1568890 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -28,12 +28,12 @@ u64 xe_ggtt_pte_encode(struct xe_bo *bo, u64 bo_offset)
 {
 	struct xe_device *xe = xe_bo_device(bo);
 	u64 pte;
-	bool is_lmem;
+	bool is_vram;
 
-	pte = xe_bo_addr(bo, bo_offset, GEN8_PAGE_SIZE, &is_lmem);
+	pte = xe_bo_addr(bo, bo_offset, GEN8_PAGE_SIZE, &is_vram);
 	pte |= GEN8_PAGE_PRESENT;
 
-	if (is_lmem)
+	if (is_vram)
 		pte |= GEN12_GGTT_PTE_LM;
 
 	/* FIXME: vfunc + pass in caching rules */
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index bc69ec17d5ad..c0523d8fe944 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -222,15 +222,15 @@ static int xe_migrate_prepare_vm(struct xe_gt *gt, struct xe_migrate *m,
 			level++;
 		}
 	} else {
-		bool is_lmem;
-		u64 batch_addr = xe_bo_addr(batch, 0, GEN8_PAGE_SIZE, &is_lmem);
+		bool is_vram;
+		u64 batch_addr = xe_bo_addr(batch, 0, GEN8_PAGE_SIZE, &is_vram);
 
 		m->batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
 
 		if (xe->info.supports_usm) {
 			batch = gt->usm.bb_pool.bo;
 			batch_addr = xe_bo_addr(batch, 0, GEN8_PAGE_SIZE,
-						&is_lmem);
+						&is_vram);
 			m->usm_batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
 		}
 	}
@@ -933,12 +933,12 @@ static void write_pgtable(struct xe_gt *gt, struct xe_bb *bb, u64 ppgtt_ofs,
 	 */
 	XE_BUG_ON(update->qwords > 0x1ff);
 	if (!ppgtt_ofs) {
-		bool is_lmem;
+		bool is_vram;
 
 		ppgtt_ofs = xe_migrate_vram_ofs(xe_bo_addr(update->pt_bo, 0,
 							   GEN8_PAGE_SIZE,
-							   &is_lmem));
-		XE_BUG_ON(!is_lmem);
+							   &is_vram));
+		XE_BUG_ON(!is_vram);
 	}
 
 	do {
diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
index 65b0df9bb579..e5bd4609aaee 100644
--- a/drivers/gpu/drm/xe/xe_mmio.c
+++ b/drivers/gpu/drm/xe/xe_mmio.c
@@ -68,7 +68,7 @@ _resize_bar(struct xe_device *xe, int resno, resource_size_t size)
 	return 1;
 }
 
-static int xe_resize_lmem_bar(struct xe_device *xe, resource_size_t lmem_size)
+static int xe_resize_vram_bar(struct xe_device *xe, resource_size_t vram_size)
 {
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
 	struct pci_bus *root = pdev->bus;
@@ -78,31 +78,31 @@ static int xe_resize_lmem_bar(struct xe_device *xe, resource_size_t lmem_size)
 	u32 pci_cmd;
 	int i;
 	int ret;
-	u64 force_lmem_bar_size = xe_force_lmem_bar_size;
+	u64 force_vram_bar_size = xe_force_vram_bar_size;
 
 	current_size = roundup_pow_of_two(pci_resource_len(pdev, GEN12_LMEM_BAR));
 
-	if (force_lmem_bar_size) {
+	if (force_vram_bar_size) {
 		u32 bar_sizes;
 
-		rebar_size = force_lmem_bar_size * (resource_size_t)SZ_1M;
+		rebar_size = force_vram_bar_size * (resource_size_t)SZ_1M;
 		bar_sizes = pci_rebar_get_possible_sizes(pdev, GEN12_LMEM_BAR);
 
 		if (rebar_size == current_size)
 			return 0;
 
 		if (!(bar_sizes & BIT(pci_rebar_bytes_to_size(rebar_size))) ||
-		    rebar_size >= roundup_pow_of_two(lmem_size)) {
-			rebar_size = lmem_size;
+		    rebar_size >= roundup_pow_of_two(vram_size)) {
+			rebar_size = vram_size;
 			drm_info(&xe->drm,
 				 "Given bar size is not within supported size, setting it to default: %llu\n",
-				 (u64)lmem_size >> 20);
+				 (u64)vram_size >> 20);
 		}
 	} else {
 		rebar_size = current_size;
 
-		if (rebar_size != roundup_pow_of_two(lmem_size))
-			rebar_size = lmem_size;
+		if (rebar_size != roundup_pow_of_two(vram_size))
+			rebar_size = vram_size;
 		else
 			return 0;
 	}
@@ -117,7 +117,7 @@ static int xe_resize_lmem_bar(struct xe_device *xe, resource_size_t lmem_size)
 	}
 
 	if (!root_res) {
-		drm_info(&xe->drm, "Can't resize LMEM BAR - platform support is missing\n");
+		drm_info(&xe->drm, "Can't resize VRAM BAR - platform support is missing\n");
 		return -1;
 	}
 
@@ -168,7 +168,7 @@ int xe_mmio_total_vram_size(struct xe_device *xe, u64 *vram_size, u64 *usable_si
 	if (usable_size) {
 		reg = xe_gt_mcr_unicast_read_any(gt, XEHP_FLAT_CCS_BASE_ADDR);
 		*usable_size = (u64)REG_FIELD_GET(GENMASK(31, 8), reg) * SZ_64K;
-		drm_info(&xe->drm, "lmem_size: 0x%llx usable_size: 0x%llx\n",
+		drm_info(&xe->drm, "vram_size: 0x%llx usable_size: 0x%llx\n",
 			 *vram_size, *usable_size);
 	}
 
@@ -180,7 +180,7 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
 	struct xe_gt *gt;
 	u8 id;
-	u64 lmem_size;
+	u64 vram_size;
 	u64 original_size;
 	u64 current_size;
 	u64 usable_size;
@@ -207,29 +207,29 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 	gt = xe_device_get_gt(xe, 0);
 	original_size = pci_resource_len(pdev, GEN12_LMEM_BAR);
 
-	err = xe_mmio_total_vram_size(xe, &lmem_size, &usable_size);
+	err = xe_mmio_total_vram_size(xe, &vram_size, &usable_size);
 	if (err)
 		return err;
 
-	resize_result = xe_resize_lmem_bar(xe, lmem_size);
+	resize_result = xe_resize_vram_bar(xe, vram_size);
 	current_size = pci_resource_len(pdev, GEN12_LMEM_BAR);
 	xe->mem.vram.io_start = pci_resource_start(pdev, GEN12_LMEM_BAR);
 
-	xe->mem.vram.size = min(current_size, lmem_size);
+	xe->mem.vram.size = min(current_size, vram_size);
 
 	if (!xe->mem.vram.size)
 		return -EIO;
 
 	if (resize_result > 0)
-		drm_info(&xe->drm, "Successfully resize LMEM from %lluMiB to %lluMiB\n",
+		drm_info(&xe->drm, "Successfully resize VRAM from %lluMiB to %lluMiB\n",
 			 (u64)original_size >> 20,
 			 (u64)current_size >> 20);
-	else if (xe->mem.vram.size < lmem_size && !xe_force_lmem_bar_size)
+	else if (xe->mem.vram.size < vram_size && !xe_force_vram_bar_size)
 		drm_info(&xe->drm, "Using a reduced BAR size of %lluMiB. Consider enabling 'Resizable BAR' support in your BIOS.\n",
 			 (u64)xe->mem.vram.size >> 20);
-	if (xe->mem.vram.size < lmem_size)
+	if (xe->mem.vram.size < vram_size)
 		drm_warn(&xe->drm, "Restricting VRAM size to PCI resource size (0x%llx->0x%llx)\n",
-			 lmem_size, (u64)xe->mem.vram.size);
+			 vram_size, (u64)xe->mem.vram.size);
 
 	xe->mem.vram.mapping = ioremap_wc(xe->mem.vram.io_start, xe->mem.vram.size);
 	xe->mem.vram.size = min_t(u64, xe->mem.vram.size, usable_size);
@@ -360,7 +360,7 @@ int xe_mmio_init(struct xe_device *xe)
 	 * and we should not continue with driver initialization.
 	 */
 	if (IS_DGFX(xe) && !(xe_mmio_read32(gt, GU_CNTL.reg) & LMEM_INIT)) {
-		drm_err(&xe->drm, "LMEM not initialized by firmware\n");
+		drm_err(&xe->drm, "VRAM not initialized by firmware\n");
 		return -ENODEV;
 	}
 
diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
index 5a51a9959eff..6860586ce7f8 100644
--- a/drivers/gpu/drm/xe/xe_module.c
+++ b/drivers/gpu/drm/xe/xe_module.c
@@ -22,9 +22,9 @@ bool enable_display = true;
 module_param_named(enable_display, enable_display, bool, 0444);
 MODULE_PARM_DESC(enable_display, "Enable display");
 
-u32 xe_force_lmem_bar_size;
-module_param_named(lmem_bar_size, xe_force_lmem_bar_size, uint, 0600);
-MODULE_PARM_DESC(lmem_bar_size, "Set the lmem bar size(in MiB)");
+u32 xe_force_vram_bar_size;
+module_param_named(vram_bar_size, xe_force_vram_bar_size, uint, 0600);
+MODULE_PARM_DESC(vram_bar_size, "Set the vram bar size(in MiB)");
 
 int xe_guc_log_level = 5;
 module_param_named(guc_log_level, xe_guc_log_level, int, 0600);
diff --git a/drivers/gpu/drm/xe/xe_module.h b/drivers/gpu/drm/xe/xe_module.h
index 2c6ee46f5595..86916c176382 100644
--- a/drivers/gpu/drm/xe/xe_module.h
+++ b/drivers/gpu/drm/xe/xe_module.h
@@ -8,6 +8,6 @@
 /* Module modprobe variables */
 extern bool enable_guc;
 extern bool enable_display;
-extern u32 xe_force_lmem_bar_size;
+extern u32 xe_force_vram_bar_size;
 extern int xe_guc_log_level;
 extern char *xe_param_force_probe;
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index 06fc7b206997..dfd97b0ec42a 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -62,12 +62,12 @@ u64 gen8_pde_encode(struct xe_bo *bo, u64 bo_offset,
 		    const enum xe_cache_level level)
 {
 	u64 pde;
-	bool is_lmem;
+	bool is_vram;
 
-	pde = xe_bo_addr(bo, bo_offset, GEN8_PAGE_SIZE, &is_lmem);
+	pde = xe_bo_addr(bo, bo_offset, GEN8_PAGE_SIZE, &is_vram);
 	pde |= GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
 
-	XE_WARN_ON(IS_DGFX(xe_bo_device(bo)) && !is_lmem);
+	XE_WARN_ON(IS_DGFX(xe_bo_device(bo)) && !is_vram);
 
 	/* FIXME: I don't think the PPAT handling is correct for MTL */
 
@@ -80,13 +80,13 @@ u64 gen8_pde_encode(struct xe_bo *bo, u64 bo_offset,
 }
 
 static dma_addr_t vma_addr(struct xe_vma *vma, u64 offset,
-			   size_t page_size, bool *is_lmem)
+			   size_t page_size, bool *is_vram)
 {
 	if (xe_vma_is_userptr(vma)) {
 		struct xe_res_cursor cur;
 		u64 page;
 
-		*is_lmem = false;
+		*is_vram = false;
 		page = offset >> PAGE_SHIFT;
 		offset &= (PAGE_SIZE - 1);
 
@@ -94,7 +94,7 @@ static dma_addr_t vma_addr(struct xe_vma *vma, u64 offset,
 				&cur);
 		return xe_res_dma(&cur) + offset;
 	} else {
-		return xe_bo_addr(vma->bo, offset, page_size, is_lmem);
+		return xe_bo_addr(vma->bo, offset, page_size, is_vram);
 	}
 }
 
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index a0e35627e45e..6cf7172e84cc 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -3355,7 +3355,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
 int xe_analyze_vm(struct drm_printer *p, struct xe_vm *vm, int gt_id)
 {
 	struct rb_node *node;
-	bool is_lmem;
+	bool is_vram;
 	uint64_t addr;
 
 	if (!down_read_trylock(&vm->lock)) {
@@ -3363,8 +3363,8 @@ int xe_analyze_vm(struct drm_printer *p, struct xe_vm *vm, int gt_id)
 		return 0;
 	}
 	if (vm->pt_root[gt_id]) {
-		addr = xe_bo_addr(vm->pt_root[gt_id]->bo, 0, GEN8_PAGE_SIZE, &is_lmem);
-		drm_printf(p, " VM root: A:0x%llx %s\n", addr, is_lmem ? "LMEM" : "SYS");
+		addr = xe_bo_addr(vm->pt_root[gt_id]->bo, 0, GEN8_PAGE_SIZE, &is_vram);
+		drm_printf(p, " VM root: A:0x%llx %s\n", addr, is_vram ? "VRAM" : "SYS");
 	}
 
 	for (node = rb_first(&vm->vmas); node; node = rb_next(node)) {
@@ -3377,11 +3377,11 @@ int xe_analyze_vm(struct drm_printer *p, struct xe_vm *vm, int gt_id)
 			xe_res_first_sg(vma->userptr.sg, 0, GEN8_PAGE_SIZE, &cur);
 			addr = xe_res_dma(&cur);
 		} else {
-			addr = xe_bo_addr(vma->bo, 0, GEN8_PAGE_SIZE, &is_lmem);
+			addr = xe_bo_addr(vma->bo, 0, GEN8_PAGE_SIZE, &is_vram);
 		}
 		drm_printf(p, " [%016llx-%016llx] S:0x%016llx A:%016llx %s\n",
 			   vma->start, vma->end, vma->end - vma->start + 1ull,
-			   addr, is_userptr ? "USR" : is_lmem ? "VRAM" : "SYS");
+			   addr, is_userptr ? "USR" : is_vram ? "VRAM" : "SYS");
 	}
 	up_read(&vm->lock);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Intel-xe] [PATCH v3 2/6] drm/xe: s/lmem/vram/
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 2/6] drm/xe: s/lmem/vram/ Matthew Auld
@ 2023-03-08 13:37   ` Rodrigo Vivi
  0 siblings, 0 replies; 10+ messages in thread
From: Rodrigo Vivi @ 2023-03-08 13:37 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Lucas De Marchi, intel-xe

On Wed, Mar 08, 2023 at 12:30:08PM +0000, Matthew Auld wrote:
> This seems to be the preferred nomenclature in xe. Currently we are
> intermixing vram and lmem, which is confusing.

I honestly like local memory or device memory term better since it
opposes nicely to the system memory... But well, VRAM aligns better
with TTM.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

> 
> v2 (Gwan-gyeong Mun & Lucas):
>   - Rather apply to the entire driver
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
>  drivers/gpu/drm/xe/Kconfig.debug      |  2 +-
>  drivers/gpu/drm/xe/tests/xe_migrate.c |  2 +-
>  drivers/gpu/drm/xe/xe_bo.c            | 10 +++----
>  drivers/gpu/drm/xe/xe_bo.h            |  6 ++--
>  drivers/gpu/drm/xe/xe_ggtt.c          |  6 ++--
>  drivers/gpu/drm/xe/xe_migrate.c       | 12 ++++----
>  drivers/gpu/drm/xe/xe_mmio.c          | 40 +++++++++++++--------------
>  drivers/gpu/drm/xe/xe_module.c        |  6 ++--
>  drivers/gpu/drm/xe/xe_module.h        |  2 +-
>  drivers/gpu/drm/xe/xe_pt.c            | 12 ++++----
>  drivers/gpu/drm/xe/xe_vm.c            | 10 +++----
>  11 files changed, 54 insertions(+), 54 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/Kconfig.debug b/drivers/gpu/drm/xe/Kconfig.debug
> index 565be3f6b9b9..93b284cdd0a2 100644
> --- a/drivers/gpu/drm/xe/Kconfig.debug
> +++ b/drivers/gpu/drm/xe/Kconfig.debug
> @@ -41,7 +41,7 @@ config DRM_XE_DEBUG_VM
>  	  If in doubt, say "N".
>  
>  config DRM_XE_DEBUG_MEM
> -	bool "Enable passing SYS/LMEM addresses to user space"
> +	bool "Enable passing SYS/VRAM addresses to user space"
>  	default n
>  	help
>  	  Pass object location trough uapi. Intended for extended
> diff --git a/drivers/gpu/drm/xe/tests/xe_migrate.c b/drivers/gpu/drm/xe/tests/xe_migrate.c
> index b7e4a126e8b7..ac659b94e7f5 100644
> --- a/drivers/gpu/drm/xe/tests/xe_migrate.c
> +++ b/drivers/gpu/drm/xe/tests/xe_migrate.c
> @@ -129,7 +129,7 @@ static void test_copy(struct xe_migrate *m, struct xe_bo *bo,
>  	}
>  	dma_fence_put(fence);
>  
> -	/* Try to copy 0xc0 from sysmem to lmem with 2MB or 64KiB/4KiB pages */
> +	/* Try to copy 0xc0 from sysmem to vram with 2MB or 64KiB/4KiB pages */
>  	xe_map_memset(xe, &sysmem->vmap, 0, 0xc0, sysmem->size);
>  	xe_map_memset(xe, &bo->vmap, 0, 0xd0, bo->size);
>  
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index 876f77669104..73a7f2cd4ad8 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -1290,12 +1290,12 @@ int xe_bo_pin(struct xe_bo *bo)
>  	if (IS_DGFX(xe) && !(IS_ENABLED(CONFIG_DRM_XE_DEBUG) &&
>  	    bo->flags & XE_BO_INTERNAL_TEST)) {
>  		struct ttm_place *place = &(bo->placements[0]);
> -		bool lmem;
> +		bool vram;
>  
>  		if (mem_type_is_vram(place->mem_type)) {
>  			XE_BUG_ON(!(place->flags & TTM_PL_FLAG_CONTIGUOUS));
>  
> -			place->fpfn = (xe_bo_addr(bo, 0, PAGE_SIZE, &lmem) -
> +			place->fpfn = (xe_bo_addr(bo, 0, PAGE_SIZE, &vram) -
>  				       vram_region_io_offset(bo)) >> PAGE_SHIFT;
>  			place->lpfn = place->fpfn + (bo->size >> PAGE_SHIFT);
>  
> @@ -1415,7 +1415,7 @@ bool xe_bo_is_xe_bo(struct ttm_buffer_object *bo)
>  }
>  
>  dma_addr_t xe_bo_addr(struct xe_bo *bo, u64 offset,
> -		      size_t page_size, bool *is_lmem)
> +		      size_t page_size, bool *is_vram)
>  {
>  	struct xe_res_cursor cur;
>  	u64 page;
> @@ -1427,9 +1427,9 @@ dma_addr_t xe_bo_addr(struct xe_bo *bo, u64 offset,
>  	page = offset >> PAGE_SHIFT;
>  	offset &= (PAGE_SIZE - 1);
>  
> -	*is_lmem = xe_bo_is_vram(bo);
> +	*is_vram = xe_bo_is_vram(bo);
>  
> -	if (!*is_lmem && !xe_bo_is_stolen(bo)) {
> +	if (!*is_vram && !xe_bo_is_stolen(bo)) {
>  		XE_BUG_ON(!bo->ttm.ttm);
>  
>  		xe_res_first_sg(xe_bo_get_sg(bo), page << PAGE_SHIFT,
> diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
> index 0699b2b4c5ca..f841e74cd417 100644
> --- a/drivers/gpu/drm/xe/xe_bo.h
> +++ b/drivers/gpu/drm/xe/xe_bo.h
> @@ -196,14 +196,14 @@ static inline void xe_bo_unpin_map_no_vm(struct xe_bo *bo)
>  
>  bool xe_bo_is_xe_bo(struct ttm_buffer_object *bo);
>  dma_addr_t xe_bo_addr(struct xe_bo *bo, u64 offset,
> -		      size_t page_size, bool *is_lmem);
> +		      size_t page_size, bool *is_vram);
>  
>  static inline dma_addr_t
>  xe_bo_main_addr(struct xe_bo *bo, size_t page_size)
>  {
> -	bool is_lmem;
> +	bool is_vram;
>  
> -	return xe_bo_addr(bo, 0, page_size, &is_lmem);
> +	return xe_bo_addr(bo, 0, page_size, &is_vram);
>  }
>  
>  static inline u32
> diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
> index a0d672b6afb2..a430d1568890 100644
> --- a/drivers/gpu/drm/xe/xe_ggtt.c
> +++ b/drivers/gpu/drm/xe/xe_ggtt.c
> @@ -28,12 +28,12 @@ u64 xe_ggtt_pte_encode(struct xe_bo *bo, u64 bo_offset)
>  {
>  	struct xe_device *xe = xe_bo_device(bo);
>  	u64 pte;
> -	bool is_lmem;
> +	bool is_vram;
>  
> -	pte = xe_bo_addr(bo, bo_offset, GEN8_PAGE_SIZE, &is_lmem);
> +	pte = xe_bo_addr(bo, bo_offset, GEN8_PAGE_SIZE, &is_vram);
>  	pte |= GEN8_PAGE_PRESENT;
>  
> -	if (is_lmem)
> +	if (is_vram)
>  		pte |= GEN12_GGTT_PTE_LM;
>  
>  	/* FIXME: vfunc + pass in caching rules */
> diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> index bc69ec17d5ad..c0523d8fe944 100644
> --- a/drivers/gpu/drm/xe/xe_migrate.c
> +++ b/drivers/gpu/drm/xe/xe_migrate.c
> @@ -222,15 +222,15 @@ static int xe_migrate_prepare_vm(struct xe_gt *gt, struct xe_migrate *m,
>  			level++;
>  		}
>  	} else {
> -		bool is_lmem;
> -		u64 batch_addr = xe_bo_addr(batch, 0, GEN8_PAGE_SIZE, &is_lmem);
> +		bool is_vram;
> +		u64 batch_addr = xe_bo_addr(batch, 0, GEN8_PAGE_SIZE, &is_vram);
>  
>  		m->batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
>  
>  		if (xe->info.supports_usm) {
>  			batch = gt->usm.bb_pool.bo;
>  			batch_addr = xe_bo_addr(batch, 0, GEN8_PAGE_SIZE,
> -						&is_lmem);
> +						&is_vram);
>  			m->usm_batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
>  		}
>  	}
> @@ -933,12 +933,12 @@ static void write_pgtable(struct xe_gt *gt, struct xe_bb *bb, u64 ppgtt_ofs,
>  	 */
>  	XE_BUG_ON(update->qwords > 0x1ff);
>  	if (!ppgtt_ofs) {
> -		bool is_lmem;
> +		bool is_vram;
>  
>  		ppgtt_ofs = xe_migrate_vram_ofs(xe_bo_addr(update->pt_bo, 0,
>  							   GEN8_PAGE_SIZE,
> -							   &is_lmem));
> -		XE_BUG_ON(!is_lmem);
> +							   &is_vram));
> +		XE_BUG_ON(!is_vram);
>  	}
>  
>  	do {
> diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
> index 65b0df9bb579..e5bd4609aaee 100644
> --- a/drivers/gpu/drm/xe/xe_mmio.c
> +++ b/drivers/gpu/drm/xe/xe_mmio.c
> @@ -68,7 +68,7 @@ _resize_bar(struct xe_device *xe, int resno, resource_size_t size)
>  	return 1;
>  }
>  
> -static int xe_resize_lmem_bar(struct xe_device *xe, resource_size_t lmem_size)
> +static int xe_resize_vram_bar(struct xe_device *xe, resource_size_t vram_size)
>  {
>  	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
>  	struct pci_bus *root = pdev->bus;
> @@ -78,31 +78,31 @@ static int xe_resize_lmem_bar(struct xe_device *xe, resource_size_t lmem_size)
>  	u32 pci_cmd;
>  	int i;
>  	int ret;
> -	u64 force_lmem_bar_size = xe_force_lmem_bar_size;
> +	u64 force_vram_bar_size = xe_force_vram_bar_size;
>  
>  	current_size = roundup_pow_of_two(pci_resource_len(pdev, GEN12_LMEM_BAR));
>  
> -	if (force_lmem_bar_size) {
> +	if (force_vram_bar_size) {
>  		u32 bar_sizes;
>  
> -		rebar_size = force_lmem_bar_size * (resource_size_t)SZ_1M;
> +		rebar_size = force_vram_bar_size * (resource_size_t)SZ_1M;
>  		bar_sizes = pci_rebar_get_possible_sizes(pdev, GEN12_LMEM_BAR);
>  
>  		if (rebar_size == current_size)
>  			return 0;
>  
>  		if (!(bar_sizes & BIT(pci_rebar_bytes_to_size(rebar_size))) ||
> -		    rebar_size >= roundup_pow_of_two(lmem_size)) {
> -			rebar_size = lmem_size;
> +		    rebar_size >= roundup_pow_of_two(vram_size)) {
> +			rebar_size = vram_size;
>  			drm_info(&xe->drm,
>  				 "Given bar size is not within supported size, setting it to default: %llu\n",
> -				 (u64)lmem_size >> 20);
> +				 (u64)vram_size >> 20);
>  		}
>  	} else {
>  		rebar_size = current_size;
>  
> -		if (rebar_size != roundup_pow_of_two(lmem_size))
> -			rebar_size = lmem_size;
> +		if (rebar_size != roundup_pow_of_two(vram_size))
> +			rebar_size = vram_size;
>  		else
>  			return 0;
>  	}
> @@ -117,7 +117,7 @@ static int xe_resize_lmem_bar(struct xe_device *xe, resource_size_t lmem_size)
>  	}
>  
>  	if (!root_res) {
> -		drm_info(&xe->drm, "Can't resize LMEM BAR - platform support is missing\n");
> +		drm_info(&xe->drm, "Can't resize VRAM BAR - platform support is missing\n");
>  		return -1;
>  	}
>  
> @@ -168,7 +168,7 @@ int xe_mmio_total_vram_size(struct xe_device *xe, u64 *vram_size, u64 *usable_si
>  	if (usable_size) {
>  		reg = xe_gt_mcr_unicast_read_any(gt, XEHP_FLAT_CCS_BASE_ADDR);
>  		*usable_size = (u64)REG_FIELD_GET(GENMASK(31, 8), reg) * SZ_64K;
> -		drm_info(&xe->drm, "lmem_size: 0x%llx usable_size: 0x%llx\n",
> +		drm_info(&xe->drm, "vram_size: 0x%llx usable_size: 0x%llx\n",
>  			 *vram_size, *usable_size);
>  	}
>  
> @@ -180,7 +180,7 @@ int xe_mmio_probe_vram(struct xe_device *xe)
>  	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
>  	struct xe_gt *gt;
>  	u8 id;
> -	u64 lmem_size;
> +	u64 vram_size;
>  	u64 original_size;
>  	u64 current_size;
>  	u64 usable_size;
> @@ -207,29 +207,29 @@ int xe_mmio_probe_vram(struct xe_device *xe)
>  	gt = xe_device_get_gt(xe, 0);
>  	original_size = pci_resource_len(pdev, GEN12_LMEM_BAR);
>  
> -	err = xe_mmio_total_vram_size(xe, &lmem_size, &usable_size);
> +	err = xe_mmio_total_vram_size(xe, &vram_size, &usable_size);
>  	if (err)
>  		return err;
>  
> -	resize_result = xe_resize_lmem_bar(xe, lmem_size);
> +	resize_result = xe_resize_vram_bar(xe, vram_size);
>  	current_size = pci_resource_len(pdev, GEN12_LMEM_BAR);
>  	xe->mem.vram.io_start = pci_resource_start(pdev, GEN12_LMEM_BAR);
>  
> -	xe->mem.vram.size = min(current_size, lmem_size);
> +	xe->mem.vram.size = min(current_size, vram_size);
>  
>  	if (!xe->mem.vram.size)
>  		return -EIO;
>  
>  	if (resize_result > 0)
> -		drm_info(&xe->drm, "Successfully resize LMEM from %lluMiB to %lluMiB\n",
> +		drm_info(&xe->drm, "Successfully resize VRAM from %lluMiB to %lluMiB\n",
>  			 (u64)original_size >> 20,
>  			 (u64)current_size >> 20);
> -	else if (xe->mem.vram.size < lmem_size && !xe_force_lmem_bar_size)
> +	else if (xe->mem.vram.size < vram_size && !xe_force_vram_bar_size)
>  		drm_info(&xe->drm, "Using a reduced BAR size of %lluMiB. Consider enabling 'Resizable BAR' support in your BIOS.\n",
>  			 (u64)xe->mem.vram.size >> 20);
> -	if (xe->mem.vram.size < lmem_size)
> +	if (xe->mem.vram.size < vram_size)
>  		drm_warn(&xe->drm, "Restricting VRAM size to PCI resource size (0x%llx->0x%llx)\n",
> -			 lmem_size, (u64)xe->mem.vram.size);
> +			 vram_size, (u64)xe->mem.vram.size);
>  
>  	xe->mem.vram.mapping = ioremap_wc(xe->mem.vram.io_start, xe->mem.vram.size);
>  	xe->mem.vram.size = min_t(u64, xe->mem.vram.size, usable_size);
> @@ -360,7 +360,7 @@ int xe_mmio_init(struct xe_device *xe)
>  	 * and we should not continue with driver initialization.
>  	 */
>  	if (IS_DGFX(xe) && !(xe_mmio_read32(gt, GU_CNTL.reg) & LMEM_INIT)) {
> -		drm_err(&xe->drm, "LMEM not initialized by firmware\n");
> +		drm_err(&xe->drm, "VRAM not initialized by firmware\n");
>  		return -ENODEV;
>  	}
>  
> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
> index 5a51a9959eff..6860586ce7f8 100644
> --- a/drivers/gpu/drm/xe/xe_module.c
> +++ b/drivers/gpu/drm/xe/xe_module.c
> @@ -22,9 +22,9 @@ bool enable_display = true;
>  module_param_named(enable_display, enable_display, bool, 0444);
>  MODULE_PARM_DESC(enable_display, "Enable display");
>  
> -u32 xe_force_lmem_bar_size;
> -module_param_named(lmem_bar_size, xe_force_lmem_bar_size, uint, 0600);
> -MODULE_PARM_DESC(lmem_bar_size, "Set the lmem bar size(in MiB)");
> +u32 xe_force_vram_bar_size;
> +module_param_named(vram_bar_size, xe_force_vram_bar_size, uint, 0600);
> +MODULE_PARM_DESC(vram_bar_size, "Set the vram bar size(in MiB)");
>  
>  int xe_guc_log_level = 5;
>  module_param_named(guc_log_level, xe_guc_log_level, int, 0600);
> diff --git a/drivers/gpu/drm/xe/xe_module.h b/drivers/gpu/drm/xe/xe_module.h
> index 2c6ee46f5595..86916c176382 100644
> --- a/drivers/gpu/drm/xe/xe_module.h
> +++ b/drivers/gpu/drm/xe/xe_module.h
> @@ -8,6 +8,6 @@
>  /* Module modprobe variables */
>  extern bool enable_guc;
>  extern bool enable_display;
> -extern u32 xe_force_lmem_bar_size;
> +extern u32 xe_force_vram_bar_size;
>  extern int xe_guc_log_level;
>  extern char *xe_param_force_probe;
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index 06fc7b206997..dfd97b0ec42a 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -62,12 +62,12 @@ u64 gen8_pde_encode(struct xe_bo *bo, u64 bo_offset,
>  		    const enum xe_cache_level level)
>  {
>  	u64 pde;
> -	bool is_lmem;
> +	bool is_vram;
>  
> -	pde = xe_bo_addr(bo, bo_offset, GEN8_PAGE_SIZE, &is_lmem);
> +	pde = xe_bo_addr(bo, bo_offset, GEN8_PAGE_SIZE, &is_vram);
>  	pde |= GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
>  
> -	XE_WARN_ON(IS_DGFX(xe_bo_device(bo)) && !is_lmem);
> +	XE_WARN_ON(IS_DGFX(xe_bo_device(bo)) && !is_vram);
>  
>  	/* FIXME: I don't think the PPAT handling is correct for MTL */
>  
> @@ -80,13 +80,13 @@ u64 gen8_pde_encode(struct xe_bo *bo, u64 bo_offset,
>  }
>  
>  static dma_addr_t vma_addr(struct xe_vma *vma, u64 offset,
> -			   size_t page_size, bool *is_lmem)
> +			   size_t page_size, bool *is_vram)
>  {
>  	if (xe_vma_is_userptr(vma)) {
>  		struct xe_res_cursor cur;
>  		u64 page;
>  
> -		*is_lmem = false;
> +		*is_vram = false;
>  		page = offset >> PAGE_SHIFT;
>  		offset &= (PAGE_SIZE - 1);
>  
> @@ -94,7 +94,7 @@ static dma_addr_t vma_addr(struct xe_vma *vma, u64 offset,
>  				&cur);
>  		return xe_res_dma(&cur) + offset;
>  	} else {
> -		return xe_bo_addr(vma->bo, offset, page_size, is_lmem);
> +		return xe_bo_addr(vma->bo, offset, page_size, is_vram);
>  	}
>  }
>  
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index a0e35627e45e..6cf7172e84cc 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -3355,7 +3355,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
>  int xe_analyze_vm(struct drm_printer *p, struct xe_vm *vm, int gt_id)
>  {
>  	struct rb_node *node;
> -	bool is_lmem;
> +	bool is_vram;
>  	uint64_t addr;
>  
>  	if (!down_read_trylock(&vm->lock)) {
> @@ -3363,8 +3363,8 @@ int xe_analyze_vm(struct drm_printer *p, struct xe_vm *vm, int gt_id)
>  		return 0;
>  	}
>  	if (vm->pt_root[gt_id]) {
> -		addr = xe_bo_addr(vm->pt_root[gt_id]->bo, 0, GEN8_PAGE_SIZE, &is_lmem);
> -		drm_printf(p, " VM root: A:0x%llx %s\n", addr, is_lmem ? "LMEM" : "SYS");
> +		addr = xe_bo_addr(vm->pt_root[gt_id]->bo, 0, GEN8_PAGE_SIZE, &is_vram);
> +		drm_printf(p, " VM root: A:0x%llx %s\n", addr, is_vram ? "VRAM" : "SYS");
>  	}
>  
>  	for (node = rb_first(&vm->vmas); node; node = rb_next(node)) {
> @@ -3377,11 +3377,11 @@ int xe_analyze_vm(struct drm_printer *p, struct xe_vm *vm, int gt_id)
>  			xe_res_first_sg(vma->userptr.sg, 0, GEN8_PAGE_SIZE, &cur);
>  			addr = xe_res_dma(&cur);
>  		} else {
> -			addr = xe_bo_addr(vma->bo, 0, GEN8_PAGE_SIZE, &is_lmem);
> +			addr = xe_bo_addr(vma->bo, 0, GEN8_PAGE_SIZE, &is_vram);
>  		}
>  		drm_printf(p, " [%016llx-%016llx] S:0x%016llx A:%016llx %s\n",
>  			   vma->start, vma->end, vma->end - vma->start + 1ull,
> -			   addr, is_userptr ? "USR" : is_lmem ? "VRAM" : "SYS");
> +			   addr, is_userptr ? "USR" : is_vram ? "VRAM" : "SYS");
>  	}
>  	up_read(&vm->lock);
>  
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Intel-xe] [PATCH v3 3/6] drm/xe/vram: start tracking the io_size
  2023-03-08 12:30 [Intel-xe] [PATCH v3 0/6] Some small-bar prep patches Matthew Auld
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 1/6] drm/xe: add xe_ttm_stolen_cpu_access_needs_ggtt() Matthew Auld
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 2/6] drm/xe: s/lmem/vram/ Matthew Auld
@ 2023-03-08 12:30 ` Matthew Auld
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 4/6] drm/xe/buddy: remove the virtualized start Matthew Auld
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Matthew Auld @ 2023-03-08 12:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi

First step towards supporting small-bar is to track the io_size for
vram. We can longer assume that the io_size == vram size. This way we
know how much is CPU accessible via the BAR, and how much is not.
Effectively giving us a two tiered vram, where in some later patches we
can support different allocation strategies depending on if the memory
needs to be CPU accessible or not.

Note as this stage we still clamp the vram size to the usable vram size.
Only in the final patch do we turn this on for real, and allow distinct
io_size and vram_size.

v2: (Lucas):
  - Improve the commit message, plus improve the kernel-doc for the
    io_size to give a better sense of what it actually is.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_device_types.h | 14 +++++++--
 drivers/gpu/drm/xe/xe_gt_types.h     | 14 +++++++--
 drivers/gpu/drm/xe/xe_mmio.c         | 44 ++++++++++++++++++++--------
 3 files changed, 55 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 034e0956f4ea..b98ffad45064 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -176,9 +176,19 @@ struct xe_device {
 	struct {
 		/** @vram: VRAM info for device */
 		struct {
-			/** @io_start: start address of VRAM */
+			/** @io_start: IO start address of VRAM */
 			resource_size_t io_start;
-			/** @size: size of VRAM */
+			/**
+			 * @io_size: IO size of VRAM.
+			 *
+			 * This represents how much of VRAM we can access via
+			 * the CPU through the VRAM BAR. This can be smaller
+			 * than @size, in which case only part of VRAM is CPU
+			 * accessible (typically the first 256M). This
+			 * configuration is known as small-bar.
+			 */
+			resource_size_t io_size;
+			/** @size: Total size of VRAM */
 			resource_size_t size;
 			/** @mapping: pointer to VRAM mappable space */
 			void *__iomem mapping;
diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
index 74b4e6776bf1..8f29aba455e0 100644
--- a/drivers/gpu/drm/xe/xe_gt_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_types.h
@@ -143,9 +143,19 @@ struct xe_gt {
 		 * (virtual split), can be subset of global device VRAM
 		 */
 		struct {
-			/** @io_start: start address of VRAM */
+			/** @io_start: IO start address of this VRAM instance */
 			resource_size_t io_start;
-			/** @size: size of VRAM */
+			/**
+			 * @io_size: IO size of this VRAM instance
+			 *
+			 * This represents how much of this VRAM we can access
+			 * via the CPU through the VRAM BAR. This can be smaller
+			 * than @size, in which case only part of VRAM is CPU
+			 * accessible (typically the first 256M). This
+			 * configuration is known as small-bar.
+			 */
+			resource_size_t io_size;
+			/** @size: size of VRAM. */
 			resource_size_t size;
 			/** @mapping: pointer to VRAM mappable space */
 			void *__iomem mapping;
diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
index e5bd4609aaee..5cacaa05759a 100644
--- a/drivers/gpu/drm/xe/xe_mmio.c
+++ b/drivers/gpu/drm/xe/xe_mmio.c
@@ -182,7 +182,6 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 	u8 id;
 	u64 vram_size;
 	u64 original_size;
-	u64 current_size;
 	u64 usable_size;
 	int resize_result, err;
 
@@ -190,11 +189,13 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 		xe->mem.vram.mapping = 0;
 		xe->mem.vram.size = 0;
 		xe->mem.vram.io_start = 0;
+		xe->mem.vram.io_size = 0;
 
 		for_each_gt(gt, xe, id) {
 			gt->mem.vram.mapping = 0;
 			gt->mem.vram.size = 0;
 			gt->mem.vram.io_start = 0;
+			gt->mem.vram.io_size = 0;
 		}
 		return 0;
 	}
@@ -212,10 +213,10 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 		return err;
 
 	resize_result = xe_resize_vram_bar(xe, vram_size);
-	current_size = pci_resource_len(pdev, GEN12_LMEM_BAR);
 	xe->mem.vram.io_start = pci_resource_start(pdev, GEN12_LMEM_BAR);
-
-	xe->mem.vram.size = min(current_size, vram_size);
+	xe->mem.vram.io_size = min(usable_size,
+				   pci_resource_len(pdev, GEN12_LMEM_BAR));
+	xe->mem.vram.size = xe->mem.vram.io_size;
 
 	if (!xe->mem.vram.size)
 		return -EIO;
@@ -223,15 +224,15 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 	if (resize_result > 0)
 		drm_info(&xe->drm, "Successfully resize VRAM from %lluMiB to %lluMiB\n",
 			 (u64)original_size >> 20,
-			 (u64)current_size >> 20);
-	else if (xe->mem.vram.size < vram_size && !xe_force_vram_bar_size)
+			 (u64)xe->mem.vram.io_size >> 20);
+	else if (xe->mem.vram.io_size < usable_size && !xe_force_vram_bar_size)
 		drm_info(&xe->drm, "Using a reduced BAR size of %lluMiB. Consider enabling 'Resizable BAR' support in your BIOS.\n",
 			 (u64)xe->mem.vram.size >> 20);
 	if (xe->mem.vram.size < vram_size)
 		drm_warn(&xe->drm, "Restricting VRAM size to PCI resource size (0x%llx->0x%llx)\n",
 			 vram_size, (u64)xe->mem.vram.size);
 
-	xe->mem.vram.mapping = ioremap_wc(xe->mem.vram.io_start, xe->mem.vram.size);
+	xe->mem.vram.mapping = ioremap_wc(xe->mem.vram.io_start, xe->mem.vram.io_size);
 	xe->mem.vram.size = min_t(u64, xe->mem.vram.size, usable_size);
 
 	drm_info(&xe->drm, "TOTAL VRAM: %pa, %pa\n", &xe->mem.vram.io_start, &xe->mem.vram.size);
@@ -239,7 +240,7 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 	/* FIXME: Assuming equally partitioned VRAM, incorrect */
 	if (xe->info.tile_count > 1) {
 		u8 adj_tile_count = xe->info.tile_count;
-		resource_size_t size, io_start;
+		resource_size_t size, io_start, io_size;
 
 		for_each_gt(gt, xe, id)
 			if (xe_gt_is_media_type(gt))
@@ -249,15 +250,31 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 
 		size = xe->mem.vram.size / adj_tile_count;
 		io_start = xe->mem.vram.io_start;
+		io_size = xe->mem.vram.io_size;
 
 		for_each_gt(gt, xe, id) {
-			if (id && !xe_gt_is_media_type(gt))
-				io_start += size;
+			if (id && !xe_gt_is_media_type(gt)) {
+				io_size -= min(io_size, size);
+				io_start += io_size;
+			}
 
 			gt->mem.vram.size = size;
-			gt->mem.vram.io_start = io_start;
-			gt->mem.vram.mapping = xe->mem.vram.mapping +
-				(io_start - xe->mem.vram.io_start);
+
+			/*
+			 * XXX: multi-tile small-bar might be wild. Hopefully
+			 * full tile without any mappable vram is not something
+			 * we care about.
+			 */
+
+			gt->mem.vram.io_size = min(size, io_size);
+			if (io_size) {
+				gt->mem.vram.io_start = io_start;
+				gt->mem.vram.mapping = xe->mem.vram.mapping +
+					(io_start - xe->mem.vram.io_start);
+			} else {
+				drm_err(&xe->drm, "Tile without any CPU visible VRAM. Aborting.\n");
+				return -ENODEV;
+			}
 
 			drm_info(&xe->drm, "VRAM[%u, %u]: %pa, %pa\n",
 				 id, gt->info.vram_id, &gt->mem.vram.io_start,
@@ -266,6 +283,7 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 	} else {
 		gt->mem.vram.size = xe->mem.vram.size;
 		gt->mem.vram.io_start = xe->mem.vram.io_start;
+		gt->mem.vram.io_size = xe->mem.vram.io_size;
 		gt->mem.vram.mapping = xe->mem.vram.mapping;
 
 		drm_info(&xe->drm, "VRAM: %pa\n", &gt->mem.vram.size);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Intel-xe] [PATCH v3 4/6] drm/xe/buddy: remove the virtualized start
  2023-03-08 12:30 [Intel-xe] [PATCH v3 0/6] Some small-bar prep patches Matthew Auld
                   ` (2 preceding siblings ...)
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 3/6] drm/xe/vram: start tracking the io_size Matthew Auld
@ 2023-03-08 12:30 ` Matthew Auld
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 5/6] drm/xe/buddy: add visible tracking Matthew Auld
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Matthew Auld @ 2023-03-08 12:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi

Hopefully not needed anymore. We can add a .compatible() hook once we
need to differentiate between mappable and non-mappable vram. If the
allocation is not contiguous then the start value is kind of
meaningless, so rather just mark as invalid.

In upstream, TTM wants to eventually remove the ttm_resource.start
usage.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c           |  6 ++++++
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 29 ++++++++++++++--------------
 2 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 73a7f2cd4ad8..346573f406c2 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -662,6 +662,12 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 				void *new_addr = gt->mem.vram.mapping +
 					(new_mem->start << PAGE_SHIFT);
 
+				if (XE_WARN_ON(new_mem->start == XE_BO_INVALID_OFFSET)) {
+					ret = -EINVAL;
+					xe_device_mem_access_put(xe);
+					goto out;
+				}
+
 				XE_BUG_ON(new_mem->start !=
 					  bo->placements->fpfn);
 
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index 643365b18bc7..e3ac8c6d3978 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -54,7 +54,6 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
 	struct xe_ttm_vram_mgr_resource *vres;
 	u64 size, remaining_size, lpfn, fpfn;
 	struct drm_buddy *mm = &mgr->mm;
-	struct drm_buddy_block *block;
 	unsigned long pages_per_block;
 	int r;
 
@@ -186,24 +185,24 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
 			list_splice_tail(trim_list, &vres->blocks);
 	}
 
-	vres->base.start = 0;
-	list_for_each_entry(block, &vres->blocks, link) {
-		unsigned long start;
+	if (!(vres->base.placement & TTM_PL_FLAG_CONTIGUOUS) &&
+	    xe_is_vram_mgr_blocks_contiguous(mm, &vres->blocks))
+		vres->base.placement |= TTM_PL_FLAG_CONTIGUOUS;
 
-		start = drm_buddy_block_offset(block) +
-			drm_buddy_block_size(mm, block);
-		start >>= PAGE_SHIFT;
+	/*
+	 * For some kernel objects we still rely on the start when io mapping
+	 * the object.
+	 */
+	if (vres->base.placement & TTM_PL_FLAG_CONTIGUOUS) {
+		struct drm_buddy_block *block = list_first_entry(&vres->blocks,
+								 typeof(*block),
+								 link);
 
-		if (start > PFN_UP(vres->base.size))
-			start -= PFN_UP(vres->base.size);
-		else
-			start = 0;
-		vres->base.start = max(vres->base.start, start);
+		vres->base.start = drm_buddy_block_offset(block) >> PAGE_SHIFT;
+	} else {
+		vres->base.start = XE_BO_INVALID_OFFSET;
 	}
 
-	if (xe_is_vram_mgr_blocks_contiguous(mm, &vres->blocks))
-		vres->base.placement |= TTM_PL_FLAG_CONTIGUOUS;
-
 	*res = &vres->base;
 	return 0;
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Intel-xe] [PATCH v3 5/6] drm/xe/buddy: add visible tracking
  2023-03-08 12:30 [Intel-xe] [PATCH v3 0/6] Some small-bar prep patches Matthew Auld
                   ` (3 preceding siblings ...)
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 4/6] drm/xe/buddy: remove the virtualized start Matthew Auld
@ 2023-03-08 12:30 ` Matthew Auld
  2023-03-08 13:40   ` Rodrigo Vivi
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 6/6] drm/xe/buddy: add compatible and intersects hooks Matthew Auld
  2023-03-13 16:05 ` [Intel-xe] ✗ CI.Patch_applied: failure for Some small-bar prep patches (rev3) Patchwork
  6 siblings, 1 reply; 10+ messages in thread
From: Matthew Auld @ 2023-03-08 12:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi

Replace the allocation code with the i915 version. This simplifies the
code a little, and importantly we get the accounting at the mgr level,
which is useful for debug (and maybe userspace), plus per resource
tracking so we can easily check if a resource is using one or pages in
the mappable part of vram (useful for eviction), or if the resource is
completely within the mappable portion (useful for checking if the
resource can be safely CPU mapped).

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c     |  18 +-
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c       | 201 ++++++++++-----------
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.h       |   3 +-
 drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h |   6 +
 4 files changed, 118 insertions(+), 110 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
index 0ad0f1820f4b..7e403dccf9d5 100644
--- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
@@ -129,7 +129,7 @@ void xe_ttm_stolen_mgr_init(struct xe_device *xe)
 {
 	struct xe_ttm_stolen_mgr *mgr = drmm_kzalloc(&xe->drm, sizeof(*mgr), GFP_KERNEL);
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
-	u64 stolen_size, pgsize;
+	u64 stolen_size, io_size, pgsize;
 	int err;
 
 	if (IS_DGFX(xe))
@@ -148,7 +148,17 @@ void xe_ttm_stolen_mgr_init(struct xe_device *xe)
 	if (pgsize < PAGE_SIZE)
 		pgsize = PAGE_SIZE;
 
-	err = __xe_ttm_vram_mgr_init(xe, &mgr->base, XE_PL_STOLEN, stolen_size, pgsize);
+	/*
+	 * We don't try to attempt partial visible support for stolen vram,
+	 * since stolen is always at the end of vram, and the BAR size is pretty
+	 * much always 256M, with small-bar.
+	 */
+	io_size = 0;
+	if (mgr->io_base && !xe_ttm_stolen_cpu_access_needs_ggtt(xe))
+		io_size = stolen_size;
+
+	err = __xe_ttm_vram_mgr_init(xe, &mgr->base, XE_PL_STOLEN, stolen_size,
+				     io_size, pgsize);
 	if (err) {
 		drm_dbg_kms(&xe->drm, "Stolen mgr init failed: %i\n", err);
 		return;
@@ -157,8 +167,8 @@ void xe_ttm_stolen_mgr_init(struct xe_device *xe)
 	drm_dbg_kms(&xe->drm, "Initialized stolen memory support with %llu bytes\n",
 		    stolen_size);
 
-	if (mgr->io_base && !xe_ttm_stolen_cpu_access_needs_ggtt(xe))
-		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, stolen_size);
+	if (io_size)
+		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, io_size);
 }
 
 u64 xe_ttm_stolen_io_offset(struct xe_bo *bo, u32 offset)
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index e3ac8c6d3978..f703e962f499 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -49,45 +49,26 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
 			       const struct ttm_place *place,
 			       struct ttm_resource **res)
 {
-	u64 max_bytes, cur_size, min_block_size;
 	struct xe_ttm_vram_mgr *mgr = to_xe_ttm_vram_mgr(man);
 	struct xe_ttm_vram_mgr_resource *vres;
-	u64 size, remaining_size, lpfn, fpfn;
 	struct drm_buddy *mm = &mgr->mm;
-	unsigned long pages_per_block;
-	int r;
-
-	lpfn = (u64)place->lpfn << PAGE_SHIFT;
-	if (!lpfn || lpfn > man->size)
-		lpfn = man->size;
-
-	fpfn = (u64)place->fpfn << PAGE_SHIFT;
+	u64 size, remaining_size, min_page_size;
+	unsigned long lpfn;
+	int err;
 
-	max_bytes = mgr->manager.size;
-	if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
-		pages_per_block = ~0ul;
-	} else {
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-		pages_per_block = HPAGE_PMD_NR;
-#else
-		/* default to 2MB */
-		pages_per_block = 2UL << (20UL - PAGE_SHIFT);
-#endif
-
-		pages_per_block = max_t(uint32_t, pages_per_block,
-					tbo->page_alignment);
-	}
+	lpfn = place->lpfn;
+	if (!lpfn || lpfn > man->size >> PAGE_SHIFT)
+		lpfn = man->size >> PAGE_SHIFT;
 
 	vres = kzalloc(sizeof(*vres), GFP_KERNEL);
 	if (!vres)
 		return -ENOMEM;
 
 	ttm_resource_init(tbo, place, &vres->base);
-	remaining_size = vres->base.size;
 
 	/* bail out quickly if there's likely not enough VRAM for this BO */
-	if (ttm_resource_manager_usage(man) > max_bytes) {
-		r = -ENOSPC;
+	if (ttm_resource_manager_usage(man) > man->size) {
+		err = -ENOSPC;
 		goto error_fini;
 	}
 
@@ -96,95 +77,91 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
 	if (place->flags & TTM_PL_FLAG_TOPDOWN)
 		vres->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
 
-	if (fpfn || lpfn != man->size)
-		/* Allocate blocks in desired range */
+	if (place->fpfn || lpfn != man->size >> PAGE_SHIFT)
 		vres->flags |= DRM_BUDDY_RANGE_ALLOCATION;
 
-	mutex_lock(&mgr->lock);
-	while (remaining_size) {
-		if (tbo->page_alignment)
-			min_block_size = tbo->page_alignment << PAGE_SHIFT;
-		else
-			min_block_size = mgr->default_page_size;
-
-		XE_BUG_ON(min_block_size < mm->chunk_size);
-
-		/* Limit maximum size to 2GiB due to SG table limitations */
-		size = min(remaining_size, 2ULL << 30);
-
-		if (size >= pages_per_block << PAGE_SHIFT)
-			min_block_size = pages_per_block << PAGE_SHIFT;
-
-		cur_size = size;
-
-		if (fpfn + size != place->lpfn << PAGE_SHIFT) {
-			/*
-			 * Except for actual range allocation, modify the size and
-			 * min_block_size conforming to continuous flag enablement
-			 */
-			if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
-				size = roundup_pow_of_two(size);
-				min_block_size = size;
-			/*
-			 * Modify the size value if size is not
-			 * aligned with min_block_size
-			 */
-			} else if (!IS_ALIGNED(size, min_block_size)) {
-				size = round_up(size, min_block_size);
-			}
-		}
+	XE_BUG_ON(!vres->base.size);
+	size = vres->base.size;
 
-		r = drm_buddy_alloc_blocks(mm, fpfn,
-					   lpfn,
-					   size,
-					   min_block_size,
-					   &vres->blocks,
-					   vres->flags);
-		if (unlikely(r))
-			goto error_free_blocks;
+	min_page_size = mgr->default_page_size;
+	if (tbo->page_alignment)
+		min_page_size = tbo->page_alignment << PAGE_SHIFT;
 
-		if (size > remaining_size)
-			remaining_size = 0;
-		else
-			remaining_size -= size;
+	XE_BUG_ON(min_page_size < mm->chunk_size);
+	XE_BUG_ON(min_page_size > SZ_2G); /* FIXME: sg limit */
+	XE_BUG_ON(size > SZ_2G &&
+		  (vres->base.placement & TTM_PL_FLAG_CONTIGUOUS));
+	XE_BUG_ON(!IS_ALIGNED(size, min_page_size));
+
+	if (place->fpfn + (vres->base.size >> PAGE_SHIFT) != place->lpfn &&
+	    place->flags & TTM_PL_FLAG_CONTIGUOUS) {
+		unsigned long pages;
+
+		size = roundup_pow_of_two(size);
+		min_page_size = size;
+
+		pages = size >> ilog2(mm->chunk_size);
+		if (pages > lpfn)
+			lpfn = pages;
 	}
-	mutex_unlock(&mgr->lock);
 
-	if (cur_size != size) {
-		struct drm_buddy_block *block;
-		struct list_head *trim_list;
-		u64 original_size;
-		LIST_HEAD(temp);
+	if (size > lpfn << PAGE_SHIFT) {
+		err = -E2BIG; /* don't trigger eviction */
+		goto error_fini;
+	}
 
-		trim_list = &vres->blocks;
-		original_size = vres->base.size;
+	mutex_lock(&mgr->lock);
+	if (lpfn <= mgr->visible_size && size > mgr->visible_avail) {
+		mutex_unlock(&mgr->lock);
+		err = -ENOSPC;
+		goto error_fini;
+	}
 
+	remaining_size = size;
+	do {
 		/*
-		 * If size value is rounded up to min_block_size, trim the last
-		 * block to the required size
+		 * Limit maximum size to 2GiB due to SG table limitations.
+		 * FIXME: Should maybe be handled as part of sg construction.
 		 */
-		if (!list_is_singular(&vres->blocks)) {
-			block = list_last_entry(&vres->blocks, typeof(*block), link);
-			list_move_tail(&block->link, &temp);
-			trim_list = &temp;
-			/*
-			 * Compute the original_size value by subtracting the
-			 * last block size with (aligned size - original size)
-			 */
-			original_size = drm_buddy_block_size(mm, block) -
-				(size - cur_size);
-		}
+		u64 alloc_size = min_t(u64, remaining_size, SZ_2G);
+
+		err = drm_buddy_alloc_blocks(mm, (u64)place->fpfn << PAGE_SHIFT,
+					     (u64)lpfn << PAGE_SHIFT,
+					     alloc_size,
+					     min_page_size,
+					     &vres->blocks,
+					     vres->flags);
+		if (err)
+			goto error_free_blocks;
 
-		mutex_lock(&mgr->lock);
-		drm_buddy_block_trim(mm,
-				     original_size,
-				     trim_list);
-		mutex_unlock(&mgr->lock);
+		remaining_size -= alloc_size;
+	} while (remaining_size);
 
-		if (!list_empty(&temp))
-			list_splice_tail(trim_list, &vres->blocks);
+	if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
+		if (!drm_buddy_block_trim(mm, vres->base.size, &vres->blocks))
+			size = vres->base.size;
 	}
 
+	if (lpfn <= mgr->visible_size >> PAGE_SHIFT) {
+		vres->used_visible_size = size;
+	} else {
+		struct drm_buddy_block *block;
+
+		list_for_each_entry(block, &vres->blocks, link) {
+			u64 start = drm_buddy_block_offset(block);
+
+			if (start < mgr->visible_size) {
+				u64 end = start + drm_buddy_block_size(mm, block);
+
+				vres->used_visible_size +=
+					min(end, mgr->visible_size) - start;
+			}
+		}
+	}
+
+	mgr->visible_avail -= vres->used_visible_size;
+	mutex_unlock(&mgr->lock);
+
 	if (!(vres->base.placement & TTM_PL_FLAG_CONTIGUOUS) &&
 	    xe_is_vram_mgr_blocks_contiguous(mm, &vres->blocks))
 		vres->base.placement |= TTM_PL_FLAG_CONTIGUOUS;
@@ -213,7 +190,7 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
 	ttm_resource_fini(man, &vres->base);
 	kfree(vres);
 
-	return r;
+	return err;
 }
 
 static void xe_ttm_vram_mgr_del(struct ttm_resource_manager *man,
@@ -226,6 +203,7 @@ static void xe_ttm_vram_mgr_del(struct ttm_resource_manager *man,
 
 	mutex_lock(&mgr->lock);
 	drm_buddy_free_list(mm, &vres->blocks);
+	mgr->visible_avail += vres->used_visible_size;
 	mutex_unlock(&mgr->lock);
 
 	ttm_resource_fini(man, res);
@@ -240,6 +218,13 @@ static void xe_ttm_vram_mgr_debug(struct ttm_resource_manager *man,
 	struct drm_buddy *mm = &mgr->mm;
 
 	mutex_lock(&mgr->lock);
+	drm_printf(printer, "default_page_size: %lluKiB\n",
+		   mgr->default_page_size >> 10);
+	drm_printf(printer, "visible_avail: %lluMiB\n",
+		   (u64)mgr->visible_avail >> 20);
+	drm_printf(printer, "visible_size: %lluMiB\n",
+		   (u64)mgr->visible_size >> 20);
+
 	drm_buddy_print(mm, printer);
 	mutex_unlock(&mgr->lock);
 	drm_printf(printer, "man size:%llu\n", man->size);
@@ -262,6 +247,8 @@ static void ttm_vram_mgr_fini(struct drm_device *dev, void *arg)
 	if (ttm_resource_manager_evict_all(&xe->ttm, man))
 		return;
 
+	WARN_ON_ONCE(mgr->visible_avail != mgr->visible_size);
+
 	drm_buddy_fini(&mgr->mm);
 
 	ttm_resource_manager_cleanup(&mgr->manager);
@@ -270,7 +257,8 @@ static void ttm_vram_mgr_fini(struct drm_device *dev, void *arg)
 }
 
 int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
-			   u32 mem_type, u64 size, u64 default_page_size)
+			   u32 mem_type, u64 size, u64 io_size,
+			   u64 default_page_size)
 {
 	struct ttm_resource_manager *man = &mgr->manager;
 	int err;
@@ -279,6 +267,8 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
 	mgr->mem_type = mem_type;
 	mutex_init(&mgr->lock);
 	mgr->default_page_size = default_page_size;
+	mgr->visible_size = io_size;
+	mgr->visible_avail = io_size;
 
 	ttm_resource_manager_init(man, &xe->ttm, size);
 	err = drm_buddy_init(&mgr->mm, man->size, default_page_size);
@@ -298,7 +288,8 @@ int xe_ttm_vram_mgr_init(struct xe_gt *gt, struct xe_ttm_vram_mgr *mgr)
 	mgr->gt = gt;
 
 	return __xe_ttm_vram_mgr_init(xe, mgr, XE_PL_VRAM0 + gt->info.vram_id,
-				      gt->mem.vram.size, PAGE_SIZE);
+				      gt->mem.vram.size, gt->mem.vram.io_size,
+				      PAGE_SIZE);
 }
 
 int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe,
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
index 78f332d26224..35e5367a79fb 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
@@ -13,7 +13,8 @@ struct xe_device;
 struct xe_gt;
 
 int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
-			   u32 mem_type, u64 size, u64 default_page_size);
+			   u32 mem_type, u64 size, u64 io_size,
+			   u64 default_page_size);
 int xe_ttm_vram_mgr_init(struct xe_gt *gt, struct xe_ttm_vram_mgr *mgr);
 int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe,
 			      struct ttm_resource *res,
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
index 39aa2ec1b968..3d9417ff7434 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
@@ -23,6 +23,10 @@ struct xe_ttm_vram_mgr {
 	struct ttm_resource_manager manager;
 	/** @mm: DRM buddy allocator which manages the VRAM */
 	struct drm_buddy mm;
+	/** @visible_size: Proped size of the CPU visible portion */
+	u64 visible_size;
+	/** @visible_avail: CPU visible portion still unallocated */
+	u64 visible_avail;
 	/** @default_page_size: default page size */
 	u64 default_page_size;
 	/** @lock: protects allocations of VRAM */
@@ -39,6 +43,8 @@ struct xe_ttm_vram_mgr_resource {
 	struct ttm_resource base;
 	/** @blocks: list of DRM buddy blocks */
 	struct list_head blocks;
+	/** @used_visible_size: How many CPU visible bytes this resource is using */
+	u64 used_visible_size;
 	/** @flags: flags associated with the resource */
 	unsigned long flags;
 };
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Intel-xe] [PATCH v3 5/6] drm/xe/buddy: add visible tracking
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 5/6] drm/xe/buddy: add visible tracking Matthew Auld
@ 2023-03-08 13:40   ` Rodrigo Vivi
  0 siblings, 0 replies; 10+ messages in thread
From: Rodrigo Vivi @ 2023-03-08 13:40 UTC (permalink / raw)
  To: Matthew Auld, Thomas Hellström; +Cc: Lucas De Marchi, intel-xe

On Wed, Mar 08, 2023 at 12:30:11PM +0000, Matthew Auld wrote:
> Replace the allocation code with the i915 version. This simplifies the
> code a little, and importantly we get the accounting at the mgr level,
> which is useful for debug (and maybe userspace), plus per resource
> tracking so we can easily check if a resource is using one or pages in
> the mappable part of vram (useful for eviction), or if the resource is
> completely within the mappable portion (useful for checking if the
> resource can be safely CPU mapped).
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>

> ---
>  drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c     |  18 +-
>  drivers/gpu/drm/xe/xe_ttm_vram_mgr.c       | 201 ++++++++++-----------
>  drivers/gpu/drm/xe/xe_ttm_vram_mgr.h       |   3 +-
>  drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h |   6 +
>  4 files changed, 118 insertions(+), 110 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> index 0ad0f1820f4b..7e403dccf9d5 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> @@ -129,7 +129,7 @@ void xe_ttm_stolen_mgr_init(struct xe_device *xe)
>  {
>  	struct xe_ttm_stolen_mgr *mgr = drmm_kzalloc(&xe->drm, sizeof(*mgr), GFP_KERNEL);
>  	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> -	u64 stolen_size, pgsize;
> +	u64 stolen_size, io_size, pgsize;
>  	int err;
>  
>  	if (IS_DGFX(xe))
> @@ -148,7 +148,17 @@ void xe_ttm_stolen_mgr_init(struct xe_device *xe)
>  	if (pgsize < PAGE_SIZE)
>  		pgsize = PAGE_SIZE;
>  
> -	err = __xe_ttm_vram_mgr_init(xe, &mgr->base, XE_PL_STOLEN, stolen_size, pgsize);
> +	/*
> +	 * We don't try to attempt partial visible support for stolen vram,
> +	 * since stolen is always at the end of vram, and the BAR size is pretty
> +	 * much always 256M, with small-bar.
> +	 */
> +	io_size = 0;
> +	if (mgr->io_base && !xe_ttm_stolen_cpu_access_needs_ggtt(xe))
> +		io_size = stolen_size;
> +
> +	err = __xe_ttm_vram_mgr_init(xe, &mgr->base, XE_PL_STOLEN, stolen_size,
> +				     io_size, pgsize);
>  	if (err) {
>  		drm_dbg_kms(&xe->drm, "Stolen mgr init failed: %i\n", err);
>  		return;
> @@ -157,8 +167,8 @@ void xe_ttm_stolen_mgr_init(struct xe_device *xe)
>  	drm_dbg_kms(&xe->drm, "Initialized stolen memory support with %llu bytes\n",
>  		    stolen_size);
>  
> -	if (mgr->io_base && !xe_ttm_stolen_cpu_access_needs_ggtt(xe))
> -		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, stolen_size);
> +	if (io_size)
> +		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, io_size);
>  }
>  
>  u64 xe_ttm_stolen_io_offset(struct xe_bo *bo, u32 offset)
> diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> index e3ac8c6d3978..f703e962f499 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> @@ -49,45 +49,26 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
>  			       const struct ttm_place *place,
>  			       struct ttm_resource **res)
>  {
> -	u64 max_bytes, cur_size, min_block_size;
>  	struct xe_ttm_vram_mgr *mgr = to_xe_ttm_vram_mgr(man);
>  	struct xe_ttm_vram_mgr_resource *vres;
> -	u64 size, remaining_size, lpfn, fpfn;
>  	struct drm_buddy *mm = &mgr->mm;
> -	unsigned long pages_per_block;
> -	int r;
> -
> -	lpfn = (u64)place->lpfn << PAGE_SHIFT;
> -	if (!lpfn || lpfn > man->size)
> -		lpfn = man->size;
> -
> -	fpfn = (u64)place->fpfn << PAGE_SHIFT;
> +	u64 size, remaining_size, min_page_size;
> +	unsigned long lpfn;
> +	int err;
>  
> -	max_bytes = mgr->manager.size;
> -	if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
> -		pages_per_block = ~0ul;
> -	} else {
> -#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> -		pages_per_block = HPAGE_PMD_NR;
> -#else
> -		/* default to 2MB */
> -		pages_per_block = 2UL << (20UL - PAGE_SHIFT);
> -#endif
> -
> -		pages_per_block = max_t(uint32_t, pages_per_block,
> -					tbo->page_alignment);
> -	}
> +	lpfn = place->lpfn;
> +	if (!lpfn || lpfn > man->size >> PAGE_SHIFT)
> +		lpfn = man->size >> PAGE_SHIFT;
>  
>  	vres = kzalloc(sizeof(*vres), GFP_KERNEL);
>  	if (!vres)
>  		return -ENOMEM;
>  
>  	ttm_resource_init(tbo, place, &vres->base);
> -	remaining_size = vres->base.size;
>  
>  	/* bail out quickly if there's likely not enough VRAM for this BO */
> -	if (ttm_resource_manager_usage(man) > max_bytes) {
> -		r = -ENOSPC;
> +	if (ttm_resource_manager_usage(man) > man->size) {
> +		err = -ENOSPC;
>  		goto error_fini;
>  	}
>  
> @@ -96,95 +77,91 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
>  	if (place->flags & TTM_PL_FLAG_TOPDOWN)
>  		vres->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
>  
> -	if (fpfn || lpfn != man->size)
> -		/* Allocate blocks in desired range */
> +	if (place->fpfn || lpfn != man->size >> PAGE_SHIFT)
>  		vres->flags |= DRM_BUDDY_RANGE_ALLOCATION;
>  
> -	mutex_lock(&mgr->lock);
> -	while (remaining_size) {
> -		if (tbo->page_alignment)
> -			min_block_size = tbo->page_alignment << PAGE_SHIFT;
> -		else
> -			min_block_size = mgr->default_page_size;
> -
> -		XE_BUG_ON(min_block_size < mm->chunk_size);
> -
> -		/* Limit maximum size to 2GiB due to SG table limitations */
> -		size = min(remaining_size, 2ULL << 30);
> -
> -		if (size >= pages_per_block << PAGE_SHIFT)
> -			min_block_size = pages_per_block << PAGE_SHIFT;
> -
> -		cur_size = size;
> -
> -		if (fpfn + size != place->lpfn << PAGE_SHIFT) {
> -			/*
> -			 * Except for actual range allocation, modify the size and
> -			 * min_block_size conforming to continuous flag enablement
> -			 */
> -			if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
> -				size = roundup_pow_of_two(size);
> -				min_block_size = size;
> -			/*
> -			 * Modify the size value if size is not
> -			 * aligned with min_block_size
> -			 */
> -			} else if (!IS_ALIGNED(size, min_block_size)) {
> -				size = round_up(size, min_block_size);
> -			}
> -		}
> +	XE_BUG_ON(!vres->base.size);
> +	size = vres->base.size;
>  
> -		r = drm_buddy_alloc_blocks(mm, fpfn,
> -					   lpfn,
> -					   size,
> -					   min_block_size,
> -					   &vres->blocks,
> -					   vres->flags);
> -		if (unlikely(r))
> -			goto error_free_blocks;
> +	min_page_size = mgr->default_page_size;
> +	if (tbo->page_alignment)
> +		min_page_size = tbo->page_alignment << PAGE_SHIFT;
>  
> -		if (size > remaining_size)
> -			remaining_size = 0;
> -		else
> -			remaining_size -= size;
> +	XE_BUG_ON(min_page_size < mm->chunk_size);
> +	XE_BUG_ON(min_page_size > SZ_2G); /* FIXME: sg limit */
> +	XE_BUG_ON(size > SZ_2G &&
> +		  (vres->base.placement & TTM_PL_FLAG_CONTIGUOUS));
> +	XE_BUG_ON(!IS_ALIGNED(size, min_page_size));
> +
> +	if (place->fpfn + (vres->base.size >> PAGE_SHIFT) != place->lpfn &&
> +	    place->flags & TTM_PL_FLAG_CONTIGUOUS) {
> +		unsigned long pages;
> +
> +		size = roundup_pow_of_two(size);
> +		min_page_size = size;
> +
> +		pages = size >> ilog2(mm->chunk_size);
> +		if (pages > lpfn)
> +			lpfn = pages;
>  	}
> -	mutex_unlock(&mgr->lock);
>  
> -	if (cur_size != size) {
> -		struct drm_buddy_block *block;
> -		struct list_head *trim_list;
> -		u64 original_size;
> -		LIST_HEAD(temp);
> +	if (size > lpfn << PAGE_SHIFT) {
> +		err = -E2BIG; /* don't trigger eviction */
> +		goto error_fini;
> +	}
>  
> -		trim_list = &vres->blocks;
> -		original_size = vres->base.size;
> +	mutex_lock(&mgr->lock);
> +	if (lpfn <= mgr->visible_size && size > mgr->visible_avail) {
> +		mutex_unlock(&mgr->lock);
> +		err = -ENOSPC;
> +		goto error_fini;
> +	}
>  
> +	remaining_size = size;
> +	do {
>  		/*
> -		 * If size value is rounded up to min_block_size, trim the last
> -		 * block to the required size
> +		 * Limit maximum size to 2GiB due to SG table limitations.
> +		 * FIXME: Should maybe be handled as part of sg construction.
>  		 */
> -		if (!list_is_singular(&vres->blocks)) {
> -			block = list_last_entry(&vres->blocks, typeof(*block), link);
> -			list_move_tail(&block->link, &temp);
> -			trim_list = &temp;
> -			/*
> -			 * Compute the original_size value by subtracting the
> -			 * last block size with (aligned size - original size)
> -			 */
> -			original_size = drm_buddy_block_size(mm, block) -
> -				(size - cur_size);
> -		}
> +		u64 alloc_size = min_t(u64, remaining_size, SZ_2G);
> +
> +		err = drm_buddy_alloc_blocks(mm, (u64)place->fpfn << PAGE_SHIFT,
> +					     (u64)lpfn << PAGE_SHIFT,
> +					     alloc_size,
> +					     min_page_size,
> +					     &vres->blocks,
> +					     vres->flags);
> +		if (err)
> +			goto error_free_blocks;
>  
> -		mutex_lock(&mgr->lock);
> -		drm_buddy_block_trim(mm,
> -				     original_size,
> -				     trim_list);
> -		mutex_unlock(&mgr->lock);
> +		remaining_size -= alloc_size;
> +	} while (remaining_size);
>  
> -		if (!list_empty(&temp))
> -			list_splice_tail(trim_list, &vres->blocks);
> +	if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
> +		if (!drm_buddy_block_trim(mm, vres->base.size, &vres->blocks))
> +			size = vres->base.size;
>  	}
>  
> +	if (lpfn <= mgr->visible_size >> PAGE_SHIFT) {
> +		vres->used_visible_size = size;
> +	} else {
> +		struct drm_buddy_block *block;
> +
> +		list_for_each_entry(block, &vres->blocks, link) {
> +			u64 start = drm_buddy_block_offset(block);
> +
> +			if (start < mgr->visible_size) {
> +				u64 end = start + drm_buddy_block_size(mm, block);
> +
> +				vres->used_visible_size +=
> +					min(end, mgr->visible_size) - start;
> +			}
> +		}
> +	}
> +
> +	mgr->visible_avail -= vres->used_visible_size;
> +	mutex_unlock(&mgr->lock);
> +
>  	if (!(vres->base.placement & TTM_PL_FLAG_CONTIGUOUS) &&
>  	    xe_is_vram_mgr_blocks_contiguous(mm, &vres->blocks))
>  		vres->base.placement |= TTM_PL_FLAG_CONTIGUOUS;
> @@ -213,7 +190,7 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
>  	ttm_resource_fini(man, &vres->base);
>  	kfree(vres);
>  
> -	return r;
> +	return err;
>  }
>  
>  static void xe_ttm_vram_mgr_del(struct ttm_resource_manager *man,
> @@ -226,6 +203,7 @@ static void xe_ttm_vram_mgr_del(struct ttm_resource_manager *man,
>  
>  	mutex_lock(&mgr->lock);
>  	drm_buddy_free_list(mm, &vres->blocks);
> +	mgr->visible_avail += vres->used_visible_size;
>  	mutex_unlock(&mgr->lock);
>  
>  	ttm_resource_fini(man, res);
> @@ -240,6 +218,13 @@ static void xe_ttm_vram_mgr_debug(struct ttm_resource_manager *man,
>  	struct drm_buddy *mm = &mgr->mm;
>  
>  	mutex_lock(&mgr->lock);
> +	drm_printf(printer, "default_page_size: %lluKiB\n",
> +		   mgr->default_page_size >> 10);
> +	drm_printf(printer, "visible_avail: %lluMiB\n",
> +		   (u64)mgr->visible_avail >> 20);
> +	drm_printf(printer, "visible_size: %lluMiB\n",
> +		   (u64)mgr->visible_size >> 20);
> +
>  	drm_buddy_print(mm, printer);
>  	mutex_unlock(&mgr->lock);
>  	drm_printf(printer, "man size:%llu\n", man->size);
> @@ -262,6 +247,8 @@ static void ttm_vram_mgr_fini(struct drm_device *dev, void *arg)
>  	if (ttm_resource_manager_evict_all(&xe->ttm, man))
>  		return;
>  
> +	WARN_ON_ONCE(mgr->visible_avail != mgr->visible_size);
> +
>  	drm_buddy_fini(&mgr->mm);
>  
>  	ttm_resource_manager_cleanup(&mgr->manager);
> @@ -270,7 +257,8 @@ static void ttm_vram_mgr_fini(struct drm_device *dev, void *arg)
>  }
>  
>  int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
> -			   u32 mem_type, u64 size, u64 default_page_size)
> +			   u32 mem_type, u64 size, u64 io_size,
> +			   u64 default_page_size)
>  {
>  	struct ttm_resource_manager *man = &mgr->manager;
>  	int err;
> @@ -279,6 +267,8 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
>  	mgr->mem_type = mem_type;
>  	mutex_init(&mgr->lock);
>  	mgr->default_page_size = default_page_size;
> +	mgr->visible_size = io_size;
> +	mgr->visible_avail = io_size;
>  
>  	ttm_resource_manager_init(man, &xe->ttm, size);
>  	err = drm_buddy_init(&mgr->mm, man->size, default_page_size);
> @@ -298,7 +288,8 @@ int xe_ttm_vram_mgr_init(struct xe_gt *gt, struct xe_ttm_vram_mgr *mgr)
>  	mgr->gt = gt;
>  
>  	return __xe_ttm_vram_mgr_init(xe, mgr, XE_PL_VRAM0 + gt->info.vram_id,
> -				      gt->mem.vram.size, PAGE_SIZE);
> +				      gt->mem.vram.size, gt->mem.vram.io_size,
> +				      PAGE_SIZE);
>  }
>  
>  int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe,
> diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
> index 78f332d26224..35e5367a79fb 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
> +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
> @@ -13,7 +13,8 @@ struct xe_device;
>  struct xe_gt;
>  
>  int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
> -			   u32 mem_type, u64 size, u64 default_page_size);
> +			   u32 mem_type, u64 size, u64 io_size,
> +			   u64 default_page_size);
>  int xe_ttm_vram_mgr_init(struct xe_gt *gt, struct xe_ttm_vram_mgr *mgr);
>  int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe,
>  			      struct ttm_resource *res,
> diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
> index 39aa2ec1b968..3d9417ff7434 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
> +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
> @@ -23,6 +23,10 @@ struct xe_ttm_vram_mgr {
>  	struct ttm_resource_manager manager;
>  	/** @mm: DRM buddy allocator which manages the VRAM */
>  	struct drm_buddy mm;
> +	/** @visible_size: Proped size of the CPU visible portion */
> +	u64 visible_size;
> +	/** @visible_avail: CPU visible portion still unallocated */
> +	u64 visible_avail;
>  	/** @default_page_size: default page size */
>  	u64 default_page_size;
>  	/** @lock: protects allocations of VRAM */
> @@ -39,6 +43,8 @@ struct xe_ttm_vram_mgr_resource {
>  	struct ttm_resource base;
>  	/** @blocks: list of DRM buddy blocks */
>  	struct list_head blocks;
> +	/** @used_visible_size: How many CPU visible bytes this resource is using */
> +	u64 used_visible_size;
>  	/** @flags: flags associated with the resource */
>  	unsigned long flags;
>  };
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Intel-xe] [PATCH v3 6/6] drm/xe/buddy: add compatible and intersects hooks
  2023-03-08 12:30 [Intel-xe] [PATCH v3 0/6] Some small-bar prep patches Matthew Auld
                   ` (4 preceding siblings ...)
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 5/6] drm/xe/buddy: add visible tracking Matthew Auld
@ 2023-03-08 12:30 ` Matthew Auld
  2023-03-13 16:05 ` [Intel-xe] ✗ CI.Patch_applied: failure for Some small-bar prep patches (rev3) Patchwork
  6 siblings, 0 replies; 10+ messages in thread
From: Matthew Auld @ 2023-03-08 12:30 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi

Copy this from i915. We need .compatible for lmem -> lmem transfers, so
they don't just get nooped by ttm, if need to move something from
mappable to non-mappble or vice versa. The .intersects is needed for
eviction, to determine if a victim resource is worth eviction. e.g if we
need mappable space there is no point in evicting a resource that has
zero mappable pages.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 62 ++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index f703e962f499..8dd33ac65499 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -230,9 +230,71 @@ static void xe_ttm_vram_mgr_debug(struct ttm_resource_manager *man,
 	drm_printf(printer, "man size:%llu\n", man->size);
 }
 
+static bool xe_ttm_vram_mgr_intersects(struct ttm_resource_manager *man,
+				       struct ttm_resource *res,
+				       const struct ttm_place *place,
+				       size_t size)
+{
+	struct xe_ttm_vram_mgr *mgr = to_xe_ttm_vram_mgr(man);
+	struct xe_ttm_vram_mgr_resource *vres =
+		to_xe_ttm_vram_mgr_resource(res);
+	struct drm_buddy *mm = &mgr->mm;
+	struct drm_buddy_block *block;
+
+	if (!place->fpfn && !place->lpfn)
+		return true;
+
+	if (!place->fpfn && place->lpfn == mgr->visible_size >> PAGE_SHIFT)
+		return vres->used_visible_size > 0;
+
+	list_for_each_entry(block, &vres->blocks, link) {
+		unsigned long fpfn =
+			drm_buddy_block_offset(block) >> PAGE_SHIFT;
+		unsigned long lpfn = fpfn +
+			(drm_buddy_block_size(mm, block) >> PAGE_SHIFT);
+
+		if (place->fpfn < lpfn && place->lpfn > fpfn)
+			return true;
+	}
+
+	return false;
+}
+
+static bool xe_ttm_vram_mgr_compatible(struct ttm_resource_manager *man,
+				       struct ttm_resource *res,
+				       const struct ttm_place *place,
+				       size_t size)
+{
+	struct xe_ttm_vram_mgr *mgr = to_xe_ttm_vram_mgr(man);
+	struct xe_ttm_vram_mgr_resource *vres =
+		to_xe_ttm_vram_mgr_resource(res);
+	struct drm_buddy *mm = &mgr->mm;
+	struct drm_buddy_block *block;
+
+	if (!place->fpfn && !place->lpfn)
+		return true;
+
+	if (!place->fpfn && place->lpfn == mgr->visible_size >> PAGE_SHIFT)
+		return vres->used_visible_size == size;
+
+	list_for_each_entry(block, &vres->blocks, link) {
+		unsigned long fpfn =
+			drm_buddy_block_offset(block) >> PAGE_SHIFT;
+		unsigned long lpfn = fpfn +
+			(drm_buddy_block_size(mm, block) >> PAGE_SHIFT);
+
+		if (fpfn < place->fpfn || lpfn > place->lpfn)
+			return false;
+	}
+
+	return true;
+}
+
 static const struct ttm_resource_manager_func xe_ttm_vram_mgr_func = {
 	.alloc	= xe_ttm_vram_mgr_new,
 	.free	= xe_ttm_vram_mgr_del,
+	.intersects = xe_ttm_vram_mgr_intersects,
+	.compatible = xe_ttm_vram_mgr_compatible,
 	.debug	= xe_ttm_vram_mgr_debug
 };
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Intel-xe] ✗ CI.Patch_applied: failure for Some small-bar prep patches (rev3)
  2023-03-08 12:30 [Intel-xe] [PATCH v3 0/6] Some small-bar prep patches Matthew Auld
                   ` (5 preceding siblings ...)
  2023-03-08 12:30 ` [Intel-xe] [PATCH v3 6/6] drm/xe/buddy: add compatible and intersects hooks Matthew Auld
@ 2023-03-13 16:05 ` Patchwork
  6 siblings, 0 replies; 10+ messages in thread
From: Patchwork @ 2023-03-13 16:05 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-xe

== Series Details ==

Series: Some small-bar prep patches (rev3)
URL   : https://patchwork.freedesktop.org/series/114521/
State : failure

== Summary ==

=== Applying kernel patches on branch 'drm-xe-next' with base: ===
commit 91440895e3fe91928c6e8a682e87d2b8ca677d82
Author:     Lucas De Marchi <lucas.demarchi@intel.com>
AuthorDate: Wed Mar 1 01:31:12 2023 -0800
Commit:     Lucas De Marchi <lucas.demarchi@intel.com>
CommitDate: Fri Mar 10 13:52:30 2023 -0800

    drm/xe/pvc: Remove A* steppings
    
    The PVC pre-production A* steppings are not going to be supported in xe
    driver - the steppings are important for the WAs and since we are not
    adding the pre-productions ones, there is no need to add the stepping.
    
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
=== git am output follows ===
error: patch failed: drivers/gpu/drm/xe/Kconfig.debug:41
error: drivers/gpu/drm/xe/Kconfig.debug: patch does not apply
error: patch failed: drivers/gpu/drm/xe/tests/xe_migrate.c:129
error: drivers/gpu/drm/xe/tests/xe_migrate.c: patch does not apply
error: patch failed: drivers/gpu/drm/xe/xe_bo.c:1290
error: drivers/gpu/drm/xe/xe_bo.c: patch does not apply
error: patch failed: drivers/gpu/drm/xe/xe_bo.h:196
error: drivers/gpu/drm/xe/xe_bo.h: patch does not apply
error: patch failed: drivers/gpu/drm/xe/xe_ggtt.c:28
error: drivers/gpu/drm/xe/xe_ggtt.c: patch does not apply
error: patch failed: drivers/gpu/drm/xe/xe_migrate.c:222
error: drivers/gpu/drm/xe/xe_migrate.c: patch does not apply
error: patch failed: drivers/gpu/drm/xe/xe_mmio.c:68
error: drivers/gpu/drm/xe/xe_mmio.c: patch does not apply
error: patch failed: drivers/gpu/drm/xe/xe_module.c:22
error: drivers/gpu/drm/xe/xe_module.c: patch does not apply
error: patch failed: drivers/gpu/drm/xe/xe_module.h:8
error: drivers/gpu/drm/xe/xe_module.h: patch does not apply
error: patch failed: drivers/gpu/drm/xe/xe_pt.c:62
error: drivers/gpu/drm/xe/xe_pt.c: patch does not apply
error: patch failed: drivers/gpu/drm/xe/xe_vm.c:3355
error: drivers/gpu/drm/xe/xe_vm.c: patch does not apply
hint: Use 'git am --show-current-patch' to see the failed patch
Applying: drm/xe: add xe_ttm_stolen_cpu_access_needs_ggtt()
Applying: drm/xe: s/lmem/vram/
Patch failed at 0002 drm/xe: s/lmem/vram/
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-03-13 16:05 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-03-08 12:30 [Intel-xe] [PATCH v3 0/6] Some small-bar prep patches Matthew Auld
2023-03-08 12:30 ` [Intel-xe] [PATCH v3 1/6] drm/xe: add xe_ttm_stolen_cpu_access_needs_ggtt() Matthew Auld
2023-03-08 12:30 ` [Intel-xe] [PATCH v3 2/6] drm/xe: s/lmem/vram/ Matthew Auld
2023-03-08 13:37   ` Rodrigo Vivi
2023-03-08 12:30 ` [Intel-xe] [PATCH v3 3/6] drm/xe/vram: start tracking the io_size Matthew Auld
2023-03-08 12:30 ` [Intel-xe] [PATCH v3 4/6] drm/xe/buddy: remove the virtualized start Matthew Auld
2023-03-08 12:30 ` [Intel-xe] [PATCH v3 5/6] drm/xe/buddy: add visible tracking Matthew Auld
2023-03-08 13:40   ` Rodrigo Vivi
2023-03-08 12:30 ` [Intel-xe] [PATCH v3 6/6] drm/xe/buddy: add compatible and intersects hooks Matthew Auld
2023-03-13 16:05 ` [Intel-xe] ✗ CI.Patch_applied: failure for Some small-bar prep patches (rev3) Patchwork

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox