Intel-XE Archive on lore.kernel.org
* [Intel-xe] [PATCH 00/26] Separate GT and tile
@ 2023-05-11  3:46 Matt Roper
  2023-05-11  3:46 ` [Intel-xe] [PATCH 01/26] drm/xe/mtl: Disable media GT Matt Roper
                   ` (34 more replies)
  0 siblings, 35 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:46 UTC (permalink / raw)
  To: intel-xe
  Cc: Lucas De Marchi, Michael J . Ruhl, Rodrigo Vivi, matthew.d.roper,
	Nirmoy Das

A 'tile' is not the same thing as a 'GT.'  For historical reasons, i915
attempted to use a single 'struct intel_gt' to represent both concepts,
although this design hasn't worked out terribly well.  For Xe we have
the opportunity to design the driver in a way that more accurately
reflects the real hardware behavior.

Different vendors use the term "tile" a bit differently, but in the
Intel world, a 'tile' is pretty close to what most people would think of
as being a complete GPU.  When multiple GPUs are placed behind a single
PCI device, that's what we refer to as a "multi-tile device."  In such
cases, pretty much all hardware is replicated per-tile, although certain
responsibilities like PCI communication, reporting of interrupts to the
OS, etc. are handled solely by the "root tile."  A multi-tile platform
takes care of tying the tiles together in a way such that interrupt
notifications from remote tiles are forwarded to the root tile, the
per-tile vram is combined into a single address space, etc.

In contrast, a "GT" (which officially stands for "Graphics Technology")
is the subset of a GPU/tile that is responsible for implementing
graphics and/or media operations.  The GT is where a lot of the driver
implementation happens since it's where the hardware engines, the
execution units, and the GuC all reside.

Historically most Intel devices were single-tile devices that contained
a single GT.  PVC is currently the only released Intel platform built on
a multi-tile design (i.e., multiple GPUs behind a single PCI device);
each PVC tile only has a single GT.  In contrast, platforms like MTL
that have separate chips for render and media IP are still only a single
logical GPU, but the graphics and media IP blocks are each exposed as a
separate GT within that single GPU.  This is important from
a software perspective because multi-GT platforms like MTL only
replicate a subset of the GPU hardware and behave differently than
multi-tile platforms like PVC where nearly everything is replicated.

This series separates tiles from GTs in a manner that more closely
matches the hardware behavior.  We now consider a PCI device (xe_device)
to contain one or more tiles (struct xe_tile).  Each tile will contain
one or two GTs (struct xe_gt).  Although we don't have any platforms yet
that are multi-tile *and* contain more than one GT per tile, that may
change in the future.  This driver redesign splits functionality as
follows:

Per-tile functionality (shared by all GTs within the tile):
 - Complete 4MB MMIO space (containing SGunit/SoC registers, GT
   registers, display registers, etc.)
 - Global GTT
 - VRAM (if discrete)
 - Interrupt flows
 - Migration context
 - kernel batchbuffer pool
 - Primary GT
 - Media GT (if media version >= 13)

Per-GT functionality:
 - GuC
 - Hardware engines
 - Programmable hardware units (subslices, EUs)
 - GSI subset of registers (multiple copies of these registers reside
   within the complete MMIO space provided by the tile, but at different
   offsets --- 0 for render, 0x380000 for media)
 - Multicast register steering
 - TLBs to cache page table translations
 - Reset capability
 - Low-level power management (e.g., C6)
 - Clock frequency
 - MOCS and PAT programming
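
To make the split above concrete, here is a minimal standalone C sketch
of the device -> tile -> GT containment.  It is not the driver code: all
sketch_* names are hypothetical, and only the two-tile limit and the
0 / 0x380000 GSI offsets are taken from this series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_TILES 2	/* same value as XE_MAX_TILES_PER_DEVICE in patch 02 */

struct sketch_device;
struct sketch_tile;

/* Per-GT state: engines, GuC, steering, etc. would live here. */
struct sketch_gt {
	struct sketch_tile *tile;	/* backpointer to the owning tile */
	uint32_t gsi_offset;		/* 0 for primary, 0x380000 for media */
};

/* Per-tile state: MMIO space, GGTT, VRAM, etc. would live here. */
struct sketch_tile {
	struct sketch_device *dev;	/* backpointer to the PCI device */
	uint8_t id;
	struct sketch_gt primary_gt;
	struct sketch_gt media_gt;
	int has_media_gt;		/* hypothetical flag: media version >= 13 */
};

struct sketch_device {
	uint8_t tile_count;
	struct sketch_tile tiles[SKETCH_MAX_TILES];
};

/* Populate every tile and wire up the backpointers. */
static void sketch_device_init(struct sketch_device *dev, uint8_t tile_count,
			       int standalone_media)
{
	uint8_t id;

	assert(tile_count >= 1 && tile_count <= SKETCH_MAX_TILES);
	memset(dev, 0, sizeof(*dev));
	dev->tile_count = tile_count;

	for (id = 0; id < tile_count; id++) {
		struct sketch_tile *tile = &dev->tiles[id];

		tile->dev = dev;
		tile->id = id;
		tile->primary_gt.tile = tile;
		tile->primary_gt.gsi_offset = 0;
		tile->has_media_gt = standalone_media;
		if (standalone_media) {
			tile->media_gt.tile = tile;
			tile->media_gt.gsi_offset = 0x380000;
		}
	}
}

/* Count GTs by walking tiles, rather than treating tiles and GTs as one. */
static unsigned int sketch_device_gt_count(const struct sketch_device *dev)
{
	unsigned int count = 0;
	uint8_t id;

	for (id = 0; id < dev->tile_count; id++)
		count += dev->tiles[id].has_media_gt ? 2 : 1;

	return count;
}
```

Under these assumptions, a PVC-like device (two tiles, no standalone
media) and an MTL-like device (one tile with standalone media) both
report two GTs, but the tile count differs, which is exactly the
distinction a single combined structure cannot express.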

At the moment I've left USM / pagefault handling at the GT level,
although I'm not familiar enough with that specific feature to know
whether it's truly correct or not.

The first patch in this series temporarily drops MTL media GT support.
The driver doesn't load properly on MTL today, largely due to the
mishandling of GT vs tile; dropping support completely allows us to more
easily make the necessary driver redesign.  The media GT is
re-enabled (properly this time) near the end of the series and this
allows the driver to load successfully without error on MTL for the
first time.  There are still issues when submitting workloads to MTL
after driver load (i.e., CAT errors), but those seem to be separate
platform-specific issues unrelated to the GT/tile work in this series
that will need to be debugged and fixed separately.


This series leaves a few open questions and FIXMEs:
 - Unlike i915, the Xe driver has chosen to expose GTs to userspace
   rather than keeping them a hidden implementation detail.  With the
   separation of xe_tile and xe_gt, we need to decide whether we also
   want to expose tiles (in addition to GTs), whether we want to _only_
   expose tiles (and keep the primary vs media GT separation a hidden
   internal detail), or something else.
 - How should GTs be numbered?  Today it's straightforward --- PVC
   assigns GT IDs 0 and 1 to the primary GT of each tile.  MTL assigns
   GT IDs 0 and 1 to the primary and media GTs of its sole tile.  But if
   we have a platform in the future that has multiple tiles _and_
   multiple GTs per tile, how should we handle the numbering in that
   case?
 - Xe (mis)design used xe_gt as the target of all MMIO operations (i.e.,
   xe_mmio_*()).  This really doesn't make sense, especially since
   there are a lot of MMIO accesses that are completely unrelated to GT
   (e.g., sgunit registers, display registers, etc.).  i915 used
   'intel_uncore' as the MMIO target, although that wasn't really an
   accurate reflection of the hardware either.  What we really want is
   something that combines the MMIO register space (stored in the tile)
   with the GSI offset (stored in the GT).  My current plan is to
   introduce an "xe_mmio_view" (name may change) in a future series that
   will serve as a target for register operations.  There will be
   sensible APIs to obtain an xe_mmio_view appropriate to the type of
   register access being performed (and that will also be able to do
   some range sanity checking in debug drivers to help catch misuse).
   That's a somewhat large/invasive change, so I'm saving that for a
   follow-up series after this one is completed.
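
As a rough illustration of the "xe_mmio_view" idea in the last item,
the standalone C sketch below combines a tile-relative register space
with a per-GT GSI adjustment.  The sketch_* names are hypothetical and
the future API may look quite different; the 0x40000 limit and 0x380000
offset are the mmio_adj_limit / mmio_adj_offset values that patch 01
removes from xelpmp_gts[], and treating everything below the limit as a
GSI register is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TILE_MMIO_SIZE	0x400000u	/* the 4MB per-tile MMIO space */
#define SKETCH_GSI_LIMIT	0x40000u	/* mmio_adj_limit from patch 01 */
#define SKETCH_MEDIA_GSI_OFFSET	0x380000u	/* mmio_adj_offset from patch 01 */

/* A view describes how register addresses map into the tile's MMIO space. */
struct sketch_mmio_view {
	uint32_t adj_limit;	/* addresses below this get adj_offset added */
	uint32_t adj_offset;	/* GSI offset of the GT this view belongs to */
};

/* View for tile-level registers and the primary GT: no adjustment. */
static const struct sketch_mmio_view sketch_primary_view = {
	.adj_limit = 0,
	.adj_offset = 0,
};

/* View for the standalone media GT: GSI registers are relocated. */
static const struct sketch_mmio_view sketch_media_view = {
	.adj_limit = SKETCH_GSI_LIMIT,
	.adj_offset = SKETCH_MEDIA_GSI_OFFSET,
};

/* Translate a register address into an offset within the tile's MMIO space. */
static uint32_t sketch_mmio_view_offset(const struct sketch_mmio_view *view,
					uint32_t reg)
{
	uint32_t offset = reg;

	if (reg < view->adj_limit)
		offset += view->adj_offset;

	/* Stand-in for the range sanity checking mentioned above. */
	assert(offset < SKETCH_TILE_MMIO_SIZE);

	return offset;
}
```

With these definitions a GSI register at 0x2000 resolves to 0x2000
through the primary view and to 0x382000 through the media view, while a
register outside the GSI range resolves to the same offset through
either view.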


Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>


Matt Roper (26):
  drm/xe/mtl: Disable media GT
  drm/xe: Introduce xe_tile
  drm/xe: Add backpointer from gt to tile
  drm/xe: Add for_each_tile iterator
  drm/xe: Move register MMIO into xe_tile
  drm/xe: Move VRAM from GT to tile
  drm/xe: Memory allocations are tile-based, not GT-based
  drm/xe: Move migration from GT to tile
  drm/xe: Clarify 'gt' retrieval for primary tile
  drm/xe: Drop vram_id
  drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE
  drm/xe: Allocate GT dynamically
  drm/xe: Add media GT to tile
  drm/xe: Move display IRQ postinstall out of GT function
  drm/xe: Interrupts are delivered per-tile, not per-GT
  drm/xe/irq: Handle ASLE backlight interrupts at same time as display
  drm/xe/irq: Actually call xe_irq_postinstall()
  drm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt
    mask
  drm/xe/irq: Untangle postinstall functions
  drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe
  drm/xe: Invalidate TLB on all affected GTs during GGTT updates
  drm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations
  drm/xe: Allow GT looping and lookup on standalone media
  drm/xe: Update query uapi to support standalone media
  drm/xe: Reinstate media GT support
  drm/xe: Clarify source of GT log messages

 drivers/gpu/drm/i915/display/intel_dsb.c      |   5 +-
 drivers/gpu/drm/i915/display/intel_fbc.c      |   3 +-
 drivers/gpu/drm/i915/display/intel_fbdev.c    |   7 +-
 drivers/gpu/drm/xe/Makefile                   |   1 +
 .../drm/xe/compat-i915-headers/intel_uncore.h |   2 +-
 drivers/gpu/drm/xe/display/ext/i915_irq.c     |   2 +-
 drivers/gpu/drm/xe/display/xe_fb_pin.c        |  13 +-
 drivers/gpu/drm/xe/display/xe_plane_initial.c |   8 +-
 drivers/gpu/drm/xe/regs/xe_gt_regs.h          |   8 +
 drivers/gpu/drm/xe/tests/xe_bo.c              |   8 +-
 drivers/gpu/drm/xe/tests/xe_migrate.c         |  15 +-
 drivers/gpu/drm/xe/xe_bb.c                    |   5 +-
 drivers/gpu/drm/xe/xe_bo.c                    | 104 ++---
 drivers/gpu/drm/xe/xe_bo.h                    |  20 +-
 drivers/gpu/drm/xe/xe_bo_evict.c              |  22 +-
 drivers/gpu/drm/xe/xe_bo_types.h              |   4 +-
 drivers/gpu/drm/xe/xe_device.c                |  12 +-
 drivers/gpu/drm/xe/xe_device.h                |  49 ++-
 drivers/gpu/drm/xe/xe_device_types.h          | 107 ++++-
 drivers/gpu/drm/xe/xe_engine.c                |   2 +-
 drivers/gpu/drm/xe/xe_ggtt.c                  |  45 +-
 drivers/gpu/drm/xe/xe_ggtt.h                  |   6 +-
 drivers/gpu/drm/xe/xe_ggtt_types.h            |   2 +-
 drivers/gpu/drm/xe/xe_gt.c                    | 191 ++-------
 drivers/gpu/drm/xe/xe_gt.h                    |   8 +-
 drivers/gpu/drm/xe/xe_gt_debugfs.c            |   8 +-
 drivers/gpu/drm/xe/xe_gt_mcr.c                |   2 +-
 drivers/gpu/drm/xe/xe_gt_pagefault.c          |  16 +-
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c   |   4 +-
 drivers/gpu/drm/xe/xe_gt_types.h              |  87 ++--
 drivers/gpu/drm/xe/xe_guc.c                   |  11 +-
 drivers/gpu/drm/xe/xe_guc_ads.c               |   5 +-
 drivers/gpu/drm/xe/xe_guc_ct.c                |   5 +-
 drivers/gpu/drm/xe/xe_guc_hwconfig.c          |   5 +-
 drivers/gpu/drm/xe/xe_guc_log.c               |   6 +-
 drivers/gpu/drm/xe/xe_guc_pc.c                |   5 +-
 drivers/gpu/drm/xe/xe_hw_engine.c             |   6 +-
 drivers/gpu/drm/xe/xe_irq.c                   | 393 +++++++++---------
 drivers/gpu/drm/xe/xe_irq.h                   |   3 +-
 drivers/gpu/drm/xe/xe_lrc.c                   |  13 +-
 drivers/gpu/drm/xe/xe_lrc_types.h             |   4 +-
 drivers/gpu/drm/xe/xe_migrate.c               |  76 ++--
 drivers/gpu/drm/xe/xe_migrate.h               |   9 +-
 drivers/gpu/drm/xe/xe_mmio.c                  |  92 ++--
 drivers/gpu/drm/xe/xe_mmio.h                  |  21 +-
 drivers/gpu/drm/xe/xe_mocs.c                  |  14 +-
 drivers/gpu/drm/xe/xe_pci.c                   |  92 ++--
 drivers/gpu/drm/xe/xe_pt.c                    | 150 ++++---
 drivers/gpu/drm/xe/xe_pt.h                    |  14 +-
 drivers/gpu/drm/xe/xe_query.c                 |  32 +-
 drivers/gpu/drm/xe/xe_res_cursor.h            |   2 +-
 drivers/gpu/drm/xe/xe_sa.c                    |  13 +-
 drivers/gpu/drm/xe/xe_sa.h                    |   4 +-
 drivers/gpu/drm/xe/xe_tile.c                  |  89 ++++
 drivers/gpu/drm/xe/xe_tile.h                  |  16 +
 drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c        |   4 +-
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c          |  16 +-
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.h          |   4 +-
 drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h    |   6 +-
 drivers/gpu/drm/xe/xe_uc_fw.c                 |   5 +-
 drivers/gpu/drm/xe/xe_vm.c                    | 156 ++++---
 drivers/gpu/drm/xe/xe_vm.h                    |   2 +-
 drivers/gpu/drm/xe/xe_vm_types.h              |  22 +-
 include/uapi/drm/xe_drm.h                     |   4 +-
 64 files changed, 1108 insertions(+), 957 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_tile.c
 create mode 100644 drivers/gpu/drm/xe/xe_tile.h

-- 
2.40.0



* [Intel-xe] [PATCH 01/26] drm/xe/mtl: Disable media GT
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
@ 2023-05-11  3:46 ` Matt Roper
  2023-05-11 20:50   ` Matt Atwood
  2023-05-11 23:29   ` Lucas De Marchi
  2023-05-11  3:46 ` [Intel-xe] [PATCH 02/26] drm/xe: Introduce xe_tile Matt Roper
                   ` (33 subsequent siblings)
  34 siblings, 2 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:46 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

Xe incorrectly conflates the concept of 'tile' and 'GT.'  Since MTL's
media support is not yet functioning properly, let's just disable it
completely for now while we fix the fundamental driver design.  Support
for media GTs on platforms like MTL will be re-added later.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_gt_mcr.c |  2 +-
 drivers/gpu/drm/xe/xe_mmio.c   |  2 --
 drivers/gpu/drm/xe/xe_pci.c    | 15 ++-------------
 3 files changed, 3 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gt_mcr.c b/drivers/gpu/drm/xe/xe_gt_mcr.c
index 3db550c85e32..be80fdc4b5a2 100644
--- a/drivers/gpu/drm/xe/xe_gt_mcr.c
+++ b/drivers/gpu/drm/xe/xe_gt_mcr.c
@@ -293,7 +293,7 @@ void xe_gt_mcr_init(struct xe_gt *gt)
 
 	spin_lock_init(&gt->mcr_lock);
 
-	if (gt->info.type == XE_GT_TYPE_MEDIA) {
+	if (xe_gt_is_media_type(gt)) {
 		drm_WARN_ON(&xe->drm, MEDIA_VER(xe) < 13);
 
 		gt->steering[OADDRM].ranges = xelpmp_oaddrm_steering_table;
diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
index c7fbb1cc1f64..4804616a3c44 100644
--- a/drivers/gpu/drm/xe/xe_mmio.c
+++ b/drivers/gpu/drm/xe/xe_mmio.c
@@ -301,8 +301,6 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
 	mtcfg = xe_mmio_read64(gt, XEHP_MTCFG_ADDR);
 	adj_tile_count = xe->info.tile_count =
 		REG_FIELD_GET(TILE_COUNT, mtcfg) + 1;
-	if (xe->info.media_verx100 >= 1300)
-		xe->info.tile_count *= 2;
 
 	drm_info(&xe->drm, "tile_count: %d, adj_tile_count %d\n",
 		 xe->info.tile_count, adj_tile_count);
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index a6858fc7fe8d..bf2c234c4f6e 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -19,6 +19,7 @@
 #include "xe_device.h"
 #include "xe_display.h"
 #include "xe_drv.h"
+#include "xe_gt.h"
 #include "xe_macros.h"
 #include "xe_module.h"
 #include "xe_pci_types.h"
@@ -271,20 +272,10 @@ static const struct xe_device_desc pvc_desc = {
 	.extra_gts = pvc_gts,
 };
 
-static const struct xe_gt_desc xelpmp_gts[] = {
-	{
-		.type = XE_GT_TYPE_MEDIA,
-		.vram_id = 0,
-		.mmio_adj_limit = 0x40000,
-		.mmio_adj_offset = 0x380000,
-	},
-};
-
 static const struct xe_device_desc mtl_desc = {
 	/* .graphics and .media determined via GMD_ID */
 	.require_force_probe = true,
 	PLATFORM(XE_METEORLAKE),
-	.extra_gts = xelpmp_gts,
 };
 
 #undef PLATFORM
@@ -528,8 +519,6 @@ static int xe_info_init(struct xe_device *xe,
 	 * treats it as the number of GTs rather than just the number of tiles.
 	 */
 	xe->info.tile_count = 1 + graphics_desc->max_remote_tiles;
-	if (MEDIA_VER(xe) >= 13)
-		xe->info.tile_count++;
 
 	xe->info.subplatform = subplatform_desc ?
 		subplatform_desc->subplatform : XE_SUBPLATFORM_NONE;
@@ -553,7 +542,7 @@ static int xe_info_init(struct xe_device *xe,
 		} else {
 			gt->info.type = desc->extra_gts[id - 1].type;
 			gt->info.vram_id = desc->extra_gts[id - 1].vram_id;
-			gt->info.__engine_mask = (gt->info.type == XE_GT_TYPE_MEDIA) ?
+			gt->info.__engine_mask = xe_gt_is_media_type(gt) ?
 				media_desc->hw_engine_mask :
 				graphics_desc->hw_engine_mask;
 			gt->mmio.adj_limit =
-- 
2.40.0



* [Intel-xe] [PATCH 02/26] drm/xe: Introduce xe_tile
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
  2023-05-11  3:46 ` [Intel-xe] [PATCH 01/26] drm/xe/mtl: Disable media GT Matt Roper
@ 2023-05-11  3:46 ` Matt Roper
  2023-05-11  5:46   ` Lucas De Marchi
                     ` (3 more replies)
  2023-05-11  3:46 ` [Intel-xe] [PATCH 03/26] drm/xe: Add backpointer from gt to tile Matt Roper
                   ` (32 subsequent siblings)
  34 siblings, 4 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:46 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

Create a new xe_tile structure to begin separating the concept of "tile"
from "GT."  A tile is effectively a complete GPU, and a GT is just one
part of that.  On platforms like MTL, there's only a single full GPU
(tile) which has its IP blocks provided by two GTs.  In contrast, a
"multi-tile" platform like PVC is basically multiple complete GPUs
packed behind a single PCI device.

For now, just create xe_tile as a simple wrapper around xe_gt.  The
items in xe_gt that are truly tied to the tile rather than the GT will
be moved in future patches.  Support for multiple GTs per tile (i.e.,
the MTL standalone media case) will also be re-introduced in a future
patch.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_device.h       | 11 +++++---
 drivers/gpu/drm/xe/xe_device_types.h | 40 +++++++++++++++++++++++++---
 drivers/gpu/drm/xe/xe_gt_types.h     | 15 +++++++----
 drivers/gpu/drm/xe/xe_mmio.c         | 13 ++++-----
 drivers/gpu/drm/xe/xe_pci.c          |  5 +++-
 drivers/gpu/drm/xe/xe_vm.c           |  2 +-
 drivers/gpu/drm/xe/xe_vm_types.h     |  8 +++---
 7 files changed, 71 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index cbae480a2092..f7acaf51a1fc 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -48,12 +48,17 @@ static inline struct xe_file *to_xe_file(const struct drm_file *file)
 	return file->driver_priv;
 }
 
+static inline struct xe_tile *xe_device_get_root_tile(struct xe_device *xe)
+{
+	return &xe->tiles[0];
+}
+
 static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
 {
 	struct xe_gt *gt;
 
-	XE_BUG_ON(gt_id > XE_MAX_GT);
-	gt = xe->gt + gt_id;
+	XE_BUG_ON(gt_id > XE_MAX_TILES_PER_DEVICE);
+	gt = &xe->tiles[gt_id].primary_gt;
 	XE_BUG_ON(gt->info.id != gt_id);
 	XE_BUG_ON(gt->info.type == XE_GT_TYPE_UNINITIALIZED);
 
@@ -65,7 +70,7 @@ static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
  */
 static inline struct xe_gt *to_gt(struct xe_device *xe)
 {
-	return xe->gt;
+	return &xe_device_get_root_tile(xe)->primary_gt;
 }
 
 static inline bool xe_device_guc_submission_enabled(struct xe_device *xe)
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 6490a04614ce..5dcf1695925f 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -34,7 +34,7 @@
 
 #define XE_GT0		0
 #define XE_GT1		1
-#define XE_MAX_GT	(XE_GT1 + 1)
+#define XE_MAX_TILES_PER_DEVICE	(XE_GT1 + 1)
 
 #define XE_MAX_ASID	(BIT(20))
 
@@ -48,6 +48,40 @@
 	 (_xe)->info.step.graphics >= (min_step) &&			\
 	 (_xe)->info.step.graphics < (max_step))
 
+#define tile_to_xe(tile__)								\
+	_Generic(tile__,								\
+		 const struct xe_tile *: (const struct xe_device *)((tile__)->xe),	\
+		 struct xe_tile *: (tile__)->xe)
+
+/**
+ * struct xe_tile - hardware tile structure
+ *
+ * From a driver perspective, a "tile" is effectively a complete GPU, containing
+ * an SGunit, 1-2 GTs, and (for discrete platforms) VRAM.
+ *
+ * Multi-tile platforms effectively bundle multiple GPUs behind a single PCI
+ * device and designate one "root" tile as being responsible for external PCI
+ * communication.  PCI BAR0 exposes the GGTT and MMIO register space for each
+ * tile in a stacked layout, and PCI BAR2 exposes the local memory associated
+ * with each tile similarly.  Device-wide interrupts can be enabled/disabled
+ * at the root tile, and the MSTR_TILE_INTR register will report which tiles
+ * have interrupts that need servicing.
+ */
+struct xe_tile {
+	/** @xe: Backpointer to tile's PCI device */
+	struct xe_device *xe;
+
+	/** @id: ID of the tile */
+	u8 id;
+
+	/**
+	 * @primary_gt: Primary GT
+	 */
+	struct xe_gt primary_gt;
+
+	/* TODO: Add media GT here */
+};
+
 /**
  * struct xe_device - Top level struct of XE device
  */
@@ -248,8 +282,8 @@ struct xe_device {
 	/** @ordered_wq: used to serialize compute mode resume */
 	struct workqueue_struct *ordered_wq;
 
-	/** @gt: graphics tile */
-	struct xe_gt gt[XE_MAX_GT];
+	/** @tiles: device tiles */
+	struct xe_tile tiles[XE_MAX_TILES_PER_DEVICE];
 
 	/**
 	 * @mem_access: keep track of memory access in the device, possibly
diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
index 7c47d67aa8be..e0ed4508269b 100644
--- a/drivers/gpu/drm/xe/xe_gt_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_types.h
@@ -77,12 +77,17 @@ enum xe_steering_type {
 };
 
 /**
- * struct xe_gt - Top level struct of a graphics tile
+ * struct xe_gt - A "Graphics Technology" unit of the GPU
  *
- * A graphics tile may be a physical split (duplicate pieces of silicon,
- * different GGTT + VRAM) or a virtual split (shared GGTT + VRAM). Either way
- * this structure encapsulates of everything a GT is (MMIO, VRAM, memory
- * management, microcontrols, and a hardware set of engines).
+ * A GT ("Graphics Technology") is the subset of a GPU primarily responsible
+ * for implementing the graphics and/or media IP.  It encapsulates the hardware
+ * engines, programmable execution units, and GuC.   Each GT has its own
+ * handling of power management (RC6+forcewake) and multicast register
+ * steering.
+ *
+ * A GPU/tile may have a single GT that supplies all graphics and media
+ * functionality, or the graphics and media may be split into separate GTs
+ * within a tile.
  */
 struct xe_gt {
 	/** @xe: backpointer to XE device */
diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
index 4804616a3c44..254b4a63d901 100644
--- a/drivers/gpu/drm/xe/xe_mmio.c
+++ b/drivers/gpu/drm/xe/xe_mmio.c
@@ -399,6 +399,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
 		  struct drm_file *file)
 {
 	struct xe_device *xe = to_xe_device(dev);
+	struct xe_gt *gt = xe_device_get_gt(xe, 0);
 	struct drm_xe_mmio *args = data;
 	unsigned int bits_flag, bytes;
 	struct xe_reg reg;
@@ -440,7 +441,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
 	 */
 	reg = XE_REG(args->addr);
 
-	xe_force_wake_get(gt_to_fw(&xe->gt[0]), XE_FORCEWAKE_ALL);
+	xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 
 	if (args->flags & DRM_XE_MMIO_WRITE) {
 		switch (bits_flag) {
@@ -449,10 +450,10 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
 				ret = -EINVAL;
 				goto exit;
 			}
-			xe_mmio_write32(to_gt(xe), reg, args->value);
+			xe_mmio_write32(gt, reg, args->value);
 			break;
 		case DRM_XE_MMIO_64BIT:
-			xe_mmio_write64(to_gt(xe), reg, args->value);
+			xe_mmio_write64(gt, reg, args->value);
 			break;
 		default:
 			drm_dbg(&xe->drm, "Invalid MMIO bit size");
@@ -467,10 +468,10 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
 	if (args->flags & DRM_XE_MMIO_READ) {
 		switch (bits_flag) {
 		case DRM_XE_MMIO_32BIT:
-			args->value = xe_mmio_read32(to_gt(xe), reg);
+			args->value = xe_mmio_read32(gt, reg);
 			break;
 		case DRM_XE_MMIO_64BIT:
-			args->value = xe_mmio_read64(to_gt(xe), reg);
+			args->value = xe_mmio_read64(gt, reg);
 			break;
 		default:
 			drm_dbg(&xe->drm, "Invalid MMIO bit size");
@@ -482,7 +483,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
 	}
 
 exit:
-	xe_force_wake_put(gt_to_fw(&xe->gt[0]), XE_FORCEWAKE_ALL);
+	xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index bf2c234c4f6e..e79b16d8bf7f 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -525,7 +525,10 @@ static int xe_info_init(struct xe_device *xe,
 	xe->info.step = xe_step_get(xe);
 
 	for (id = 0; id < xe->info.tile_count; ++id) {
-		gt = xe->gt + id;
+		xe->tiles[id].xe = xe;
+		xe->tiles[id].id = id;
+
+		gt = &xe->tiles[id].primary_gt;
 		gt->info.id = id;
 		gt->xe = xe;
 
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 0a4becdf4675..fe6abb6ed6ca 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -3347,7 +3347,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
 	struct xe_device *xe = vma->vm->xe;
 	struct xe_gt *gt;
 	u32 gt_needs_invalidate = 0;
-	int seqno[XE_MAX_GT];
+	int seqno[XE_MAX_TILES_PER_DEVICE];
 	u8 id;
 	int ret;
 
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index fada7896867f..203ba9d946b8 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -159,7 +159,7 @@ struct xe_vm {
 	struct kref refcount;
 
 	/* engine used for (un)binding vma's */
-	struct xe_engine *eng[XE_MAX_GT];
+	struct xe_engine *eng[XE_MAX_TILES_PER_DEVICE];
 
 	/** Protects @rebind_list and the page-table structures */
 	struct dma_resv resv;
@@ -167,9 +167,9 @@ struct xe_vm {
 	u64 size;
 	struct rb_root vmas;
 
-	struct xe_pt *pt_root[XE_MAX_GT];
-	struct xe_bo *scratch_bo[XE_MAX_GT];
-	struct xe_pt *scratch_pt[XE_MAX_GT][XE_VM_MAX_LEVEL];
+	struct xe_pt *pt_root[XE_MAX_TILES_PER_DEVICE];
+	struct xe_bo *scratch_bo[XE_MAX_TILES_PER_DEVICE];
+	struct xe_pt *scratch_pt[XE_MAX_TILES_PER_DEVICE][XE_VM_MAX_LEVEL];
 
 	/** @flags: flags for this VM, statically setup a creation time */
 #define XE_VM_FLAGS_64K			BIT(0)
-- 
2.40.0



* [Intel-xe] [PATCH 03/26] drm/xe: Add backpointer from gt to tile
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
  2023-05-11  3:46 ` [Intel-xe] [PATCH 01/26] drm/xe/mtl: Disable media GT Matt Roper
  2023-05-11  3:46 ` [Intel-xe] [PATCH 02/26] drm/xe: Introduce xe_tile Matt Roper
@ 2023-05-11  3:46 ` Matt Roper
  2023-05-11 21:10   ` Matt Atwood
  2023-05-12  0:07   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 04/26] drm/xe: Add for_each_tile iterator Matt Roper
                   ` (31 subsequent siblings)
  34 siblings, 2 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:46 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

Rather than a backpointer to the xe_device, a GT should have a
backpointer to its tile (which can then be used to lookup the device if
necessary).

The gt_to_xe() helper macro (which moves from xe_gt.h to xe_gt_types.h)
can and should still be used to jump directly from an xe_gt to
xe_device.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_bb.c                  |  2 +-
 drivers/gpu/drm/xe/xe_gt.h                  |  5 -----
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |  4 ++--
 drivers/gpu/drm/xe/xe_gt_types.h            | 14 ++++++++++++--
 drivers/gpu/drm/xe/xe_mocs.c                | 14 +++++++-------
 drivers/gpu/drm/xe/xe_pci.c                 | 11 +++++++----
 drivers/gpu/drm/xe/xe_pt.c                  |  2 +-
 7 files changed, 30 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bb.c b/drivers/gpu/drm/xe/xe_bb.c
index 3deb2d55f421..bf7c94b769d7 100644
--- a/drivers/gpu/drm/xe/xe_bb.c
+++ b/drivers/gpu/drm/xe/xe_bb.c
@@ -16,7 +16,7 @@
 
 static int bb_prefetch(struct xe_gt *gt)
 {
-	struct xe_device *xe = gt->xe;
+	struct xe_device *xe = gt_to_xe(gt);
 
 	if (GRAPHICS_VERx100(xe) >= 1250 && !xe_gt_is_media_type(gt))
 		/*
diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
index 086369f7ee6d..f4e98f499b36 100644
--- a/drivers/gpu/drm/xe/xe_gt.h
+++ b/drivers/gpu/drm/xe/xe_gt.h
@@ -49,11 +49,6 @@ static inline bool xe_gt_is_media_type(struct xe_gt *gt)
 	return gt->info.type == XE_GT_TYPE_MEDIA;
 }
 
-#define gt_to_xe(gt__)								\
-	_Generic(gt__,								\
-		 const struct xe_gt *: (const struct xe_device *)((gt__)->xe),	\
-		 struct xe_gt *: (gt__)->xe)
-
 static inline bool xe_gt_is_usm_hwe(struct xe_gt *gt, struct xe_hw_engine *hwe)
 {
 	struct xe_device *xe = gt_to_xe(gt);
diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
index c815a42e2cdb..c9e8825c02aa 100644
--- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
+++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
@@ -322,8 +322,8 @@ int xe_guc_tlb_invalidation_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
 		TLB_INVALIDATION_SEQNO_MAX;
 	if (!expected_seqno)
 		expected_seqno = 1;
-	if (drm_WARN_ON(&gt->xe->drm, expected_seqno != msg[0])) {
-		drm_err(&gt->xe->drm, "TLB expected_seqno(%d) != msg(%u)\n",
+	if (drm_WARN_ON(&gt_to_xe(gt)->drm, expected_seqno != msg[0])) {
+		drm_err(&gt_to_xe(gt)->drm, "TLB expected_seqno(%d) != msg(%u)\n",
 			expected_seqno, msg[0]);
 	}
 
diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
index e0ed4508269b..c4376d50786b 100644
--- a/drivers/gpu/drm/xe/xe_gt_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_types.h
@@ -76,6 +76,16 @@ enum xe_steering_type {
 	NUM_STEERING_TYPES
 };
 
+#define gt_to_tile(gt__)							\
+	_Generic(gt__,								\
+		 const struct xe_gt *: (const struct xe_tile *)((gt__)->tile),	\
+		 struct xe_gt *: (gt__)->tile)
+
+#define gt_to_xe(gt__)										\
+	_Generic(gt__,										\
+		 const struct xe_gt *: (const struct xe_device *)(gt_to_tile(gt__)->xe),	\
+		 struct xe_gt *: gt_to_tile(gt__)->xe)
+
 /**
  * struct xe_gt - A "Graphics Technology" unit of the GPU
  *
@@ -90,8 +100,8 @@ enum xe_steering_type {
  * within a tile.
  */
 struct xe_gt {
-	/** @xe: backpointer to XE device */
-	struct xe_device *xe;
+	/** @tile: Backpointer to GT's tile */
+	struct xe_tile *tile;
 
 	/** @info: GT info */
 	struct {
diff --git a/drivers/gpu/drm/xe/xe_mocs.c b/drivers/gpu/drm/xe/xe_mocs.c
index 817afd301d52..d57fbf16a3ef 100644
--- a/drivers/gpu/drm/xe/xe_mocs.c
+++ b/drivers/gpu/drm/xe/xe_mocs.c
@@ -471,7 +471,7 @@ static void __init_mocs_table(struct xe_gt *gt,
 	unsigned int i;
 	u32 mocs;
 
-	mocs_dbg(&gt->xe->drm, "entries:%d\n", info->n_entries);
+	mocs_dbg(&gt_to_xe(gt)->drm, "entries:%d\n", info->n_entries);
 	drm_WARN_ONCE(&xe->drm, !info->unused_entries_index,
 		      "Unused entries index should have been defined\n");
 	for (i = 0;
@@ -479,7 +479,7 @@ static void __init_mocs_table(struct xe_gt *gt,
 	     i++) {
 		struct xe_reg reg = XE_REG(addr + i * 4);
 
-		mocs_dbg(&gt->xe->drm, "%d 0x%x 0x%x\n", i, reg.addr, mocs);
+		mocs_dbg(&gt_to_xe(gt)->drm, "%d 0x%x 0x%x\n", i, reg.addr, mocs);
 		xe_mmio_write32(gt, reg, mocs);
 	}
 }
@@ -508,13 +508,13 @@ static void init_l3cc_table(struct xe_gt *gt,
 	unsigned int i;
 	u32 l3cc;
 
-	mocs_dbg(&gt->xe->drm, "entries:%d\n", info->n_entries);
+	mocs_dbg(&gt_to_xe(gt)->drm, "entries:%d\n", info->n_entries);
 	for (i = 0;
 	     i < (info->n_entries + 1) / 2 ?
 	     (l3cc = l3cc_combine(get_entry_l3cc(info, 2 * i),
 				  get_entry_l3cc(info, 2 * i + 1))), 1 : 0;
 	     i++) {
-		mocs_dbg(&gt->xe->drm, "%d 0x%x 0x%x\n", i, LNCFCMOCS(i).addr,
+		mocs_dbg(&gt_to_xe(gt)->drm, "%d 0x%x 0x%x\n", i, LNCFCMOCS(i).addr,
 			 l3cc);
 		xe_mmio_write32(gt, LNCFCMOCS(i), l3cc);
 	}
@@ -524,7 +524,7 @@ void xe_mocs_init_early(struct xe_gt *gt)
 {
 	struct xe_mocs_info table;
 
-	get_mocs_settings(gt->xe, &table);
+	get_mocs_settings(gt_to_xe(gt), &table);
 	gt->mocs.uc_index = table.uc_index;
 	gt->mocs.wb_index = table.wb_index;
 }
@@ -537,8 +537,8 @@ void xe_mocs_init(struct xe_gt *gt)
 	/*
 	 * LLC and eDRAM control values are not applicable to dgfx
 	 */
-	flags = get_mocs_settings(gt->xe, &table);
-	mocs_dbg(&gt->xe->drm, "flag:0x%x\n", flags);
+	flags = get_mocs_settings(gt_to_xe(gt), &table);
+	mocs_dbg(&gt_to_xe(gt)->drm, "flag:0x%x\n", flags);
 
 	if (flags & HAS_GLOBAL_MOCS)
 		__init_mocs_table(gt, &table, GLOBAL_MOCS(0).addr);
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index e79b16d8bf7f..87c328106aca 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -471,6 +471,7 @@ static int xe_info_init(struct xe_device *xe,
 {
 	const struct xe_graphics_desc *graphics_desc = NULL;
 	const struct xe_media_desc *media_desc = NULL;
+	struct xe_tile *tile;
 	struct xe_gt *gt;
 	u8 id;
 
@@ -525,13 +526,15 @@ static int xe_info_init(struct xe_device *xe,
 	xe->info.step = xe_step_get(xe);
 
 	for (id = 0; id < xe->info.tile_count; ++id) {
-		xe->tiles[id].xe = xe;
-		xe->tiles[id].id = id;
+		tile = &xe->tiles[id];
+		tile->xe = xe;
+		tile->id = id;
 
-		gt = &xe->tiles[id].primary_gt;
+		gt = &tile->primary_gt;
 		gt->info.id = id;
-		gt->xe = xe;
+		gt->tile = tile;
 
+		gt->info.id = id;
 		if (id == 0) {
 			gt->info.type = XE_GT_TYPE_MAIN;
 			gt->info.vram_id = id;
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index f15282996c3b..61126cefe0b5 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -695,7 +695,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
 		 * TODO: Suballocate the pt bo to avoid wasting a lot of
 		 * memory.
 		 */
-		if (GRAPHICS_VERx100(xe_walk->gt->xe) >= 1250 && level == 1 &&
+		if (GRAPHICS_VERx100(gt_to_xe(xe_walk->gt)) >= 1250 && level == 1 &&
 		    covers && xe_pt_scan_64K(addr, next, xe_walk)) {
 			walk->shifts = xe_compact_pt_shifts;
 			flags |= XE_PDE_64K;
-- 
2.40.0



* [Intel-xe] [PATCH 04/26] drm/xe: Add for_each_tile iterator
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (2 preceding siblings ...)
  2023-05-11  3:46 ` [Intel-xe] [PATCH 03/26] drm/xe: Add backpointer from gt to tile Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-11 23:23   ` Lucas De Marchi
  2023-05-12  5:45   ` Iddamsetty, Aravind
  2023-05-11  3:47 ` [Intel-xe] [PATCH 05/26] drm/xe: Move register MMIO into xe_tile Matt Roper
                   ` (30 subsequent siblings)
  34 siblings, 2 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

As we start splitting tile handling out from GT handling, we'll need to
be able to iterate over tiles separately from GTs.  This iterator will
be used in upcoming patches.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
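Note for readers: the iterator described in the commit message above
can be tried outside the kernel with a small stand-alone sketch.  All
of the mock_* names below are hypothetical stand-ins rather than
driver code; mock_for_each_if mirrors the shape of DRM's for_each_if
helper, and the structs carry only the fields the loop touches.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_TILES 2	/* assumption: illustrative upper bound */

struct mock_device;

/* Hypothetical stand-in for struct xe_tile. */
struct mock_tile {
	struct mock_device *xe;
	unsigned char id;
};

/* Hypothetical stand-in for struct xe_device. */
struct mock_device {
	struct {
		unsigned char tile_count;
	} info;
	struct mock_tile tiles[MOCK_MAX_TILES];
};

/* Same shape as DRM's for_each_if: guards the loop body without
 * creating a dangling-else hazard for the caller. */
#define mock_for_each_if(cond) if (!(cond)) {} else

/* Walk every tile of the device, assigning the cursor as we go. */
#define mock_for_each_tile(tile__, xe__, id__) \
	for ((id__) = 0; (id__) < (xe__)->info.tile_count; (id__)++) \
		mock_for_each_if ((tile__) = &(xe__)->tiles[(id__)])

/* Initialize the back-pointer and id of each tile; returns the number
 * of tiles visited. */
static int mock_init_tiles(struct mock_device *xe)
{
	struct mock_tile *tile;
	unsigned char id;
	int visited = 0;

	assert(xe->info.tile_count <= MOCK_MAX_TILES);

	mock_for_each_tile(tile, xe, id) {
		tile->xe = xe;
		tile->id = id;
		visited++;
	}

	return visited;
}
```

Calling mock_init_tiles() on a device whose info.tile_count is 2 visits
both entries of tiles[] in index order, which is the behavior the
xe_info_init() conversion in this patch relies on.
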
 drivers/gpu/drm/xe/xe_device.h | 4 ++++
 drivers/gpu/drm/xe/xe_pci.c    | 3 +--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index f7acaf51a1fc..745dbb16d417 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -83,6 +83,10 @@ static inline void xe_device_guc_submission_disable(struct xe_device *xe)
 	xe->info.enable_guc = false;
 }
 
+#define for_each_tile(tile__, xe__, id__) \
+	for ((id__) = 0; (id__) < (xe__)->info.tile_count; (id__++)) \
+		for_each_if ((tile__) = &(xe__)->tiles[(id__)])
+
 #define for_each_gt(gt__, xe__, id__) \
 	for ((id__) = 0; (id__) < (xe__)->info.tile_count; (id__++)) \
 		for_each_if ((gt__) = xe_device_get_gt((xe__), (id__)))
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index 87c328106aca..bef65d3a440e 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -525,8 +525,7 @@ static int xe_info_init(struct xe_device *xe,
 		subplatform_desc->subplatform : XE_SUBPLATFORM_NONE;
 	xe->info.step = xe_step_get(xe);
 
-	for (id = 0; id < xe->info.tile_count; ++id) {
-		tile = &xe->tiles[id];
+	for_each_tile(tile, xe, id) {
 		tile->xe = xe;
 		tile->id = id;
 
-- 
2.40.0



* [Intel-xe] [PATCH 05/26] drm/xe: Move register MMIO into xe_tile
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (3 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 04/26] drm/xe: Add for_each_tile iterator Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-11 12:20   ` Jani Nikula
  2023-05-13  5:53   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 06/26] drm/xe: Move VRAM from GT to tile Matt Roper
                   ` (29 subsequent siblings)
  34 siblings, 2 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

Each tile has its own register region in the BAR, containing instances
of all registers for the platform.  In contrast, the multiple GTs within
a tile share the same MMIO space; there's just a small subset of
registers (the GSI registers) which have multiple copies at different
offsets (0x0 for primary GT, 0x380000 for media GT).  Move the register
MMIO region size/pointers to the tile structure, leaving just the GSI
offset information in the GT structure.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
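Note for readers: the address computation described in the commit
message above can be sketched in stand-alone C.  Everything named
mock_* is a hypothetical stand-in, not driver code.  The sketch assumes
the BAR is divided evenly between tiles and that a GT applies its GSI
offset only to register addresses below a per-GT limit, as the
xe_mmio_read*/write* helpers in this patch do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the tile's MMIO bookkeeping. */
struct mock_tile {
	struct {
		size_t size;	/* size of this tile's slice of the BAR */
		uint8_t *regs;	/* start of this tile's slice */
	} mmio;
};

/* Hypothetical stand-in for the GT: only GSI adjustment info remains. */
struct mock_gt {
	struct mock_tile *tile;
	struct {
		uint32_t adj_limit;	/* addresses below this get adjusted */
		uint32_t adj_offset;	/* e.g. 0x380000 for a media GT */
	} mmio;
};

/* Give each tile an equal, consecutive slice of the mapped BAR. */
static void mock_probe_tiles(uint8_t *bar, size_t bar_size,
			     struct mock_tile *tiles, unsigned int count)
{
	size_t size = bar_size / count;
	unsigned int id;

	for (id = 0; id < count; id++) {
		tiles[id].mmio.size = size;
		tiles[id].mmio.regs = bar + id * size;
	}
}

/* Apply the per-GT GSI adjustment to a register address. */
static uint32_t mock_reg_offset(const struct mock_gt *gt, uint32_t addr)
{
	if (addr < gt->mmio.adj_limit)
		addr += gt->mmio.adj_offset;

	return addr;
}

/* Resolve a register to a pointer inside the owning tile's slice. */
static uint8_t *mock_reg_ptr(const struct mock_gt *gt, uint32_t addr)
{
	uint32_t off = mock_reg_offset(gt, addr);

	assert(off < gt->tile->mmio.size);
	return gt->tile->mmio.regs + off;
}
```

With adj_offset set to 0x380000 (the media GT offset quoted above) and
an illustrative adj_limit, a low register address is redirected into
the media GT's GSI copy while higher addresses pass through unchanged;
a primary GT with adj_limit 0 never adjusts anything.
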
 drivers/gpu/drm/xe/display/ext/i915_irq.c |  2 +-
 drivers/gpu/drm/xe/xe_device_types.h      | 16 ++++++++++++++
 drivers/gpu/drm/xe/xe_ggtt.c              |  3 ++-
 drivers/gpu/drm/xe/xe_gt_types.h          |  9 +++-----
 drivers/gpu/drm/xe/xe_mmio.c              | 26 ++++++++++++-----------
 drivers/gpu/drm/xe/xe_mmio.h              | 21 +++++++++++++-----
 6 files changed, 52 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/xe/display/ext/i915_irq.c b/drivers/gpu/drm/xe/display/ext/i915_irq.c
index afde97b6faa6..a9cbd7b59360 100644
--- a/drivers/gpu/drm/xe/display/ext/i915_irq.c
+++ b/drivers/gpu/drm/xe/display/ext/i915_irq.c
@@ -920,7 +920,7 @@ gen8_de_irq_handler(struct drm_i915_private *dev_priv, u32 master_ctl)
 
 void gen11_display_irq_handler(struct drm_i915_private *i915)
 {
-	void __iomem * const regs = to_gt(i915)->mmio.regs;
+	void __iomem * const regs = xe_device_get_root_tile(i915)->mmio.regs;
 	const u32 disp_ctl = raw_reg_read(regs, GEN11_DISPLAY_INT_CTL);
 
 	/*
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 5dcf1695925f..2481b2045284 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -80,6 +80,22 @@ struct xe_tile {
 	struct xe_gt primary_gt;
 
 	/* TODO: Add media GT here */
+
+	/**
+	 * @mmio: MMIO info for a tile.
+	 *
+	 * Each tile has its own 16MB space in BAR0, laid out as:
+	 * * 0-4MB: registers
+	 * * 4MB-8MB: reserved
+	 * * 8MB-16MB: global GTT
+	 */
+	struct {
+		/** @size: size of tile's MMIO space */
+		size_t size;
+
+		/** @regs: pointer to tile's MMIO space (starting with registers) */
+		void *regs;
+	} mmio;
 };
 
 /**
diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index 546240261e0a..200976da3dc1 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -93,6 +93,7 @@ static void ggtt_fini_noalloc(struct drm_device *drm, void *arg)
 int xe_ggtt_init_noalloc(struct xe_gt *gt, struct xe_ggtt *ggtt)
 {
 	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_tile *tile = gt_to_tile(gt);
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
 	unsigned int gsm_size;
 
@@ -106,7 +107,7 @@ int xe_ggtt_init_noalloc(struct xe_gt *gt, struct xe_ggtt *ggtt)
 		return -ENOMEM;
 	}
 
-	ggtt->gsm = gt->mmio.regs + SZ_8M;
+	ggtt->gsm = tile->mmio.regs + SZ_8M;
 	ggtt->size = (gsm_size / 8) * (u64) XE_PAGE_SIZE;
 
 	if (IS_DGFX(xe) && xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
index c4376d50786b..03dd625b2781 100644
--- a/drivers/gpu/drm/xe/xe_gt_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_types.h
@@ -124,14 +124,11 @@ struct xe_gt {
 	} info;
 
 	/**
-	 * @mmio: mmio info for GT, can be subset of the global device mmio
-	 * space
+	 * @mmio: mmio info for GT.  All GTs within a tile share the same
+	 * register space, but have their own copy of GSI registers at a
+	 * specific offset, as well as their own forcewake handling.
 	 */
 	struct {
-		/** @size: size of MMIO space on GT */
-		size_t size;
-		/** @regs: pointer to MMIO space on GT */
-		void *regs;
 		/** @fw: force wake for GT */
 		struct xe_force_wake fw;
 		/**
diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
index 254b4a63d901..54fa1212fcd9 100644
--- a/drivers/gpu/drm/xe/xe_mmio.c
+++ b/drivers/gpu/drm/xe/xe_mmio.c
@@ -307,6 +307,7 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
 
 	if (xe->info.tile_count > 1) {
 		const int mmio_bar = 0;
+		struct xe_tile *tile;
 		size_t size;
 		void *regs;
 
@@ -320,11 +321,11 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
 		size = xe->mmio.size / adj_tile_count;
 		regs = xe->mmio.regs;
 
-		for_each_gt(gt, xe, id) {
-			if (id && !xe_gt_is_media_type(gt))
-				regs += size;
-			gt->mmio.size = size;
-			gt->mmio.regs = regs;
+		for_each_tile(tile, xe, id) {
+			tile->mmio.size = size;
+			tile->mmio.regs = regs;
+
+			regs += size;
 		}
 	}
 }
@@ -340,15 +341,16 @@ static void mmio_fini(struct drm_device *drm, void *arg)
 
 int xe_mmio_init(struct xe_device *xe)
 {
+	struct xe_tile *root_tile = xe_device_get_root_tile(xe);
 	struct xe_gt *gt = xe_device_get_gt(xe, 0);
 	const int mmio_bar = 0;
 	int err;
 
 	/*
-	 * Map the entire BAR, which includes registers (0-4MB), reserved space
-	 * (4MB-8MB), and GGTT (8MB-16MB). Other parts of the driver (GTs,
-	 * GGTTs) will derive the pointers they need from the mapping in the
-	 * device structure.
+	 * Map the first 16MB of the BAR, which includes the registers (0-4MB),
+	 * reserved space (4MB-8MB), and GGTT (8MB-16MB) for a single tile.
+	 * This will get remapped later if we determine that we're running
+	 * on a multi-tile system.
 	 */
 	xe->mmio.size = SZ_16M;
 	xe->mmio.regs = pci_iomap(to_pci_dev(xe->drm.dev), mmio_bar,
@@ -362,9 +364,9 @@ int xe_mmio_init(struct xe_device *xe)
 	if (err)
 		return err;
 
-	/* 1 GT for now, 1 to 1 mapping, may change on multi-GT devices */
-	gt->mmio.size = xe->mmio.size;
-	gt->mmio.regs = xe->mmio.regs;
+	/* Set up first tile; other tiles (if present) will be set up later. */
+	root_tile->mmio.size = xe->mmio.size;
+	root_tile->mmio.regs = xe->mmio.regs;
 
 	/*
 	 * The boot firmware initializes local memory and assesses its health.
diff --git a/drivers/gpu/drm/xe/xe_mmio.h b/drivers/gpu/drm/xe/xe_mmio.h
index 1407f1189b0d..acf0b18f3111 100644
--- a/drivers/gpu/drm/xe/xe_mmio.h
+++ b/drivers/gpu/drm/xe/xe_mmio.h
@@ -10,6 +10,7 @@
 #include <linux/io-64-nonatomic-lo-hi.h>
 
 #include "regs/xe_reg_defs.h"
+#include "xe_device_types.h"
 #include "xe_gt_types.h"
 
 struct drm_device;
@@ -20,27 +21,33 @@ int xe_mmio_init(struct xe_device *xe);
 
 static inline u8 xe_mmio_read8(struct xe_gt *gt, struct xe_reg reg)
 {
+	struct xe_tile *tile = gt_to_tile(gt);
+
 	if (reg.addr < gt->mmio.adj_limit)
 		reg.addr += gt->mmio.adj_offset;
 
-	return readb(gt->mmio.regs + reg.addr);
+	return readb(tile->mmio.regs + reg.addr);
 }
 
 static inline void xe_mmio_write32(struct xe_gt *gt,
 				   struct xe_reg reg, u32 val)
 {
+	struct xe_tile *tile = gt_to_tile(gt);
+
 	if (reg.addr < gt->mmio.adj_limit)
 		reg.addr += gt->mmio.adj_offset;
 
-	writel(val, gt->mmio.regs + reg.addr);
+	writel(val, tile->mmio.regs + reg.addr);
 }
 
 static inline u32 xe_mmio_read32(struct xe_gt *gt, struct xe_reg reg)
 {
+	struct xe_tile *tile = gt_to_tile(gt);
+
 	if (reg.addr < gt->mmio.adj_limit)
 		reg.addr += gt->mmio.adj_offset;
 
-	return readl(gt->mmio.regs + reg.addr);
+	return readl(tile->mmio.regs + reg.addr);
 }
 
 static inline u32 xe_mmio_rmw32(struct xe_gt *gt, struct xe_reg reg, u32 clr,
@@ -58,18 +65,22 @@ static inline u32 xe_mmio_rmw32(struct xe_gt *gt, struct xe_reg reg, u32 clr,
 static inline void xe_mmio_write64(struct xe_gt *gt,
 				   struct xe_reg reg, u64 val)
 {
+	struct xe_tile *tile = gt_to_tile(gt);
+
 	if (reg.addr < gt->mmio.adj_limit)
 		reg.addr += gt->mmio.adj_offset;
 
-	writeq(val, gt->mmio.regs + reg.addr);
+	writeq(val, tile->mmio.regs + reg.addr);
 }
 
 static inline u64 xe_mmio_read64(struct xe_gt *gt, struct xe_reg reg)
 {
+	struct xe_tile *tile = gt_to_tile(gt);
+
 	if (reg.addr < gt->mmio.adj_limit)
 		reg.addr += gt->mmio.adj_offset;
 
-	return readq(gt->mmio.regs + reg.addr);
+	return readq(tile->mmio.regs + reg.addr);
 }
 
 static inline int xe_mmio_write32_and_verify(struct xe_gt *gt,
-- 
2.40.0



* [Intel-xe] [PATCH 06/26] drm/xe: Move VRAM from GT to tile
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (4 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 05/26] drm/xe: Move register MMIO into xe_tile Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-15 22:40   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 07/26] drm/xe: Memory allocations are tile-based, not GT-based Matt Roper
                   ` (28 subsequent siblings)
  34 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

On platforms with VRAM, the VRAM is associated with the tile, not the
GT.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
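Note for readers: the placement-to-tile lookup that this patch
introduces can be sketched in stand-alone C.  The mock_* names and the
enum values are hypothetical and do not match the driver's XE_PL_*
constants; the sketch only assumes, as mem_type_to_tile() does, that
VRAM placements are numbered consecutively per tile and that stolen
memory belongs to the root tile.  It leaves out the separate
stolen-memory path that vram_region_io_offset() takes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical placement ids; only their relative order matters here. */
enum mock_placement {
	MOCK_PL_SYSTEM,
	MOCK_PL_TT,
	MOCK_PL_VRAM0,
	MOCK_PL_VRAM1,
	MOCK_PL_STOLEN,
};

/* Hypothetical stand-in for struct xe_tile: VRAM info lives here now. */
struct mock_tile {
	unsigned char id;
	struct {
		struct {
			uint64_t io_start;	/* CPU-visible start of VRAM */
			uint64_t size;		/* total VRAM on this tile */
		} vram;
	} mem;
};

/* Hypothetical stand-in for struct xe_device. */
struct mock_device {
	struct mock_tile tiles[2];
	struct {
		struct {
			uint64_t io_start;	/* start of the whole VRAM BAR */
		} vram;
	} mem;
};

/* Map a VRAM or stolen placement to the tile that owns the memory. */
static struct mock_tile *mock_mem_type_to_tile(struct mock_device *xe,
					       enum mock_placement mem_type)
{
	assert(mem_type == MOCK_PL_STOLEN ||
	       mem_type == MOCK_PL_VRAM0 || mem_type == MOCK_PL_VRAM1);

	return &xe->tiles[mem_type == MOCK_PL_STOLEN ?
			  0 : mem_type - MOCK_PL_VRAM0];
}

/* Offset of a tile's VRAM relative to the start of the device's VRAM. */
static uint64_t mock_vram_io_offset(struct mock_device *xe,
				    enum mock_placement mem_type)
{
	struct mock_tile *tile = mock_mem_type_to_tile(xe, mem_type);

	return tile->mem.vram.io_start - xe->mem.vram.io_start;
}
```

Because the lookup returns a tile rather than a GT, callers that still
need a GT (for example to pick a migration engine) have to go through
the tile's primary GT, which is how xe_bo_move() is adapted in this
patch.
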
 drivers/gpu/drm/xe/Makefile                   |  1 +
 drivers/gpu/drm/xe/display/xe_fb_pin.c        |  6 +-
 drivers/gpu/drm/xe/display/xe_plane_initial.c |  8 +-
 drivers/gpu/drm/xe/xe_bo.c                    | 50 +++++-----
 drivers/gpu/drm/xe/xe_bo.h                    |  4 +-
 drivers/gpu/drm/xe/xe_bo_evict.c              |  8 +-
 drivers/gpu/drm/xe/xe_device.c                | 14 ++-
 drivers/gpu/drm/xe/xe_device_types.h          | 36 +++++++
 drivers/gpu/drm/xe/xe_ggtt.c                  | 30 +++---
 drivers/gpu/drm/xe/xe_ggtt.h                  |  6 +-
 drivers/gpu/drm/xe/xe_ggtt_types.h            |  2 +-
 drivers/gpu/drm/xe/xe_gt.c                    | 93 ++-----------------
 drivers/gpu/drm/xe/xe_gt_debugfs.c            |  2 +-
 drivers/gpu/drm/xe/xe_gt_pagefault.c          |  6 +-
 drivers/gpu/drm/xe/xe_gt_types.h              | 38 --------
 drivers/gpu/drm/xe/xe_irq.c                   |  2 +-
 drivers/gpu/drm/xe/xe_mmio.c                  | 45 +++++----
 drivers/gpu/drm/xe/xe_pci.c                   |  2 -
 drivers/gpu/drm/xe/xe_pt.c                    |  4 +-
 drivers/gpu/drm/xe/xe_query.c                 |  4 +-
 drivers/gpu/drm/xe/xe_res_cursor.h            |  2 +-
 drivers/gpu/drm/xe/xe_tile.c                  | 74 +++++++++++++++
 drivers/gpu/drm/xe/xe_tile.h                  | 14 +++
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c          | 16 ++--
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.h          |  4 +-
 drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h    |  6 +-
 26 files changed, 245 insertions(+), 232 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_tile.c
 create mode 100644 drivers/gpu/drm/xe/xe_tile.h

diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index b6c41cd7dbe3..7da33bf5a0e4 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -83,6 +83,7 @@ xe-y += xe_bb.o \
 	xe_sched_job.o \
 	xe_step.o \
 	xe_sync.o \
+	xe_tile.o \
 	xe_trace.o \
 	xe_ttm_sys_mgr.o \
 	xe_ttm_stolen_mgr.o \
diff --git a/drivers/gpu/drm/xe/display/xe_fb_pin.c b/drivers/gpu/drm/xe/display/xe_fb_pin.c
index ed691d28b34d..78ac58244f24 100644
--- a/drivers/gpu/drm/xe/display/xe_fb_pin.c
+++ b/drivers/gpu/drm/xe/display/xe_fb_pin.c
@@ -123,7 +123,7 @@ static int __xe_pin_fb_vma_ggtt(struct intel_framebuffer *fb,
 {
 	struct xe_bo *bo = intel_fb_obj(&fb->base);
 	struct xe_device *xe = to_xe_device(fb->base.dev);
-	struct xe_ggtt *ggtt = to_gt(xe)->mem.ggtt;
+	struct xe_ggtt *ggtt = xe_device_get_root_tile(xe)->mem.ggtt;
 	u32 align;
 	int ret;
 
@@ -173,7 +173,7 @@ static int __xe_pin_fb_vma_ggtt(struct intel_framebuffer *fb,
 					   rot_info->plane[i].dst_stride);
 	}
 
-	xe_ggtt_invalidate(to_gt(xe));
+	xe_ggtt_invalidate(ggtt);
 
 out:
 	mutex_unlock(&ggtt->lock);
@@ -233,7 +233,7 @@ static struct i915_vma *__xe_pin_fb_vma(struct intel_framebuffer *fb,
 static void __xe_unpin_fb_vma(struct i915_vma *vma)
 {
 	struct xe_device *xe = to_xe_device(vma->bo->ttm.base.dev);
-	struct xe_ggtt *ggtt = to_gt(xe)->mem.ggtt;
+	struct xe_ggtt *ggtt = xe_device_get_root_tile(xe)->mem.ggtt;
 
 	if (vma->dpt)
 		xe_bo_unpin_map_no_vm(vma->dpt);
diff --git a/drivers/gpu/drm/xe/display/xe_plane_initial.c b/drivers/gpu/drm/xe/display/xe_plane_initial.c
index d0f91f37b6d8..556ede2e459e 100644
--- a/drivers/gpu/drm/xe/display/xe_plane_initial.c
+++ b/drivers/gpu/drm/xe/display/xe_plane_initial.c
@@ -51,7 +51,7 @@ static struct xe_bo *
 initial_plane_bo(struct xe_device *xe,
 		 struct intel_initial_plane_config *plane_config)
 {
-	struct xe_gt *gt0 = xe_device_get_gt(xe, 0);
+	struct xe_tile *tile0 = xe_device_get_root_tile(xe);
 	struct xe_bo *bo;
 	resource_size_t phys_base;
 	u32 base, size, flags;
@@ -64,7 +64,7 @@ initial_plane_bo(struct xe_device *xe,
 
 	base = round_down(plane_config->base, page_size);
 	if (IS_DGFX(xe)) {
-		u64 __iomem *gte = gt0->mem.ggtt->gsm;
+		u64 __iomem *gte = tile0->mem.ggtt->gsm;
 		u64 pte;
 
 		gte += base / XE_PAGE_SIZE;
@@ -83,7 +83,7 @@ initial_plane_bo(struct xe_device *xe,
 		 * We don't currently expect this to ever be placed in the
 		 * stolen portion.
 		 */
-		if (phys_base >= gt0->mem.vram.size) {
+		if (phys_base >= tile0->mem.vram.size) {
 			drm_err(&xe->drm,
 				"Initial plane programming using invalid range, phys_base=%pa\n",
 				&phys_base);
@@ -115,7 +115,7 @@ initial_plane_bo(struct xe_device *xe,
 			page_size);
 	size -= base;
 
-	bo = xe_bo_create_pin_map_at(xe, gt0, NULL, size, phys_base,
+	bo = xe_bo_create_pin_map_at(xe, &tile0->primary_gt, NULL, size, phys_base,
 				     ttm_bo_type_kernel, flags);
 	if (IS_ERR(bo)) {
 		drm_dbg(&xe->drm,
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index c82e995df779..5dbca5bbca8f 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -71,25 +71,25 @@ static bool xe_bo_is_user(struct xe_bo *bo)
 	return bo->flags & XE_BO_CREATE_USER_BIT;
 }
 
-static struct xe_gt *
-mem_type_to_gt(struct xe_device *xe, u32 mem_type)
+static struct xe_tile *
+mem_type_to_tile(struct xe_device *xe, u32 mem_type)
 {
 	XE_BUG_ON(mem_type != XE_PL_STOLEN && !mem_type_is_vram(mem_type));
 
-	return xe_device_get_gt(xe, mem_type == XE_PL_STOLEN ? 0 : (mem_type - XE_PL_VRAM0));
+	return &xe->tiles[mem_type == XE_PL_STOLEN ? 0 : (mem_type - XE_PL_VRAM0)];
 }
 
 /**
- * xe_bo_to_gt() - Get a GT from a BO's memory location
+ * xe_bo_to_tile() - Get a tile from a BO's memory location
  * @bo: The buffer object
  *
- * Get a GT from a BO's memory location, should be called on BOs in VRAM only.
+ * Get a tile from a BO's memory location, should be called on BOs in VRAM only.
  *
- * Return: xe_gt object which is closest to the BO
+ * Return: xe_tile object which is closest to the BO
  */
-struct xe_gt *xe_bo_to_gt(struct xe_bo *bo)
+struct xe_tile *xe_bo_to_tile(struct xe_bo *bo)
 {
-	return mem_type_to_gt(xe_bo_device(bo), bo->ttm.resource->mem_type);
+	return mem_type_to_tile(xe_bo_device(bo), bo->ttm.resource->mem_type);
 }
 
 static void try_add_system(struct xe_bo *bo, struct ttm_place *places,
@@ -109,9 +109,9 @@ static void try_add_system(struct xe_bo *bo, struct ttm_place *places,
 static void add_vram(struct xe_device *xe, struct xe_bo *bo,
 		     struct ttm_place *places, u32 bo_flags, u32 mem_type, u32 *c)
 {
-	struct xe_gt *gt = mem_type_to_gt(xe, mem_type);
+	struct xe_tile *tile = mem_type_to_tile(xe, mem_type);
 
-	XE_BUG_ON(!gt->mem.vram.size);
+	XE_BUG_ON(!tile->mem.vram.size);
 
 	places[*c] = (struct ttm_place) {
 		.mem_type = mem_type,
@@ -356,7 +356,7 @@ static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
 				 struct ttm_resource *mem)
 {
 	struct xe_device *xe = ttm_to_xe_device(bdev);
-	struct xe_gt *gt;
+	struct xe_tile *tile;
 
 	switch (mem->mem_type) {
 	case XE_PL_SYSTEM:
@@ -364,15 +364,15 @@ static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
 		return 0;
 	case XE_PL_VRAM0:
 	case XE_PL_VRAM1:
-		gt = mem_type_to_gt(xe, mem->mem_type);
+		tile = mem_type_to_tile(xe, mem->mem_type);
 		mem->bus.offset = mem->start << PAGE_SHIFT;
 
-		if (gt->mem.vram.mapping &&
+		if (tile->mem.vram.mapping &&
 		    mem->placement & TTM_PL_FLAG_CONTIGUOUS)
-			mem->bus.addr = (u8 *)gt->mem.vram.mapping +
+			mem->bus.addr = (u8 *)tile->mem.vram.mapping +
 				mem->bus.offset;
 
-		mem->bus.offset += gt->mem.vram.io_start;
+		mem->bus.offset += tile->mem.vram.io_start;
 		mem->bus.is_iomem = true;
 
 #if  !defined(CONFIG_X86)
@@ -632,9 +632,9 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 	if (bo->gt)
 		gt = bo->gt;
 	else if (resource_is_vram(new_mem))
-		gt = mem_type_to_gt(xe, new_mem->mem_type);
+		gt = &mem_type_to_tile(xe, new_mem->mem_type)->primary_gt;
 	else if (resource_is_vram(old_mem))
-		gt = mem_type_to_gt(xe, old_mem->mem_type);
+		gt = &mem_type_to_tile(xe, old_mem->mem_type)->primary_gt;
 
 	XE_BUG_ON(!gt);
 	XE_BUG_ON(!gt->migrate);
@@ -658,7 +658,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 
 			/* Create a new VMAP once kernel BO back in VRAM */
 			if (!ret && resource_is_vram(new_mem)) {
-				void *new_addr = gt->mem.vram.mapping +
+				void *new_addr = gt_to_tile(gt)->mem.vram.mapping +
 					(new_mem->start << PAGE_SHIFT);
 
 				if (XE_WARN_ON(new_mem->start == XE_BO_INVALID_OFFSET)) {
@@ -830,14 +830,14 @@ static unsigned long xe_ttm_io_mem_pfn(struct ttm_buffer_object *ttm_bo,
 {
 	struct xe_device *xe = ttm_to_xe_device(ttm_bo->bdev);
 	struct xe_bo *bo = ttm_to_xe_bo(ttm_bo);
-	struct xe_gt *gt = mem_type_to_gt(xe, ttm_bo->resource->mem_type);
+	struct xe_tile *tile = mem_type_to_tile(xe, ttm_bo->resource->mem_type);
 	struct xe_res_cursor cursor;
 
 	if (ttm_bo->resource->mem_type == XE_PL_STOLEN)
 		return xe_ttm_stolen_io_offset(bo, page_offset << PAGE_SHIFT) >> PAGE_SHIFT;
 
 	xe_res_first(ttm_bo->resource, (u64)page_offset << PAGE_SHIFT, 0, &cursor);
-	return (gt->mem.vram.io_start + cursor.start) >> PAGE_SHIFT;
+	return (tile->mem.vram.io_start + cursor.start) >> PAGE_SHIFT;
 }
 
 static void __xe_bo_vunmap(struct xe_bo *bo);
@@ -958,7 +958,7 @@ static void xe_ttm_bo_destroy(struct ttm_buffer_object *ttm_bo)
 	WARN_ON(!list_empty(&bo->vmas));
 
 	if (bo->ggtt_node.size)
-		xe_ggtt_remove_bo(bo->gt->mem.ggtt, bo);
+		xe_ggtt_remove_bo(gt_to_tile(bo->gt)->mem.ggtt, bo);
 
 	if (bo->vm && xe_bo_is_user(bo))
 		xe_vm_put(bo->vm);
@@ -1235,10 +1235,10 @@ xe_bo_create_locked_range(struct xe_device *xe,
 		XE_BUG_ON(!gt);
 
 		if (flags & XE_BO_FIXED_PLACEMENT_BIT) {
-			err = xe_ggtt_insert_bo_at(gt->mem.ggtt, bo,
+			err = xe_ggtt_insert_bo_at(gt_to_tile(gt)->mem.ggtt, bo,
 						   start + bo->size, U64_MAX);
 		} else {
-			err = xe_ggtt_insert_bo(gt->mem.ggtt, bo);
+			err = xe_ggtt_insert_bo(gt_to_tile(gt)->mem.ggtt, bo);
 		}
 		if (err)
 			goto err_unlock_put_bo;
@@ -1338,12 +1338,12 @@ struct xe_bo *xe_bo_create_from_data(struct xe_device *xe, struct xe_gt *gt,
 uint64_t vram_region_io_offset(struct ttm_resource *res)
 {
 	struct xe_device *xe = ttm_to_xe_device(res->bo->bdev);
-	struct xe_gt *gt = mem_type_to_gt(xe, res->mem_type);
+	struct xe_tile *tile = mem_type_to_tile(xe, res->mem_type);
 
 	if (res->mem_type == XE_PL_STOLEN)
 		return xe_ttm_stolen_gpu_offset(xe);
 
-	return gt->mem.vram.io_start - xe->mem.vram.io_start;
+	return tile->mem.vram.io_start - xe->mem.vram.io_start;
 }
 
 /**
diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index 7e111332c35a..7a79f3893260 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -22,7 +22,7 @@
 /* -- */
 #define XE_BO_CREATE_STOLEN_BIT		BIT(4)
 #define XE_BO_CREATE_VRAM_IF_DGFX(gt) \
-	(IS_DGFX(gt_to_xe(gt)) ? XE_BO_CREATE_VRAM0_BIT << gt->info.vram_id : \
+	(IS_DGFX(gt_to_xe(gt)) ? XE_BO_CREATE_VRAM0_BIT << gt_to_tile(gt)->id : \
 	 XE_BO_CREATE_SYSTEM_BIT)
 #define XE_BO_CREATE_GGTT_BIT		BIT(5)
 #define XE_BO_CREATE_IGNORE_MIN_PAGE_SIZE_BIT BIT(6)
@@ -107,7 +107,7 @@ struct xe_bo *xe_bo_create_from_data(struct xe_device *xe, struct xe_gt *gt,
 int xe_bo_placement_for_flags(struct xe_device *xe, struct xe_bo *bo,
 			      u32 bo_flags);
 
-struct xe_gt *xe_bo_to_gt(struct xe_bo *bo);
+struct xe_tile *xe_bo_to_tile(struct xe_bo *bo);
 
 static inline struct xe_bo *ttm_to_xe_bo(const struct ttm_buffer_object *bo)
 {
diff --git a/drivers/gpu/drm/xe/xe_bo_evict.c b/drivers/gpu/drm/xe/xe_bo_evict.c
index 6642c5f52009..a72963c54bf3 100644
--- a/drivers/gpu/drm/xe/xe_bo_evict.c
+++ b/drivers/gpu/drm/xe/xe_bo_evict.c
@@ -149,9 +149,11 @@ int xe_bo_restore_kernel(struct xe_device *xe)
 		}
 
 		if (bo->flags & XE_BO_CREATE_GGTT_BIT) {
-			mutex_lock(&bo->gt->mem.ggtt->lock);
-			xe_ggtt_map_bo(bo->gt->mem.ggtt, bo);
-			mutex_unlock(&bo->gt->mem.ggtt->lock);
+			struct xe_tile *tile = gt_to_tile(bo->gt);
+
+			mutex_lock(&tile->mem.ggtt->lock);
+			xe_ggtt_map_bo(tile->mem.ggtt, bo);
+			mutex_unlock(&tile->mem.ggtt->lock);
 		}
 
 		/*
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 32cc83c43b2a..038074a90584 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -27,6 +27,7 @@
 #include "xe_pcode.h"
 #include "xe_pm.h"
 #include "xe_query.h"
+#include "xe_tile.h"
 #include "xe_ttm_stolen_mgr.h"
 #include "xe_ttm_sys_mgr.h"
 #include "xe_vm.h"
@@ -239,6 +240,7 @@ static void xe_device_sanitize(struct drm_device *drm, void *arg)
 
 int xe_device_probe(struct xe_device *xe)
 {
+	struct xe_tile *tile;
 	struct xe_gt *gt;
 	int err;
 	u8 id;
@@ -248,8 +250,12 @@ int xe_device_probe(struct xe_device *xe)
 	if (err)
 		return err;
 
-	for_each_gt(gt, xe, id) {
-		err = xe_gt_alloc(xe, gt);
+	for_each_tile(tile, xe, id) {
+		err = xe_tile_alloc(tile);
+		if (err)
+			return err;
+
+		err = xe_gt_alloc(xe, &tile->primary_gt);
 		if (err)
 			return err;
 	}
@@ -284,8 +290,8 @@ int xe_device_probe(struct xe_device *xe)
 
 	xe_ttm_sys_mgr_init(xe);
 
-	for_each_gt(gt, xe, id) {
-		err = xe_gt_init_noalloc(gt);
+	for_each_tile(tile, xe, id) {
+		err = xe_tile_init_noalloc(tile);
 		if (err)
 			goto err_irq_shutdown;
 	}
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 2481b2045284..6b9e7847161c 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -53,6 +53,8 @@
 		 const struct xe_tile *: (const struct xe_device *)((tile__)->xe),	\
 		 struct xe_tile *: (tile__)->xe)
 
+struct xe_ggtt;
+
 /**
  * struct xe_tile - hardware tile structure
  *
@@ -96,6 +98,40 @@ struct xe_tile {
 		/** @regs: pointer to tile's MMIO space (starting with registers) */
 		void *regs;
 	} mmio;
+
+	/** @mem: memory management info for tile */
+	struct {
+		/**
+		 * @vram: VRAM info for tile.
+		 *
+		 * Although VRAM is associated with a specific tile, it can
+		 * still be accessed by all tiles' GTs.
+		 */
+		struct {
+			/** @io_start: IO start address of this VRAM instance */
+			resource_size_t io_start;
+			/**
+			 * @io_size: IO size of this VRAM instance
+			 *
+			 * This represents how much of this VRAM we can access
+			 * via the CPU through the VRAM BAR. This can be smaller
+			 * than @size, in which case only part of VRAM is CPU
+			 * accessible (typically the first 256M). This
+			 * configuration is known as small-bar.
+			 */
+			resource_size_t io_size;
+			/** @size: size of VRAM. */
+			resource_size_t size;
+			/** @mapping: pointer to VRAM mappable space */
+			void *__iomem mapping;
+		} vram;
+
+		/** @vram_mgr: VRAM TTM manager */
+		struct xe_ttm_vram_mgr *vram_mgr;
+
+		/** @ggtt: Global graphics translation table */
+		struct xe_ggtt *ggtt;
+	} mem;
 };
 
 /**
diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index 200976da3dc1..52d293d61cc0 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -90,24 +90,19 @@ static void ggtt_fini_noalloc(struct drm_device *drm, void *arg)
 	xe_bo_unpin_map_no_vm(ggtt->scratch);
 }
 
-int xe_ggtt_init_noalloc(struct xe_gt *gt, struct xe_ggtt *ggtt)
+int xe_ggtt_init_noalloc(struct xe_ggtt *ggtt)
 {
-	struct xe_device *xe = gt_to_xe(gt);
-	struct xe_tile *tile = gt_to_tile(gt);
+	struct xe_device *xe = tile_to_xe(ggtt->tile);
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
 	unsigned int gsm_size;
 
-	XE_BUG_ON(xe_gt_is_media_type(gt));
-
-	ggtt->gt = gt;
-
 	gsm_size = probe_gsm_size(pdev);
 	if (gsm_size == 0) {
 		drm_err(&xe->drm, "Hardware reported no preallocated GSM\n");
 		return -ENOMEM;
 	}
 
-	ggtt->gsm = tile->mmio.regs + SZ_8M;
+	ggtt->gsm = ggtt->tile->mmio.regs + SZ_8M;
 	ggtt->size = (gsm_size / 8) * (u64) XE_PAGE_SIZE;
 
 	if (IS_DGFX(xe) && xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
@@ -147,13 +142,14 @@ static void xe_ggtt_initial_clear(struct xe_ggtt *ggtt)
 	drm_mm_for_each_hole(hole, &ggtt->mm, start, end)
 		xe_ggtt_clear(ggtt, start, end - start);
 
-	xe_ggtt_invalidate(ggtt->gt);
+	xe_ggtt_invalidate(ggtt);
 	mutex_unlock(&ggtt->lock);
 }
 
-int xe_ggtt_init(struct xe_gt *gt, struct xe_ggtt *ggtt)
+int xe_ggtt_init(struct xe_ggtt *ggtt)
 {
-	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_device *xe = tile_to_xe(ggtt->tile);
+	struct xe_gt *gt = &ggtt->tile->primary_gt;
 	unsigned int flags;
 	int err;
 
@@ -193,8 +189,14 @@ int xe_ggtt_init(struct xe_gt *gt, struct xe_ggtt *ggtt)
 #define PVC_GUC_TLB_INV_DESC1			XE_REG(0xcf80)
 #define   PVC_GUC_TLB_INV_DESC1_INVALIDATE	REG_BIT(6)
 
-void xe_ggtt_invalidate(struct xe_gt *gt)
+void xe_ggtt_invalidate(struct xe_ggtt *ggtt)
 {
+	/*
+	 * TODO: Loop over each GT in tile once media GT support is
+	 * re-added
+	 */
+	struct xe_gt *gt = &ggtt->tile->primary_gt;
+
 	/* TODO: vfunc for GuC vs. non-GuC */
 
 	if (gt->uc.guc.submission_state.enabled) {
@@ -267,7 +269,7 @@ void xe_ggtt_map_bo(struct xe_ggtt *ggtt, struct xe_bo *bo)
 		xe_ggtt_set_pte(ggtt, start + offset, pte);
 	}
 
-	xe_ggtt_invalidate(ggtt->gt);
+	xe_ggtt_invalidate(ggtt);
 }
 
 static int __xe_ggtt_insert_bo_at(struct xe_ggtt *ggtt, struct xe_bo *bo,
@@ -318,7 +320,7 @@ void xe_ggtt_remove_node(struct xe_ggtt *ggtt, struct drm_mm_node *node)
 	drm_mm_remove_node(node);
 	node->size = 0;
 
-	xe_ggtt_invalidate(ggtt->gt);
+	xe_ggtt_invalidate(ggtt);
 
 	mutex_unlock(&ggtt->lock);
 }
diff --git a/drivers/gpu/drm/xe/xe_ggtt.h b/drivers/gpu/drm/xe/xe_ggtt.h
index 333947100504..205a6d058bbd 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.h
+++ b/drivers/gpu/drm/xe/xe_ggtt.h
@@ -12,9 +12,9 @@ struct drm_printer;
 
 u64 xe_ggtt_pte_encode(struct xe_bo *bo, u64 bo_offset);
 void xe_ggtt_set_pte(struct xe_ggtt *ggtt, u64 addr, u64 pte);
-void xe_ggtt_invalidate(struct xe_gt *gt);
-int xe_ggtt_init_noalloc(struct xe_gt *gt, struct xe_ggtt *ggtt);
-int xe_ggtt_init(struct xe_gt *gt, struct xe_ggtt *ggtt);
+void xe_ggtt_invalidate(struct xe_ggtt *ggtt);
+int xe_ggtt_init_noalloc(struct xe_ggtt *ggtt);
+int xe_ggtt_init(struct xe_ggtt *ggtt);
 void xe_ggtt_printk(struct xe_ggtt *ggtt, const char *prefix);
 
 int xe_ggtt_insert_special_node(struct xe_ggtt *ggtt, struct drm_mm_node *node,
diff --git a/drivers/gpu/drm/xe/xe_ggtt_types.h b/drivers/gpu/drm/xe/xe_ggtt_types.h
index ea70aaef4b31..d34b3e733945 100644
--- a/drivers/gpu/drm/xe/xe_ggtt_types.h
+++ b/drivers/gpu/drm/xe/xe_ggtt_types.h
@@ -12,7 +12,7 @@ struct xe_bo;
 struct xe_gt;
 
 struct xe_ggtt {
-	struct xe_gt *gt;
+	struct xe_tile *tile;
 
 	u64 size;
 
diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index cbe063a40aca..1e424ce8ef3e 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -36,7 +36,6 @@
 #include "xe_ring_ops.h"
 #include "xe_sa.h"
 #include "xe_sched_job.h"
-#include "xe_ttm_vram_mgr.h"
 #include "xe_tuning.h"
 #include "xe_uc.h"
 #include "xe_vm.h"
@@ -45,64 +44,23 @@
 
 struct xe_gt *xe_find_full_gt(struct xe_gt *gt)
 {
-	struct xe_gt *search;
-	u8 id;
-
-	XE_BUG_ON(!xe_gt_is_media_type(gt));
-
-	for_each_gt(search, gt_to_xe(gt), id) {
-		if (search->info.vram_id == gt->info.vram_id)
-			return search;
-	}
-
-	XE_BUG_ON("NOT POSSIBLE");
-	return NULL;
+	/*
+	 * FIXME: Media GTs are disabled at the moment.  Once re-enabled,
+	 * the proper handling here is to return the primary GT from the
+	 * parameter GT's tile.
+	 */
+	return gt;
 }
 
 int xe_gt_alloc(struct xe_device *xe, struct xe_gt *gt)
 {
-	struct drm_device *drm = &xe->drm;
-
 	XE_BUG_ON(gt->info.type == XE_GT_TYPE_UNINITIALIZED);
 
-	if (!xe_gt_is_media_type(gt)) {
-		gt->mem.ggtt = drmm_kzalloc(drm, sizeof(*gt->mem.ggtt),
-					    GFP_KERNEL);
-		if (!gt->mem.ggtt)
-			return -ENOMEM;
-
-		gt->mem.vram_mgr = drmm_kzalloc(drm, sizeof(*gt->mem.vram_mgr),
-						GFP_KERNEL);
-		if (!gt->mem.vram_mgr)
-			return -ENOMEM;
-
-	} else {
-		struct xe_gt *full_gt = xe_find_full_gt(gt);
-
-		gt->mem.ggtt = full_gt->mem.ggtt;
-		gt->mem.vram_mgr = full_gt->mem.vram_mgr;
-	}
-
 	gt->ordered_wq = alloc_ordered_workqueue("gt-ordered-wq", 0);
 
 	return 0;
 }
 
-static int gt_ttm_mgr_init(struct xe_gt *gt)
-{
-	struct xe_device *xe = gt_to_xe(gt);
-	int err;
-
-	if (gt->mem.vram.size) {
-		err = xe_ttm_vram_mgr_init(gt, gt->mem.vram_mgr);
-		if (err)
-			return err;
-		xe->info.mem_region_mask |= BIT(gt->info.vram_id) << 1;
-	}
-
-	return 0;
-}
-
 void xe_gt_sanitize(struct xe_gt *gt)
 {
 	/*
@@ -320,43 +278,6 @@ int xe_gt_init_early(struct xe_gt *gt)
 	return 0;
 }
 
-/**
- * xe_gt_init_noalloc - Init GT up to the point where allocations can happen.
- * @gt: The GT to initialize.
- *
- * This function prepares the GT to allow memory allocations to VRAM, but is not
- * allowed to allocate memory itself. This state is useful for display readout,
- * because the inherited display framebuffer will otherwise be overwritten as it
- * is usually put at the start of VRAM.
- *
- * Returns: 0 on success, negative error code on error.
- */
-int xe_gt_init_noalloc(struct xe_gt *gt)
-{
-	int err, err2;
-
-	if (xe_gt_is_media_type(gt))
-		return 0;
-
-	xe_device_mem_access_get(gt_to_xe(gt));
-	err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
-	if (err)
-		goto err;
-
-	err = gt_ttm_mgr_init(gt);
-	if (err)
-		goto err_force_wake;
-
-	err = xe_ggtt_init_noalloc(gt, gt->mem.ggtt);
-
-err_force_wake:
-	err2 = xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
-	XE_WARN_ON(err2);
-	xe_device_mem_access_put(gt_to_xe(gt));
-err:
-	return err;
-}
-
 static int gt_fw_domain_init(struct xe_gt *gt)
 {
 	int err, i;
@@ -369,7 +290,7 @@ static int gt_fw_domain_init(struct xe_gt *gt)
 	xe_pat_init(gt);
 
 	if (!xe_gt_is_media_type(gt)) {
-		err = xe_ggtt_init(gt, gt->mem.ggtt);
+		err = xe_ggtt_init(gt_to_tile(gt)->mem.ggtt);
 		if (err)
 			goto err_force_wake;
 	}
diff --git a/drivers/gpu/drm/xe/xe_gt_debugfs.c b/drivers/gpu/drm/xe/xe_gt_debugfs.c
index c45486c2015a..b71b584c9bdc 100644
--- a/drivers/gpu/drm/xe/xe_gt_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_gt_debugfs.c
@@ -97,7 +97,7 @@ static int ggtt(struct seq_file *m, void *data)
 	struct xe_gt *gt = node_to_gt(m->private);
 	struct drm_printer p = drm_seq_file_printer(m);
 
-	return xe_ggtt_dump(gt->mem.ggtt, &p);
+	return xe_ggtt_dump(gt_to_tile(gt)->mem.ggtt, &p);
 }
 
 static int register_save_restore(struct seq_file *m, void *data)
diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
index 1677640e1075..f4f3d95ae6b1 100644
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
@@ -107,6 +107,7 @@ static struct xe_vma *lookup_vma(struct xe_vm *vm, u64 page_addr)
 static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 {
 	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_tile *tile = gt_to_tile(gt);
 	struct xe_vm *vm;
 	struct xe_vma *vma = NULL;
 	struct xe_bo *bo;
@@ -195,7 +196,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 		}
 
 		/* Migrate to VRAM, move should invalidate the VMA first */
-		ret = xe_bo_migrate(bo, XE_PL_VRAM0 + gt->info.vram_id);
+		ret = xe_bo_migrate(bo, XE_PL_VRAM0 + tile->id);
 		if (ret)
 			goto unlock_dma_resv;
 	} else if (bo) {
@@ -498,6 +499,7 @@ static struct xe_vma *get_acc_vma(struct xe_vm *vm, struct acc *acc)
 static int handle_acc(struct xe_gt *gt, struct acc *acc)
 {
 	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_tile *tile = gt_to_tile(gt);
 	struct xe_vm *vm;
 	struct xe_vma *vma;
 	struct xe_bo *bo;
@@ -553,7 +555,7 @@ static int handle_acc(struct xe_gt *gt, struct acc *acc)
 		goto unlock_vm;
 
 	/* Migrate to VRAM, move should invalidate the VMA first */
-	ret = xe_bo_migrate(bo, XE_PL_VRAM0 + gt->info.vram_id);
+	ret = xe_bo_migrate(bo, XE_PL_VRAM0 + tile->id);
 
 	if (only_needs_bo_lock(bo))
 		xe_bo_unlock(bo, &ww);
diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
index 03dd625b2781..bb5271277e3b 100644
--- a/drivers/gpu/drm/xe/xe_gt_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_types.h
@@ -14,11 +14,8 @@
 #include "xe_uc_types.h"
 
 struct xe_engine_ops;
-struct xe_ggtt;
 struct xe_migrate;
 struct xe_ring_ops;
-struct xe_ttm_gtt_mgr;
-struct xe_ttm_vram_mgr;
 
 enum xe_gt_type {
 	XE_GT_TYPE_UNINITIALIZED,
@@ -109,8 +106,6 @@ struct xe_gt {
 		enum xe_gt_type type;
 		/** @id: id of GT */
 		u8 id;
-		/** @vram: id of the VRAM for this GT */
-		u8 vram_id;
 		/** @clock_freq: clock frequency */
 		u32 clock_freq;
 		/** @engine_mask: mask of engines present on GT */
@@ -145,39 +140,6 @@ struct xe_gt {
 	 */
 	struct xe_reg_sr reg_sr;
 
-	/**
-	 * @mem: memory management info for GT, multiple GTs can point to same
-	 * objects (virtual split)
-	 */
-	struct {
-		/**
-		 * @vram: VRAM info for GT, multiple GTs can point to same info
-		 * (virtual split), can be subset of global device VRAM
-		 */
-		struct {
-			/** @io_start: IO start address of this VRAM instance */
-			resource_size_t io_start;
-			/**
-			 * @io_size: IO size of this VRAM instance
-			 *
-			 * This represents how much of this VRAM we can access
-			 * via the CPU through the VRAM BAR. This can be smaller
-			 * than @size, in which case only part of VRAM is CPU
-			 * accessible (typically the first 256M). This
-			 * configuration is known as small-bar.
-			 */
-			resource_size_t io_size;
-			/** @size: size of VRAM. */
-			resource_size_t size;
-			/** @mapping: pointer to VRAM mappable space */
-			void *__iomem mapping;
-		} vram;
-		/** @vram_mgr: VRAM TTM manager */
-		struct xe_ttm_vram_mgr *vram_mgr;
-		/** @ggtt: Global graphics translation table */
-		struct xe_ggtt *ggtt;
-	} mem;
-
 	/** @reset: state for GT resets */
 	struct {
 		/**
diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
index 5bf359c81cc5..5be31855d789 100644
--- a/drivers/gpu/drm/xe/xe_irq.c
+++ b/drivers/gpu/drm/xe/xe_irq.c
@@ -369,7 +369,7 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
 	}
 
 	for_each_gt(gt, xe, id) {
-		if ((master_tile_ctl & DG1_MSTR_TILE(gt->info.vram_id)) == 0)
+		if ((master_tile_ctl & DG1_MSTR_TILE(gt_to_tile(gt)->id)) == 0)
 			continue;
 
 		if (!xe_gt_is_media_type(gt))
diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
index 54fa1212fcd9..17b3a9880409 100644
--- a/drivers/gpu/drm/xe/xe_mmio.c
+++ b/drivers/gpu/drm/xe/xe_mmio.c
@@ -182,7 +182,7 @@ int xe_mmio_total_vram_size(struct xe_device *xe, u64 *vram_size, u64 *usable_si
 int xe_mmio_probe_vram(struct xe_device *xe)
 {
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
-	struct xe_gt *gt;
+	struct xe_tile *tile;
 	u8 id;
 	u64 vram_size;
 	u64 original_size;
@@ -195,11 +195,11 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 		xe->mem.vram.io_start = 0;
 		xe->mem.vram.io_size = 0;
 
-		for_each_gt(gt, xe, id) {
-			gt->mem.vram.mapping = 0;
-			gt->mem.vram.size = 0;
-			gt->mem.vram.io_start = 0;
-			gt->mem.vram.io_size = 0;
+		for_each_tile(tile, xe, id) {
+			tile->mem.vram.mapping = 0;
+			tile->mem.vram.size = 0;
+			tile->mem.vram.io_start = 0;
+			tile->mem.vram.io_size = 0;
 		}
 		return 0;
 	}
@@ -209,7 +209,6 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 		return -ENXIO;
 	}
 
-	gt = xe_device_get_gt(xe, 0);
 	original_size = pci_resource_len(pdev, GEN12_LMEM_BAR);
 
 	err = xe_mmio_total_vram_size(xe, &vram_size, &usable_size);
@@ -239,23 +238,19 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 		u8 adj_tile_count = xe->info.tile_count;
 		resource_size_t size, io_start, io_size;
 
-		for_each_gt(gt, xe, id)
-			if (xe_gt_is_media_type(gt))
-				--adj_tile_count;
-
 		XE_BUG_ON(!adj_tile_count);
 
 		size = xe->mem.vram.size / adj_tile_count;
 		io_start = xe->mem.vram.io_start;
 		io_size = xe->mem.vram.io_size;
 
-		for_each_gt(gt, xe, id) {
-			if (id && !xe_gt_is_media_type(gt)) {
+		for_each_tile(tile, xe, id) {
+			if (id) {
 				io_size -= min(io_size, size);
 				io_start += io_size;
 			}
 
-			gt->mem.vram.size = size;
+			tile->mem.vram.size = size;
 
 			/*
 			 * XXX: multi-tile small-bar might be wild. Hopefully
@@ -263,10 +258,10 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 			 * we care about.
 			 */
 
-			gt->mem.vram.io_size = min(size, io_size);
+			tile->mem.vram.io_size = min(size, io_size);
 			if (io_size) {
-				gt->mem.vram.io_start = io_start;
-				gt->mem.vram.mapping = xe->mem.vram.mapping +
+				tile->mem.vram.io_start = io_start;
+				tile->mem.vram.mapping = xe->mem.vram.mapping +
 					(io_start - xe->mem.vram.io_start);
 			} else {
 				drm_err(&xe->drm, "Tile without any CPU visible VRAM. Aborting.\n");
@@ -274,16 +269,18 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 			}
 
 			drm_info(&xe->drm, "VRAM[%u, %u]: %pa, %pa\n",
-				 id, gt->info.vram_id, &gt->mem.vram.io_start,
-				 &gt->mem.vram.size);
+				 id, tile->id, &tile->mem.vram.io_start,
+				 &tile->mem.vram.size);
 		}
 	} else {
-		gt->mem.vram.size = xe->mem.vram.size;
-		gt->mem.vram.io_start = xe->mem.vram.io_start;
-		gt->mem.vram.io_size = xe->mem.vram.io_size;
-		gt->mem.vram.mapping = xe->mem.vram.mapping;
+		tile = xe_device_get_root_tile(xe);
+
+		tile->mem.vram.size = xe->mem.vram.size;
+		tile->mem.vram.io_start = xe->mem.vram.io_start;
+		tile->mem.vram.io_size = xe->mem.vram.io_size;
+		tile->mem.vram.mapping = xe->mem.vram.mapping;
 
-		drm_info(&xe->drm, "VRAM: %pa\n", &gt->mem.vram.size);
+		drm_info(&xe->drm, "VRAM: %pa\n", &tile->mem.vram.size);
 	}
 	return 0;
 }
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index bef65d3a440e..be7c41024838 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -536,7 +536,6 @@ static int xe_info_init(struct xe_device *xe,
 		gt->info.id = id;
 		if (id == 0) {
 			gt->info.type = XE_GT_TYPE_MAIN;
-			gt->info.vram_id = id;
 
 			gt->info.__engine_mask = graphics_desc->hw_engine_mask;
 			if (MEDIA_VER(xe) < 13 && media_desc)
@@ -546,7 +545,6 @@ static int xe_info_init(struct xe_device *xe,
 			gt->mmio.adj_offset = 0;
 		} else {
 			gt->info.type = desc->extra_gts[id - 1].type;
-			gt->info.vram_id = desc->extra_gts[id - 1].vram_id;
 			gt->info.__engine_mask = xe_gt_is_media_type(gt) ?
 				media_desc->hw_engine_mask :
 				graphics_desc->hw_engine_mask;
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index 61126cefe0b5..ad42a21c0e22 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -758,12 +758,12 @@ xe_pt_stage_bind(struct xe_gt *gt, struct xe_vma *vma,
 	int ret;
 
 	if (is_vram) {
-		struct xe_gt *bo_gt = xe_bo_to_gt(bo);
+		struct xe_tile *bo_tile = xe_bo_to_tile(bo);
 
 		xe_walk.default_pte = XE_PPGTT_PTE_LM;
 		if (vma && vma->use_atomic_access_pte_bit)
 			xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE;
-		xe_walk.dma_offset = bo_gt->mem.vram.io_start -
+		xe_walk.dma_offset = bo_tile->mem.vram.io_start -
 			gt_to_xe(gt)->mem.vram.io_start;
 		xe_walk.cache = XE_CACHE_WB;
 	} else {
diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index dd64ff0d2a57..c81652d7f4ec 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -182,7 +182,7 @@ static int query_config(struct xe_device *xe, struct drm_xe_device_query *query)
 	config->num_params = num_params;
 	config->info[XE_QUERY_CONFIG_REV_AND_DEVICE_ID] =
 		xe->info.devid | (xe->info.revid << 16);
-	if (to_gt(xe)->mem.vram.size)
+	if (xe_device_get_root_tile(xe)->mem.vram.size)
 		config->info[XE_QUERY_CONFIG_FLAGS] =
 			XE_QUERY_CONFIG_FLAGS_HAS_VRAM;
 	if (xe->info.enable_guc)
@@ -242,7 +242,7 @@ static int query_gts(struct xe_device *xe, struct drm_xe_device_query *query)
 			gts->gts[id].native_mem_regions = 0x1;
 		else
 			gts->gts[id].native_mem_regions =
-				BIT(gt->info.vram_id) << 1;
+				BIT(gt_to_tile(gt)->id) << 1;
 		gts->gts[id].slow_mem_regions = xe->info.mem_region_mask ^
 			gts->gts[id].native_mem_regions;
 	}
diff --git a/drivers/gpu/drm/xe/xe_res_cursor.h b/drivers/gpu/drm/xe/xe_res_cursor.h
index 4e99fae26b4c..f2ba609712d3 100644
--- a/drivers/gpu/drm/xe/xe_res_cursor.h
+++ b/drivers/gpu/drm/xe/xe_res_cursor.h
@@ -53,7 +53,7 @@ static struct drm_buddy *xe_res_get_buddy(struct ttm_resource *res)
 	struct xe_device *xe = ttm_to_xe_device(res->bo->bdev);
 
 	if (res->mem_type != XE_PL_STOLEN) {
-		return &xe_device_get_gt(xe, res->mem_type - XE_PL_VRAM0)->mem.vram_mgr->mm;
+		return &xe->tiles[res->mem_type - XE_PL_VRAM0].mem.vram_mgr->mm;
 	} else {
 		struct ttm_resource_manager *mgr =
 			ttm_manager_type(&xe->ttm, XE_PL_STOLEN);
diff --git a/drivers/gpu/drm/xe/xe_tile.c b/drivers/gpu/drm/xe/xe_tile.c
new file mode 100644
index 000000000000..9553d252b56c
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_tile.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include <drm/drm_managed.h>
+
+#include "xe_device.h"
+#include "xe_ggtt.h"
+#include "xe_tile.h"
+#include "xe_ttm_vram_mgr.h"
+
+int xe_tile_alloc(struct xe_tile *tile)
+{
+	struct drm_device *drm = &tile_to_xe(tile)->drm;
+
+	tile->mem.ggtt = drmm_kzalloc(drm, sizeof(*tile->mem.ggtt),
+				      GFP_KERNEL);
+	if (!tile->mem.ggtt)
+		return -ENOMEM;
+	tile->mem.ggtt->tile = tile;
+
+	tile->mem.vram_mgr = drmm_kzalloc(drm, sizeof(*tile->mem.vram_mgr), GFP_KERNEL);
+	if (!tile->mem.vram_mgr)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int tile_ttm_mgr_init(struct xe_tile *tile)
+{
+	struct xe_device *xe = tile_to_xe(tile);
+	int err;
+
+	if (tile->mem.vram.size) {
+		err = xe_ttm_vram_mgr_init(tile, tile->mem.vram_mgr);
+		if (err)
+			return err;
+		xe->info.mem_region_mask |= BIT(tile->id) << 1;
+	}
+
+	return 0;
+}
+
+/**
+ * xe_tile_init_noalloc - Init tile up to the point where allocations can happen.
+ * @tile: The tile to initialize.
+ *
+ * This function prepares the tile to allow memory allocations to VRAM, but is
+ * not allowed to allocate memory itself. This state is useful for display
+ * readout, because the inherited display framebuffer will otherwise be
+ * overwritten as it is usually put at the start of VRAM.
+ *
+ * Note that since this is tile initialization, it should not perform any
+ * GT-specific operations, and thus does not need to hold GT forcewake.
+ *
+ * Returns: 0 on success, negative error code on error.
+ */
+int xe_tile_init_noalloc(struct xe_tile *tile)
+{
+	int err;
+
+	xe_device_mem_access_get(tile_to_xe(tile));
+
+	err = tile_ttm_mgr_init(tile);
+	if (err)
+		goto err_mem_access;
+
+	err = xe_ggtt_init_noalloc(tile->mem.ggtt);
+
+err_mem_access:
+	xe_device_mem_access_put(tile_to_xe(tile));
+	return err;
+}
diff --git a/drivers/gpu/drm/xe/xe_tile.h b/drivers/gpu/drm/xe/xe_tile.h
new file mode 100644
index 000000000000..49b64d83ce91
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_tile.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#ifndef __XE_TILE_H__
+#define __XE_TILE_H__
+
+struct xe_tile;
+
+int xe_tile_alloc(struct xe_tile *tile);
+int xe_tile_init_noalloc(struct xe_tile *tile);
+
+#endif
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index 73836b9b7fed..1a84abd35fcf 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -353,16 +353,14 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
 	return drmm_add_action_or_reset(&xe->drm, ttm_vram_mgr_fini, mgr);
 }
 
-int xe_ttm_vram_mgr_init(struct xe_gt *gt, struct xe_ttm_vram_mgr *mgr)
+int xe_ttm_vram_mgr_init(struct xe_tile *tile, struct xe_ttm_vram_mgr *mgr)
 {
-	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_device *xe = tile_to_xe(tile);
 
-	XE_BUG_ON(xe_gt_is_media_type(gt));
+	mgr->tile = tile;
 
-	mgr->gt = gt;
-
-	return __xe_ttm_vram_mgr_init(xe, mgr, XE_PL_VRAM0 + gt->info.vram_id,
-				      gt->mem.vram.size, gt->mem.vram.io_size,
+	return __xe_ttm_vram_mgr_init(xe, mgr, XE_PL_VRAM0 + tile->id,
+				      tile->mem.vram.size, tile->mem.vram.io_size,
 				      PAGE_SIZE);
 }
 
@@ -373,7 +371,7 @@ int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe,
 			      enum dma_data_direction dir,
 			      struct sg_table **sgt)
 {
-	struct xe_gt *gt = xe_device_get_gt(xe, res->mem_type - XE_PL_VRAM0);
+	struct xe_tile *tile = &xe->tiles[res->mem_type - XE_PL_VRAM0];
 	struct xe_res_cursor cursor;
 	struct scatterlist *sg;
 	int num_entries = 0;
@@ -406,7 +404,7 @@ int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe,
 	 */
 	xe_res_first(res, offset, length, &cursor);
 	for_each_sgtable_sg((*sgt), sg, i) {
-		phys_addr_t phys = cursor.start + gt->mem.vram.io_start;
+		phys_addr_t phys = cursor.start + tile->mem.vram.io_start;
 		size_t size = cursor.size;
 		dma_addr_t addr;
 
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
index 35e5367a79fb..6e1d6033d739 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
@@ -10,12 +10,12 @@
 
 enum dma_data_direction;
 struct xe_device;
-struct xe_gt;
+struct xe_tile;
 
 int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
 			   u32 mem_type, u64 size, u64 io_size,
 			   u64 default_page_size);
-int xe_ttm_vram_mgr_init(struct xe_gt *gt, struct xe_ttm_vram_mgr *mgr);
+int xe_ttm_vram_mgr_init(struct xe_tile *tile, struct xe_ttm_vram_mgr *mgr);
 int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe,
 			      struct ttm_resource *res,
 			      u64 offset, u64 length,
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
index 3d9417ff7434..48bb991c14a5 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
@@ -9,7 +9,7 @@
 #include <drm/drm_buddy.h>
 #include <drm/ttm/ttm_device.h>
 
-struct xe_gt;
+struct xe_tile;
 
 /**
  * struct xe_ttm_vram_mgr - XE TTM VRAM manager
@@ -17,8 +17,8 @@ struct xe_gt;
  * Manages placement of TTM resource in VRAM.
  */
 struct xe_ttm_vram_mgr {
-	/** @gt: Graphics tile which the VRAM belongs to */
-	struct xe_gt *gt;
+	/** @tile: Tile which the VRAM belongs to */
+	struct xe_tile *tile;
 	/** @manager: Base TTM resource manager */
 	struct ttm_resource_manager manager;
 	/** @mm: DRM buddy allocator which manages the VRAM */
-- 
2.40.0


* [Intel-xe] [PATCH 07/26] drm/xe: Memory allocations are tile-based, not GT-based
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (5 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 06/26] drm/xe: Move VRAM from GT to tile Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-17  4:56   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 08/26] drm/xe: Move migration from GT to tile Matt Roper
                   ` (27 subsequent siblings)
  34 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

Since memory and address spaces are a tile concept rather than a GT
concept, we need to plumb tile-based handling through lots of
memory-related code.

One shortcoming remains and must be addressed before media GT support
can be re-enabled: although the address space is shared between a
tile's GTs, each GT caches PTEs independently in its own TLB, so TLB
invalidation should be handled at the GT level.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
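Illustration only, not part of the patch: a minimal standalone sketch of
the ownership model described above, where memory belongs to the tile
while each GT keeps its own TLB.  Every sketch_* name, the two-GT limit,
and the SKETCH_PL_VRAM0 value are assumptions made for this example and
are not the driver's real definitions; only the "VRAM0 + tile id"
placement rule is taken from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_GTS_PER_TILE	2	/* hypothetical: primary + media GT */
#define SKETCH_PL_VRAM0		2	/* hypothetical placeholder value */

struct sketch_tile;

/* Hypothetical, heavily simplified stand-in for struct xe_gt. */
struct sketch_gt {
	struct sketch_tile *tile;	/* back-pointer to the owning tile */
	unsigned int tlb_invalidations;	/* invalidations seen by this GT's TLB */
};

/* Hypothetical, heavily simplified stand-in for struct xe_tile. */
struct sketch_tile {
	uint8_t id;
	struct {
		uint64_t vram_size;	/* memory is a per-tile resource */
	} mem;
	struct sketch_gt gts[SKETCH_MAX_GTS_PER_TILE];
	size_t gt_count;
};

/* Set up a tile and point each of its GTs back at it. */
void sketch_tile_init(struct sketch_tile *tile, uint8_t id,
		      uint64_t vram_size, size_t gt_count)
{
	size_t i;

	if (gt_count > SKETCH_MAX_GTS_PER_TILE)
		gt_count = SKETCH_MAX_GTS_PER_TILE;

	tile->id = id;
	tile->mem.vram_size = vram_size;
	tile->gt_count = gt_count;

	for (i = 0; i < SKETCH_MAX_GTS_PER_TILE; i++) {
		tile->gts[i].tile = i < gt_count ? tile : NULL;
		tile->gts[i].tlb_invalidations = 0;
	}
}

/* Memory-related code starts from a GT and resolves its tile. */
struct sketch_tile *sketch_gt_to_tile(struct sketch_gt *gt)
{
	assert(gt->tile);
	return gt->tile;
}

/* VRAM placement is derived from the tile id, not from a per-GT vram_id. */
unsigned int sketch_vram_placement(const struct sketch_tile *tile)
{
	return SKETCH_PL_VRAM0 + tile->id;
}

/*
 * The address space is shared by the tile's GTs, but each GT caches
 * PTEs in its own TLB, so an invalidation has to visit every GT.
 */
void sketch_tile_invalidate_tlbs(struct sketch_tile *tile)
{
	size_t i;

	for (i = 0; i < tile->gt_count; i++)
		tile->gts[i].tlb_invalidations++;
}
```

In this sketch a caller holding a GT would use sketch_gt_to_tile() to
reach anything memory-related, and a page-table update would end with
sketch_tile_invalidate_tlbs() so that no GT keeps a stale PTE.
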
 drivers/gpu/drm/i915/display/intel_dsb.c      |   5 +-
 drivers/gpu/drm/i915/display/intel_fbc.c      |   3 +-
 drivers/gpu/drm/i915/display/intel_fbdev.c    |   7 +-
 drivers/gpu/drm/xe/display/xe_fb_pin.c        |   7 +-
 drivers/gpu/drm/xe/display/xe_plane_initial.c |   2 +-
 drivers/gpu/drm/xe/tests/xe_bo.c              |   2 +-
 drivers/gpu/drm/xe/tests/xe_migrate.c         |  15 +-
 drivers/gpu/drm/xe/xe_bb.c                    |   3 +-
 drivers/gpu/drm/xe/xe_bo.c                    |  66 ++++----
 drivers/gpu/drm/xe/xe_bo.h                    |  18 +--
 drivers/gpu/drm/xe/xe_bo_evict.c              |   2 +-
 drivers/gpu/drm/xe/xe_bo_types.h              |   4 +-
 drivers/gpu/drm/xe/xe_device_types.h          |   7 +
 drivers/gpu/drm/xe/xe_ggtt.c                  |   5 +-
 drivers/gpu/drm/xe/xe_gt.c                    |  21 +--
 drivers/gpu/drm/xe/xe_gt_debugfs.c            |   6 +-
 drivers/gpu/drm/xe/xe_gt_pagefault.c          |  10 +-
 drivers/gpu/drm/xe/xe_gt_types.h              |   7 -
 drivers/gpu/drm/xe/xe_guc_ads.c               |   5 +-
 drivers/gpu/drm/xe/xe_guc_ct.c                |   5 +-
 drivers/gpu/drm/xe/xe_guc_hwconfig.c          |   5 +-
 drivers/gpu/drm/xe/xe_guc_log.c               |   6 +-
 drivers/gpu/drm/xe/xe_guc_pc.c                |   5 +-
 drivers/gpu/drm/xe/xe_hw_engine.c             |   5 +-
 drivers/gpu/drm/xe/xe_lrc.c                   |  13 +-
 drivers/gpu/drm/xe/xe_lrc_types.h             |   4 +-
 drivers/gpu/drm/xe/xe_migrate.c               |  23 +--
 drivers/gpu/drm/xe/xe_migrate.h               |   5 +-
 drivers/gpu/drm/xe/xe_pt.c                    | 146 ++++++++---------
 drivers/gpu/drm/xe/xe_pt.h                    |  14 +-
 drivers/gpu/drm/xe/xe_sa.c                    |  13 +-
 drivers/gpu/drm/xe/xe_sa.h                    |   4 +-
 drivers/gpu/drm/xe/xe_tile.c                  |   7 +
 drivers/gpu/drm/xe/xe_uc_fw.c                 |   5 +-
 drivers/gpu/drm/xe/xe_vm.c                    | 152 +++++++++---------
 drivers/gpu/drm/xe/xe_vm.h                    |   2 +-
 drivers/gpu/drm/xe/xe_vm_types.h              |  12 +-
 include/uapi/drm/xe_drm.h                     |   4 +-
 38 files changed, 307 insertions(+), 318 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c
index 7c93580282b4..3830309aacf4 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.c
+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
@@ -379,9 +379,10 @@ struct intel_dsb *intel_dsb_prepare(struct intel_crtc *crtc,
 #else
 	/* ~1 qword per instruction, full cachelines */
 	size = ALIGN(max_cmds * 8, 64);
-	obj = xe_bo_create_pin_map(i915, to_gt(i915), NULL, PAGE_ALIGN(size),
+	obj = xe_bo_create_pin_map(i915, xe_device_get_root_tile(i915),
+				   NULL, PAGE_ALIGN(size),
 				   ttm_bo_type_kernel,
-				   XE_BO_CREATE_VRAM_IF_DGFX(to_gt(i915)) |
+				   XE_BO_CREATE_VRAM_IF_DGFX(xe_device_get_root_tile(i915)) |
 				   XE_BO_CREATE_GGTT_BIT);
 	if (IS_ERR(obj)) {
 		kfree(dsb);
diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c b/drivers/gpu/drm/i915/display/intel_fbc.c
index 9dc7083fe974..0e8e899f596b 100644
--- a/drivers/gpu/drm/i915/display/intel_fbc.c
+++ b/drivers/gpu/drm/i915/display/intel_fbc.c
@@ -71,7 +71,8 @@ static int i915_gem_stolen_insert_node_in_range(struct xe_device *xe, struct xe_
 	int err;
 	u32 flags = XE_BO_CREATE_PINNED_BIT | XE_BO_CREATE_STOLEN_BIT;
 
-	*bo = xe_bo_create_locked_range(xe, to_gt(xe), NULL, size, start, end,
+	*bo = xe_bo_create_locked_range(xe, xe_device_get_root_tile(xe),
+					NULL, size, start, end,
 					ttm_bo_type_kernel, flags);
 	if (IS_ERR(*bo)) {
 		err = PTR_ERR(*bo);
diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
index 6362c4ce15b6..814b89b99718 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -205,7 +205,8 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
 	}
 #else
 	if (!IS_DGFX(dev_priv)) {
-		obj = xe_bo_create_pin_map(dev_priv, to_gt(dev_priv), NULL, size,
+		obj = xe_bo_create_pin_map(dev_priv, xe_device_get_root_tile(dev_priv),
+					   NULL, size,
 					   ttm_bo_type_kernel, XE_BO_SCANOUT_BIT |
 					   XE_BO_CREATE_STOLEN_BIT |
 					   XE_BO_CREATE_PINNED_BIT);
@@ -215,9 +216,9 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
 			drm_info(&dev_priv->drm, "Allocated fbdev into stolen failed: %li\n", PTR_ERR(obj));
 	}
 	if (IS_ERR(obj)) {
-		obj = xe_bo_create_pin_map(dev_priv, to_gt(dev_priv), NULL, size,
+		obj = xe_bo_create_pin_map(dev_priv, xe_device_get_root_tile(dev_priv), NULL, size,
 					  ttm_bo_type_kernel, XE_BO_SCANOUT_BIT |
-					  XE_BO_CREATE_VRAM_IF_DGFX(to_gt(dev_priv)) |
+					  XE_BO_CREATE_VRAM_IF_DGFX(xe_device_get_root_tile(dev_priv)) |
 					  XE_BO_CREATE_PINNED_BIT);
 	}
 #endif
diff --git a/drivers/gpu/drm/xe/display/xe_fb_pin.c b/drivers/gpu/drm/xe/display/xe_fb_pin.c
index 78ac58244f24..e5999a01daa1 100644
--- a/drivers/gpu/drm/xe/display/xe_fb_pin.c
+++ b/drivers/gpu/drm/xe/display/xe_fb_pin.c
@@ -45,6 +45,7 @@ static int __xe_pin_fb_vma_dpt(struct intel_framebuffer *fb,
 			       struct i915_vma *vma)
 {
 	struct xe_device *xe = to_xe_device(fb->base.dev);
+	struct xe_tile *tile0 = xe_device_get_root_tile(xe);
 	struct xe_bo *bo = intel_fb_obj(&fb->base), *dpt;
 	u32 dpt_size, size = bo->ttm.base.size;
 
@@ -55,17 +56,17 @@ static int __xe_pin_fb_vma_dpt(struct intel_framebuffer *fb,
 		dpt_size = ALIGN(intel_rotation_info_size(&view->rotated) * 8,
 				 XE_PAGE_SIZE);
 
-	dpt = xe_bo_create_pin_map(xe, to_gt(xe), NULL, dpt_size,
+	dpt = xe_bo_create_pin_map(xe, tile0, NULL, dpt_size,
 				  ttm_bo_type_kernel,
 				  XE_BO_CREATE_VRAM0_BIT |
 				  XE_BO_CREATE_GGTT_BIT);
 	if (IS_ERR(dpt))
-		dpt = xe_bo_create_pin_map(xe, to_gt(xe), NULL, dpt_size,
+		dpt = xe_bo_create_pin_map(xe, tile0, NULL, dpt_size,
 					   ttm_bo_type_kernel,
 					   XE_BO_CREATE_STOLEN_BIT |
 					   XE_BO_CREATE_GGTT_BIT);
 	if (IS_ERR(dpt))
-		dpt = xe_bo_create_pin_map(xe, to_gt(xe), NULL, dpt_size,
+		dpt = xe_bo_create_pin_map(xe, tile0, NULL, dpt_size,
 					   ttm_bo_type_kernel,
 					   XE_BO_CREATE_SYSTEM_BIT |
 					   XE_BO_CREATE_GGTT_BIT);
diff --git a/drivers/gpu/drm/xe/display/xe_plane_initial.c b/drivers/gpu/drm/xe/display/xe_plane_initial.c
index 556ede2e459e..5e43ae9f9c4b 100644
--- a/drivers/gpu/drm/xe/display/xe_plane_initial.c
+++ b/drivers/gpu/drm/xe/display/xe_plane_initial.c
@@ -115,7 +115,7 @@ initial_plane_bo(struct xe_device *xe,
 			page_size);
 	size -= base;
 
-	bo = xe_bo_create_pin_map_at(xe, &tile0->primary_gt, NULL, size, phys_base,
+	bo = xe_bo_create_pin_map_at(xe, tile0, NULL, size, phys_base,
 				     ttm_bo_type_kernel, flags);
 	if (IS_ERR(bo)) {
 		drm_dbg(&xe->drm,
diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c
index 9bd381e5b7a6..bee5a2031153 100644
--- a/drivers/gpu/drm/xe/tests/xe_bo.c
+++ b/drivers/gpu/drm/xe/tests/xe_bo.c
@@ -173,7 +173,7 @@ static int evict_test_run_gt(struct xe_device *xe, struct xe_gt *gt, struct kuni
 {
 	struct xe_bo *bo, *external;
 	unsigned int bo_flags = XE_BO_CREATE_USER_BIT |
-		XE_BO_CREATE_VRAM_IF_DGFX(gt);
+		XE_BO_CREATE_VRAM_IF_DGFX(gt_to_tile(gt));
 	struct xe_vm *vm = xe_migrate_get_vm(xe->gt[0].migrate);
 	struct ww_acquire_ctx ww;
 	int err, i;
diff --git a/drivers/gpu/drm/xe/tests/xe_migrate.c b/drivers/gpu/drm/xe/tests/xe_migrate.c
index 0f4371ad1fd9..fe8331f116c2 100644
--- a/drivers/gpu/drm/xe/tests/xe_migrate.c
+++ b/drivers/gpu/drm/xe/tests/xe_migrate.c
@@ -240,6 +240,7 @@ static void test_pt_update(struct xe_migrate *m, struct xe_bo *pt,
 static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test)
 {
 	struct xe_gt *gt = m->gt;
+	struct xe_tile *tile = gt_to_tile(m->gt);
 	struct xe_device *xe = gt_to_xe(gt);
 	struct xe_bo *pt, *bo = m->pt_bo, *big, *tiny;
 	struct xe_res_cursor src_it;
@@ -256,18 +257,18 @@ static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test)
 		return;
 	}
 
-	big = xe_bo_create_pin_map(xe, m->gt, m->eng->vm, SZ_4M,
+	big = xe_bo_create_pin_map(xe, tile, m->eng->vm, SZ_4M,
 				   ttm_bo_type_kernel,
-				   XE_BO_CREATE_VRAM_IF_DGFX(m->gt) |
+				   XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 				   XE_BO_CREATE_PINNED_BIT);
 	if (IS_ERR(big)) {
 		KUNIT_FAIL(test, "Failed to allocate bo: %li\n", PTR_ERR(big));
 		goto vunmap;
 	}
 
-	pt = xe_bo_create_pin_map(xe, m->gt, m->eng->vm, XE_PAGE_SIZE,
+	pt = xe_bo_create_pin_map(xe, tile, m->eng->vm, XE_PAGE_SIZE,
 				  ttm_bo_type_kernel,
-				  XE_BO_CREATE_VRAM_IF_DGFX(m->gt) |
+				  XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 				  XE_BO_CREATE_PINNED_BIT);
 	if (IS_ERR(pt)) {
 		KUNIT_FAIL(test, "Failed to allocate fake pt: %li\n",
@@ -275,10 +276,10 @@ static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test)
 		goto free_big;
 	}
 
-	tiny = xe_bo_create_pin_map(xe, m->gt, m->eng->vm,
+	tiny = xe_bo_create_pin_map(xe, tile, m->eng->vm,
 				    2 * SZ_4K,
 				    ttm_bo_type_kernel,
-				    XE_BO_CREATE_VRAM_IF_DGFX(m->gt) |
+				    XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 				    XE_BO_CREATE_PINNED_BIT);
 	if (IS_ERR(tiny)) {
 		KUNIT_FAIL(test, "Failed to allocate fake pt: %li\n",
@@ -286,7 +287,7 @@ static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test)
 		goto free_pt;
 	}
 
-	bb = xe_bb_new(m->gt, 32, xe->info.supports_usm);
+	bb = xe_bb_new(gt, 32, xe->info.supports_usm);
 	if (IS_ERR(bb)) {
 		KUNIT_FAIL(test, "Failed to create batchbuffer: %li\n",
 			   PTR_ERR(bb));
diff --git a/drivers/gpu/drm/xe/xe_bb.c b/drivers/gpu/drm/xe/xe_bb.c
index bf7c94b769d7..f9b6b7adf99f 100644
--- a/drivers/gpu/drm/xe/xe_bb.c
+++ b/drivers/gpu/drm/xe/xe_bb.c
@@ -30,6 +30,7 @@ static int bb_prefetch(struct xe_gt *gt)
 
 struct xe_bb *xe_bb_new(struct xe_gt *gt, u32 dwords, bool usm)
 {
+	struct xe_tile *tile = gt_to_tile(gt);
 	struct xe_bb *bb = kmalloc(sizeof(*bb), GFP_KERNEL);
 	int err;
 
@@ -42,7 +43,7 @@ struct xe_bb *xe_bb_new(struct xe_gt *gt, u32 dwords, bool usm)
 	 * space to accomodate the platform-specific hardware prefetch
 	 * requirements.
 	 */
-	bb->bo = xe_sa_bo_new(!usm ? gt->kernel_bb_pool : gt->usm.bb_pool,
+	bb->bo = xe_sa_bo_new(!usm ? tile->mem.kernel_bb_pool : gt->usm.bb_pool,
 			      4 * (dwords + 1) + bb_prefetch(gt));
 	if (IS_ERR(bb->bo)) {
 		err = PTR_ERR(bb->bo);
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 5dbca5bbca8f..9d613fc5d309 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -452,7 +452,7 @@ static int xe_bo_trigger_rebind(struct xe_device *xe, struct xe_bo *bo,
 			}
 
 			xe_vm_assert_held(vm);
-			if (list_empty(&vma->rebind_link) && vma->gt_present)
+			if (list_empty(&vma->rebind_link) && vma->tile_present)
 				list_add_tail(&vma->rebind_link, &vm->rebind_list);
 
 			if (vm_resv_locked)
@@ -559,7 +559,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 	struct xe_bo *bo = ttm_to_xe_bo(ttm_bo);
 	struct ttm_resource *old_mem = ttm_bo->resource;
 	struct ttm_tt *ttm = ttm_bo->ttm;
-	struct xe_gt *gt = NULL;
+	struct xe_tile *tile = NULL;
 	struct dma_fence *fence;
 	bool move_lacks_source;
 	bool needs_clear;
@@ -629,15 +629,15 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 		goto out;
 	}
 
-	if (bo->gt)
-		gt = bo->gt;
+	if (bo->tile)
+		tile = bo->tile;
 	else if (resource_is_vram(new_mem))
-		gt = &mem_type_to_tile(xe, new_mem->mem_type)->primary_gt;
+		tile = mem_type_to_tile(xe, new_mem->mem_type);
 	else if (resource_is_vram(old_mem))
-		gt = &mem_type_to_tile(xe, old_mem->mem_type)->primary_gt;
+		tile = mem_type_to_tile(xe, old_mem->mem_type);
 
-	XE_BUG_ON(!gt);
-	XE_BUG_ON(!gt->migrate);
+	XE_BUG_ON(!tile);
+	XE_BUG_ON(!tile->primary_gt.migrate);
 
 	trace_xe_bo_move(bo);
 	xe_device_mem_access_get(xe);
@@ -658,7 +658,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 
 			/* Create a new VMAP once kernel BO back in VRAM */
 			if (!ret && resource_is_vram(new_mem)) {
-				void *new_addr = gt_to_tile(gt)->mem.vram.mapping +
+				void *new_addr = tile->mem.vram.mapping +
 					(new_mem->start << PAGE_SHIFT);
 
 				if (XE_WARN_ON(new_mem->start == XE_BO_INVALID_OFFSET)) {
@@ -675,9 +675,9 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 		}
 	} else {
 		if (move_lacks_source)
-			fence = xe_migrate_clear(gt->migrate, bo, new_mem);
+			fence = xe_migrate_clear(tile->primary_gt.migrate, bo, new_mem);
 		else
-			fence = xe_migrate_copy(gt->migrate, bo, old_mem, new_mem);
+			fence = xe_migrate_copy(tile->primary_gt.migrate, bo, old_mem, new_mem);
 		if (IS_ERR(fence)) {
 			ret = PTR_ERR(fence);
 			xe_device_mem_access_put(xe);
@@ -958,7 +958,7 @@ static void xe_ttm_bo_destroy(struct ttm_buffer_object *ttm_bo)
 	WARN_ON(!list_empty(&bo->vmas));
 
 	if (bo->ggtt_node.size)
-		xe_ggtt_remove_bo(gt_to_tile(bo->gt)->mem.ggtt, bo);
+		xe_ggtt_remove_bo(bo->tile->mem.ggtt, bo);
 
 	if (bo->vm && xe_bo_is_user(bo))
 		xe_vm_put(bo->vm);
@@ -1080,7 +1080,7 @@ void xe_bo_free(struct xe_bo *bo)
 }
 
 struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
-				    struct xe_gt *gt, struct dma_resv *resv,
+				    struct xe_tile *tile, struct dma_resv *resv,
 				    size_t size, enum ttm_bo_type type,
 				    u32 flags)
 {
@@ -1093,7 +1093,7 @@ struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
 	int err;
 
 	/* Only kernel objects should set GT */
-	XE_BUG_ON(gt && type != ttm_bo_type_kernel);
+	XE_BUG_ON(tile && type != ttm_bo_type_kernel);
 
 	if (XE_WARN_ON(!size))
 		return ERR_PTR(-EINVAL);
@@ -1114,7 +1114,7 @@ struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
 		alignment = SZ_4K >> PAGE_SHIFT;
 	}
 
-	bo->gt = gt;
+	bo->tile = tile;
 	bo->size = size;
 	bo->flags = flags;
 	bo->ttm.base.funcs = &xe_gem_object_funcs;
@@ -1196,7 +1196,7 @@ static int __xe_bo_fixed_placement(struct xe_device *xe,
 
 struct xe_bo *
 xe_bo_create_locked_range(struct xe_device *xe,
-			  struct xe_gt *gt, struct xe_vm *vm,
+			  struct xe_tile *tile, struct xe_vm *vm,
 			  size_t size, u64 start, u64 end,
 			  enum ttm_bo_type type, u32 flags)
 {
@@ -1219,7 +1219,7 @@ xe_bo_create_locked_range(struct xe_device *xe,
 		}
 	}
 
-	bo = __xe_bo_create_locked(xe, bo, gt, vm ? &vm->resv : NULL, size,
+	bo = __xe_bo_create_locked(xe, bo, tile, vm ? &vm->resv : NULL, size,
 				   type, flags);
 	if (IS_ERR(bo))
 		return bo;
@@ -1229,16 +1229,16 @@ xe_bo_create_locked_range(struct xe_device *xe,
 	bo->vm = vm;
 
 	if (bo->flags & XE_BO_CREATE_GGTT_BIT) {
-		if (!gt && flags & XE_BO_CREATE_STOLEN_BIT)
-			gt = xe_device_get_gt(xe, 0);
+		if (!tile && flags & XE_BO_CREATE_STOLEN_BIT)
+			tile = xe_device_get_root_tile(xe);
 
-		XE_BUG_ON(!gt);
+		XE_BUG_ON(!tile);
 
 		if (flags & XE_BO_FIXED_PLACEMENT_BIT) {
-			err = xe_ggtt_insert_bo_at(gt_to_tile(gt)->mem.ggtt, bo,
+			err = xe_ggtt_insert_bo_at(tile->mem.ggtt, bo,
 						   start + bo->size, U64_MAX);
 		} else {
-			err = xe_ggtt_insert_bo(gt_to_tile(gt)->mem.ggtt, bo);
+			err = xe_ggtt_insert_bo(tile->mem.ggtt, bo);
 		}
 		if (err)
 			goto err_unlock_put_bo;
@@ -1252,18 +1252,18 @@ xe_bo_create_locked_range(struct xe_device *xe,
 	return ERR_PTR(err);
 }
 
-struct xe_bo *xe_bo_create_locked(struct xe_device *xe, struct xe_gt *gt,
+struct xe_bo *xe_bo_create_locked(struct xe_device *xe, struct xe_tile *tile,
 				  struct xe_vm *vm, size_t size,
 				  enum ttm_bo_type type, u32 flags)
 {
-	return xe_bo_create_locked_range(xe, gt, vm, size, 0, ~0ULL, type, flags);
+	return xe_bo_create_locked_range(xe, tile, vm, size, 0, ~0ULL, type, flags);
 }
 
-struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_gt *gt,
+struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_tile *tile,
 			   struct xe_vm *vm, size_t size,
 			   enum ttm_bo_type type, u32 flags)
 {
-	struct xe_bo *bo = xe_bo_create_locked(xe, gt, vm, size, type, flags);
+	struct xe_bo *bo = xe_bo_create_locked(xe, tile, vm, size, type, flags);
 
 	if (!IS_ERR(bo))
 		xe_bo_unlock_vm_held(bo);
@@ -1271,7 +1271,7 @@ struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_gt *gt,
 	return bo;
 }
 
-struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_gt *gt,
+struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_tile *tile,
 				      struct xe_vm *vm,
 				      size_t size, u64 offset,
 				      enum ttm_bo_type type, u32 flags)
@@ -1285,7 +1285,7 @@ struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_gt *gt,
 	    xe_ttm_stolen_cpu_access_needs_ggtt(xe))
 		flags |= XE_BO_CREATE_GGTT_BIT;
 
-	bo = xe_bo_create_locked_range(xe, gt, vm, size, start, end, type, flags);
+	bo = xe_bo_create_locked_range(xe, tile, vm, size, start, end, type, flags);
 	if (IS_ERR(bo))
 		return bo;
 
@@ -1309,18 +1309,18 @@ struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_gt *gt,
 	return ERR_PTR(err);
 }
 
-struct xe_bo *xe_bo_create_pin_map(struct xe_device *xe, struct xe_gt *gt,
+struct xe_bo *xe_bo_create_pin_map(struct xe_device *xe, struct xe_tile *tile,
 				   struct xe_vm *vm, size_t size,
 				   enum ttm_bo_type type, u32 flags)
 {
-	return xe_bo_create_pin_map_at(xe, gt, vm, size, ~0ull, type, flags);
+	return xe_bo_create_pin_map_at(xe, tile, vm, size, ~0ull, type, flags);
 }
 
-struct xe_bo *xe_bo_create_from_data(struct xe_device *xe, struct xe_gt *gt,
+struct xe_bo *xe_bo_create_from_data(struct xe_device *xe, struct xe_tile *tile,
 				     const void *data, size_t size,
 				     enum ttm_bo_type type, u32 flags)
 {
-	struct xe_bo *bo = xe_bo_create_pin_map(xe, gt, NULL,
+	struct xe_bo *bo = xe_bo_create_pin_map(xe, tile, NULL,
 						ALIGN(size, PAGE_SIZE),
 						type, flags);
 	if (IS_ERR(bo))
@@ -1949,7 +1949,7 @@ int xe_bo_dumb_create(struct drm_file *file_priv,
 			   page_size);
 
 	bo = xe_bo_create(xe, NULL, NULL, args->size, ttm_bo_type_device,
-			  XE_BO_CREATE_VRAM_IF_DGFX(to_gt(xe)) |
+			  XE_BO_CREATE_VRAM_IF_DGFX(xe_device_get_root_tile(xe)) |
 			  XE_BO_CREATE_USER_BIT | XE_BO_SCANOUT_BIT);
 	if (IS_ERR(bo))
 		return PTR_ERR(bo);
diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index 7a79f3893260..ccb0fae2966e 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -21,8 +21,8 @@
 					 XE_BO_CREATE_VRAM1_BIT)
 /* -- */
 #define XE_BO_CREATE_STOLEN_BIT		BIT(4)
-#define XE_BO_CREATE_VRAM_IF_DGFX(gt) \
-	(IS_DGFX(gt_to_xe(gt)) ? XE_BO_CREATE_VRAM0_BIT << gt_to_tile(gt)->id : \
+#define XE_BO_CREATE_VRAM_IF_DGFX(tile) \
+	(IS_DGFX(tile_to_xe(tile)) ? XE_BO_CREATE_VRAM0_BIT << (tile)->id : \
 	 XE_BO_CREATE_SYSTEM_BIT)
 #define XE_BO_CREATE_GGTT_BIT		BIT(5)
 #define XE_BO_CREATE_IGNORE_MIN_PAGE_SIZE_BIT BIT(6)
@@ -80,27 +80,27 @@ struct xe_bo *xe_bo_alloc(void);
 void xe_bo_free(struct xe_bo *bo);
 
 struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
-				    struct xe_gt *gt, struct dma_resv *resv,
+				    struct xe_tile *tile, struct dma_resv *resv,
 				    size_t size, enum ttm_bo_type type,
 				    u32 flags);
 struct xe_bo *
 xe_bo_create_locked_range(struct xe_device *xe,
-			  struct xe_gt *gt, struct xe_vm *vm,
+			  struct xe_tile *tile, struct xe_vm *vm,
 			  size_t size, u64 start, u64 end,
 			  enum ttm_bo_type type, u32 flags);
-struct xe_bo *xe_bo_create_locked(struct xe_device *xe, struct xe_gt *gt,
+struct xe_bo *xe_bo_create_locked(struct xe_device *xe, struct xe_tile *tile,
 				  struct xe_vm *vm, size_t size,
 				  enum ttm_bo_type type, u32 flags);
-struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_gt *gt,
+struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_tile *tile,
 			   struct xe_vm *vm, size_t size,
 			   enum ttm_bo_type type, u32 flags);
-struct xe_bo *xe_bo_create_pin_map(struct xe_device *xe, struct xe_gt *gt,
+struct xe_bo *xe_bo_create_pin_map(struct xe_device *xe, struct xe_tile *tile,
 				   struct xe_vm *vm, size_t size,
 				   enum ttm_bo_type type, u32 flags);
-struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_gt *gt,
+struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_tile *tile,
 				      struct xe_vm *vm, size_t size, u64 offset,
 				      enum ttm_bo_type type, u32 flags);
-struct xe_bo *xe_bo_create_from_data(struct xe_device *xe, struct xe_gt *gt,
+struct xe_bo *xe_bo_create_from_data(struct xe_device *xe, struct xe_tile *tile,
 				     const void *data, size_t size,
 				     enum ttm_bo_type type, u32 flags);
 
diff --git a/drivers/gpu/drm/xe/xe_bo_evict.c b/drivers/gpu/drm/xe/xe_bo_evict.c
index a72963c54bf3..9226195bd560 100644
--- a/drivers/gpu/drm/xe/xe_bo_evict.c
+++ b/drivers/gpu/drm/xe/xe_bo_evict.c
@@ -149,7 +149,7 @@ int xe_bo_restore_kernel(struct xe_device *xe)
 		}
 
 		if (bo->flags & XE_BO_CREATE_GGTT_BIT) {
-			struct xe_tile *tile = gt_to_tile(bo->gt);
+			struct xe_tile *tile = bo->tile;
 
 			mutex_lock(&tile->mem.ggtt->lock);
 			xe_ggtt_map_bo(tile->mem.ggtt, bo);
diff --git a/drivers/gpu/drm/xe/xe_bo_types.h b/drivers/gpu/drm/xe/xe_bo_types.h
index 06de3330211d..f6ee920303af 100644
--- a/drivers/gpu/drm/xe/xe_bo_types.h
+++ b/drivers/gpu/drm/xe/xe_bo_types.h
@@ -29,8 +29,8 @@ struct xe_bo {
 	u32 flags;
 	/** @vm: VM this BO is attached to, for extobj this will be NULL */
 	struct xe_vm *vm;
-	/** @gt: GT this BO is attached to (kernel BO only) */
-	struct xe_gt *gt;
+	/** @tile: Tile this BO is attached to (kernel BO only) */
+	struct xe_tile *tile;
 	/** @vmas: List of VMAs for this BO */
 	struct list_head vmas;
 	/** @placements: valid placements for this BO */
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 6b9e7847161c..c6365b6f14ba 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -131,6 +131,13 @@ struct xe_tile {
 
 		/** @ggtt: Global graphics translation table */
 		struct xe_ggtt *ggtt;
+
+		/**
+		 * @kernel_bb_pool: Pool from which batchbuffers are allocated.
+		 *
+		 * Media GT shares a pool with its primary GT.
+		 */
+		struct xe_sa_manager *kernel_bb_pool;
 	} mem;
 };
 
diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index 52d293d61cc0..b11f22b68bb8 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -149,7 +149,6 @@ static void xe_ggtt_initial_clear(struct xe_ggtt *ggtt)
 int xe_ggtt_init(struct xe_ggtt *ggtt)
 {
 	struct xe_device *xe = tile_to_xe(ggtt->tile);
-	struct xe_gt *gt = &ggtt->tile->primary_gt;
 	unsigned int flags;
 	int err;
 
@@ -162,9 +161,9 @@ int xe_ggtt_init(struct xe_ggtt *ggtt)
 	if (ggtt->flags & XE_GGTT_FLAGS_64K)
 		flags |= XE_BO_CREATE_SYSTEM_BIT;
 	else
-		flags |= XE_BO_CREATE_VRAM_IF_DGFX(gt);
+		flags |= XE_BO_CREATE_VRAM_IF_DGFX(ggtt->tile);
 
-	ggtt->scratch = xe_bo_create_pin_map(xe, gt, NULL, XE_PAGE_SIZE,
+	ggtt->scratch = xe_bo_create_pin_map(xe, ggtt->tile, NULL, XE_PAGE_SIZE,
 					     ttm_bo_type_kernel,
 					     flags);
 
diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 1e424ce8ef3e..d769bc93d15c 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -95,7 +95,7 @@ static int emit_nop_job(struct xe_gt *gt, struct xe_engine *e)
 	if (IS_ERR(bb))
 		return PTR_ERR(bb);
 
-	batch_ofs = xe_bo_ggtt_addr(gt->kernel_bb_pool->bo);
+	batch_ofs = xe_bo_ggtt_addr(gt_to_tile(gt)->mem.kernel_bb_pool->bo);
 	job = xe_bb_create_wa_job(e, bb, batch_ofs);
 	if (IS_ERR(job)) {
 		xe_bb_free(bb, NULL);
@@ -144,7 +144,7 @@ static int emit_wa_job(struct xe_gt *gt, struct xe_engine *e)
 		}
 	}
 
-	batch_ofs = xe_bo_ggtt_addr(gt->kernel_bb_pool->bo);
+	batch_ofs = xe_bo_ggtt_addr(gt_to_tile(gt)->mem.kernel_bb_pool->bo);
 	job = xe_bb_create_wa_job(e, bb, batch_ofs);
 	if (IS_ERR(job)) {
 		xe_bb_free(bb, NULL);
@@ -364,31 +364,16 @@ static int all_fw_domain_init(struct xe_gt *gt)
 		goto err_force_wake;
 
 	if (!xe_gt_is_media_type(gt)) {
-		gt->kernel_bb_pool = xe_sa_bo_manager_init(gt, SZ_1M, 16);
-		if (IS_ERR(gt->kernel_bb_pool)) {
-			err = PTR_ERR(gt->kernel_bb_pool);
-			goto err_force_wake;
-		}
-
 		/*
 		 * USM has its only SA pool to non-block behind user operations
 		 */
 		if (gt_to_xe(gt)->info.supports_usm) {
-			gt->usm.bb_pool = xe_sa_bo_manager_init(gt, SZ_1M, 16);
+			gt->usm.bb_pool = xe_sa_bo_manager_init(gt_to_tile(gt), SZ_1M, 16);
 			if (IS_ERR(gt->usm.bb_pool)) {
 				err = PTR_ERR(gt->usm.bb_pool);
 				goto err_force_wake;
 			}
 		}
-	} else {
-		struct xe_gt *full_gt = xe_find_full_gt(gt);
-
-		/*
-		 * Media GT's kernel_bb_pool is only used while recording the
-		 * default context during GT init.  The USM pool should never
-		 * be needed on the media GT.
-		 */
-		gt->kernel_bb_pool = full_gt->kernel_bb_pool;
 	}
 
 	if (!xe_gt_is_media_type(gt)) {
diff --git a/drivers/gpu/drm/xe/xe_gt_debugfs.c b/drivers/gpu/drm/xe/xe_gt_debugfs.c
index b71b584c9bdc..9c72849f07a2 100644
--- a/drivers/gpu/drm/xe/xe_gt_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_gt_debugfs.c
@@ -63,11 +63,11 @@ static int force_reset(struct seq_file *m, void *data)
 
 static int sa_info(struct seq_file *m, void *data)
 {
-	struct xe_gt *gt = node_to_gt(m->private);
+	struct xe_tile *tile = gt_to_tile(node_to_gt(m->private));
 	struct drm_printer p = drm_seq_file_printer(m);
 
-	drm_suballoc_dump_debug_info(&gt->kernel_bb_pool->base, &p,
-				     gt->kernel_bb_pool->gpu_addr);
+	drm_suballoc_dump_debug_info(&tile->mem.kernel_bb_pool->base, &p,
+				     tile->mem.kernel_bb_pool->gpu_addr);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
index f4f3d95ae6b1..1c2b23ae89cf 100644
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
@@ -71,8 +71,8 @@ static bool access_is_atomic(enum access_type access_type)
 
 static bool vma_is_valid(struct xe_gt *gt, struct xe_vma *vma)
 {
-	return BIT(gt->info.id) & vma->gt_present &&
-		!(BIT(gt->info.id) & vma->usm.gt_invalidated);
+	return BIT(gt_to_tile(gt)->id) & vma->tile_present &&
+		!(BIT(gt_to_tile(gt)->id) & vma->usm.tile_invalidated);
 }
 
 static bool vma_matches(struct xe_vma *vma, struct xe_vma *lookup)
@@ -208,8 +208,8 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 
 	/* Bind VMA only to the GT that has faulted */
 	trace_xe_vma_pf_bind(vma);
-	fence = __xe_pt_bind_vma(gt, vma, xe_gt_migrate_engine(gt), NULL, 0,
-				 vma->gt_present & BIT(gt->info.id));
+	fence = __xe_pt_bind_vma(tile, vma, xe_gt_migrate_engine(gt), NULL, 0,
+				 vma->tile_present & BIT(tile->id));
 	if (IS_ERR(fence)) {
 		ret = PTR_ERR(fence);
 		goto unlock_dma_resv;
@@ -225,7 +225,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 
 	if (xe_vma_is_userptr(vma))
 		ret = xe_vma_userptr_check_repin(vma);
-	vma->usm.gt_invalidated &= ~BIT(gt->info.id);
+	vma->usm.tile_invalidated &= ~BIT(gt_to_tile(gt)->id);
 
 unlock_dma_resv:
 	if (only_needs_bo_lock(bo))
diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
index bb5271277e3b..6e239ce738c1 100644
--- a/drivers/gpu/drm/xe/xe_gt_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_types.h
@@ -278,13 +278,6 @@ struct xe_gt {
 	/** @hw_engines: hardware engines on the GT */
 	struct xe_hw_engine hw_engines[XE_NUM_HW_ENGINES];
 
-	/**
-	 * @kernel_bb_pool: Pool from which batchbuffers are allocated.
-	 *
-	 * Media GT shares a pool with its primary GT.
-	 */
-	struct xe_sa_manager *kernel_bb_pool;
-
 	/** @migrate: Migration helper for vram blits and clearing */
 	struct xe_migrate *migrate;
 
diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c
index 6d550d746909..dd69d097b920 100644
--- a/drivers/gpu/drm/xe/xe_guc_ads.c
+++ b/drivers/gpu/drm/xe/xe_guc_ads.c
@@ -273,16 +273,17 @@ int xe_guc_ads_init(struct xe_guc_ads *ads)
 {
 	struct xe_device *xe = ads_to_xe(ads);
 	struct xe_gt *gt = ads_to_gt(ads);
+	struct xe_tile *tile = gt_to_tile(gt);
 	struct xe_bo *bo;
 	int err;
 
 	ads->golden_lrc_size = calculate_golden_lrc_size(ads);
 	ads->regset_size = calculate_regset_size(gt);
 
-	bo = xe_bo_create_pin_map(xe, gt, NULL, guc_ads_size(ads) +
+	bo = xe_bo_create_pin_map(xe, tile, NULL, guc_ads_size(ads) +
 				  MAX_GOLDEN_LRC_SIZE,
 				  ttm_bo_type_kernel,
-				  XE_BO_CREATE_VRAM_IF_DGFX(gt) |
+				  XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 				  XE_BO_CREATE_GGTT_BIT);
 	if (IS_ERR(bo))
 		return PTR_ERR(bo);
diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
index 9055ff133a7c..01e6ff405549 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct.c
+++ b/drivers/gpu/drm/xe/xe_guc_ct.c
@@ -130,6 +130,7 @@ int xe_guc_ct_init(struct xe_guc_ct *ct)
 {
 	struct xe_device *xe = ct_to_xe(ct);
 	struct xe_gt *gt = ct_to_gt(ct);
+	struct xe_tile *tile = gt_to_tile(gt);
 	struct xe_bo *bo;
 	int err;
 
@@ -145,9 +146,9 @@ int xe_guc_ct_init(struct xe_guc_ct *ct)
 
 	primelockdep(ct);
 
-	bo = xe_bo_create_pin_map(xe, gt, NULL, guc_ct_size(),
+	bo = xe_bo_create_pin_map(xe, tile, NULL, guc_ct_size(),
 				  ttm_bo_type_kernel,
-				  XE_BO_CREATE_VRAM_IF_DGFX(gt) |
+				  XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 				  XE_BO_CREATE_GGTT_BIT);
 	if (IS_ERR(bo))
 		return PTR_ERR(bo);
diff --git a/drivers/gpu/drm/xe/xe_guc_hwconfig.c b/drivers/gpu/drm/xe/xe_guc_hwconfig.c
index a6982f323ed1..c8f875e970ab 100644
--- a/drivers/gpu/drm/xe/xe_guc_hwconfig.c
+++ b/drivers/gpu/drm/xe/xe_guc_hwconfig.c
@@ -70,6 +70,7 @@ int xe_guc_hwconfig_init(struct xe_guc *guc)
 {
 	struct xe_device *xe = guc_to_xe(guc);
 	struct xe_gt *gt = guc_to_gt(guc);
+	struct xe_tile *tile = gt_to_tile(gt);
 	struct xe_bo *bo;
 	u32 size;
 	int err;
@@ -94,9 +95,9 @@ int xe_guc_hwconfig_init(struct xe_guc *guc)
 	if (!size)
 		return -EINVAL;
 
-	bo = xe_bo_create_pin_map(xe, gt, NULL, PAGE_ALIGN(size),
+	bo = xe_bo_create_pin_map(xe, tile, NULL, PAGE_ALIGN(size),
 				  ttm_bo_type_kernel,
-				  XE_BO_CREATE_VRAM_IF_DGFX(gt) |
+				  XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 				  XE_BO_CREATE_GGTT_BIT);
 	if (IS_ERR(bo))
 		return PTR_ERR(bo);
diff --git a/drivers/gpu/drm/xe/xe_guc_log.c b/drivers/gpu/drm/xe/xe_guc_log.c
index 9a7b5d5906c1..403aaafcaba6 100644
--- a/drivers/gpu/drm/xe/xe_guc_log.c
+++ b/drivers/gpu/drm/xe/xe_guc_log.c
@@ -87,13 +87,13 @@ static void guc_log_fini(struct drm_device *drm, void *arg)
 int xe_guc_log_init(struct xe_guc_log *log)
 {
 	struct xe_device *xe = log_to_xe(log);
-	struct xe_gt *gt = log_to_gt(log);
+	struct xe_tile *tile = gt_to_tile(log_to_gt(log));
 	struct xe_bo *bo;
 	int err;
 
-	bo = xe_bo_create_pin_map(xe, gt, NULL, guc_log_size(),
+	bo = xe_bo_create_pin_map(xe, tile, NULL, guc_log_size(),
 				  ttm_bo_type_kernel,
-				  XE_BO_CREATE_VRAM_IF_DGFX(gt) |
+				  XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 				  XE_BO_CREATE_GGTT_BIT);
 	if (IS_ERR(bo))
 		return PTR_ERR(bo);
diff --git a/drivers/gpu/drm/xe/xe_guc_pc.c b/drivers/gpu/drm/xe/xe_guc_pc.c
index e799faa1c6b8..67faa9ee0006 100644
--- a/drivers/gpu/drm/xe/xe_guc_pc.c
+++ b/drivers/gpu/drm/xe/xe_guc_pc.c
@@ -888,6 +888,7 @@ static void pc_fini(struct drm_device *drm, void *arg)
 int xe_guc_pc_init(struct xe_guc_pc *pc)
 {
 	struct xe_gt *gt = pc_to_gt(pc);
+	struct xe_tile *tile = gt_to_tile(gt);
 	struct xe_device *xe = gt_to_xe(gt);
 	struct xe_bo *bo;
 	u32 size = PAGE_ALIGN(sizeof(struct slpc_shared_data));
@@ -895,9 +896,9 @@ int xe_guc_pc_init(struct xe_guc_pc *pc)
 
 	mutex_init(&pc->freq_lock);
 
-	bo = xe_bo_create_pin_map(xe, gt, NULL, size,
+	bo = xe_bo_create_pin_map(xe, tile, NULL, size,
 				  ttm_bo_type_kernel,
-				  XE_BO_CREATE_VRAM_IF_DGFX(gt) |
+				  XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 				  XE_BO_CREATE_GGTT_BIT);
 
 	if (IS_ERR(bo))
diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
index 751f6c3bba17..fe8af54ea8bd 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -371,6 +371,7 @@ static int hw_engine_init(struct xe_gt *gt, struct xe_hw_engine *hwe,
 			  enum xe_hw_engine_id id)
 {
 	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_tile *tile = gt_to_tile(gt);
 	int err;
 
 	XE_BUG_ON(id >= ARRAY_SIZE(engine_infos) || !engine_infos[id].name);
@@ -379,8 +380,8 @@ static int hw_engine_init(struct xe_gt *gt, struct xe_hw_engine *hwe,
 	xe_reg_sr_apply_mmio(&hwe->reg_sr, gt);
 	xe_reg_sr_apply_whitelist(&hwe->reg_whitelist, hwe->mmio_base, gt);
 
-	hwe->hwsp = xe_bo_create_pin_map(xe, gt, NULL, SZ_4K, ttm_bo_type_kernel,
-					 XE_BO_CREATE_VRAM_IF_DGFX(gt) |
+	hwe->hwsp = xe_bo_create_pin_map(xe, tile, NULL, SZ_4K, ttm_bo_type_kernel,
+					 XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 					 XE_BO_CREATE_GGTT_BIT);
 	if (IS_ERR(hwe->hwsp)) {
 		err = PTR_ERR(hwe->hwsp);
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index ae605e7805de..8f25a38f36a5 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -592,7 +592,7 @@ static void *empty_lrc_data(struct xe_hw_engine *hwe)
 
 static void xe_lrc_set_ppgtt(struct xe_lrc *lrc, struct xe_vm *vm)
 {
-	u64 desc = xe_vm_pdp4_descriptor(vm, lrc->full_gt);
+	u64 desc = xe_vm_pdp4_descriptor(vm, lrc->tile);
 
 	xe_lrc_write_ctx_reg(lrc, CTX_PDP0_UDW, upper_32_bits(desc));
 	xe_lrc_write_ctx_reg(lrc, CTX_PDP0_LDW, lower_32_bits(desc));
@@ -607,6 +607,7 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 		struct xe_engine *e, struct xe_vm *vm, u32 ring_size)
 {
 	struct xe_gt *gt = hwe->gt;
+	struct xe_tile *tile = gt_to_tile(gt);
 	struct xe_device *xe = gt_to_xe(gt);
 	struct iosys_map map;
 	void *init_data = NULL;
@@ -619,19 +620,15 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 	 * FIXME: Perma-pinning LRC as we don't yet support moving GGTT address
 	 * via VM bind calls.
 	 */
-	lrc->bo = xe_bo_create_pin_map(xe, hwe->gt, vm,
+	lrc->bo = xe_bo_create_pin_map(xe, tile, vm,
 				      ring_size + xe_lrc_size(xe, hwe->class),
 				      ttm_bo_type_kernel,
-				      XE_BO_CREATE_VRAM_IF_DGFX(hwe->gt) |
+				      XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 				      XE_BO_CREATE_GGTT_BIT);
 	if (IS_ERR(lrc->bo))
 		return PTR_ERR(lrc->bo);
 
-	if (xe_gt_is_media_type(hwe->gt))
-		lrc->full_gt = xe_find_full_gt(hwe->gt);
-	else
-		lrc->full_gt = hwe->gt;
-
+	lrc->tile = tile;
 	lrc->ring.size = ring_size;
 	lrc->ring.tail = 0;
 
diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h
index 8fe08535873d..78220336062c 100644
--- a/drivers/gpu/drm/xe/xe_lrc_types.h
+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
@@ -20,8 +20,8 @@ struct xe_lrc {
 	 */
 	struct xe_bo *bo;
 
-	/** @full_gt: full GT which this LRC belongs to */
-	struct xe_gt *full_gt;
+	/** @tile: tile which this LRC belongs to */
+	struct xe_tile *tile;
 
 	/** @flags: LRC flags */
 	u32 flags;
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index f40f47ccb76f..031a0bde5585 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -129,6 +129,7 @@ static u64 xe_migrate_vram_ofs(u64 addr)
 static int xe_migrate_create_cleared_bo(struct xe_migrate *m, struct xe_vm *vm)
 {
 	struct xe_gt *gt = m->gt;
+	struct xe_tile *tile = gt_to_tile(gt);
 	struct xe_device *xe = vm->xe;
 	size_t cleared_size;
 	u64 vram_addr;
@@ -139,9 +140,9 @@ static int xe_migrate_create_cleared_bo(struct xe_migrate *m, struct xe_vm *vm)
 
 	cleared_size = xe_device_ccs_bytes(xe, MAX_PREEMPTDISABLE_TRANSFER);
 	cleared_size = PAGE_ALIGN(cleared_size);
-	m->cleared_bo = xe_bo_create_pin_map(xe, gt, vm, cleared_size,
+	m->cleared_bo = xe_bo_create_pin_map(xe, tile, vm, cleared_size,
 					     ttm_bo_type_kernel,
-					     XE_BO_CREATE_VRAM_IF_DGFX(gt) |
+					     XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 					     XE_BO_CREATE_PINNED_BIT);
 	if (IS_ERR(m->cleared_bo))
 		return PTR_ERR(m->cleared_bo);
@@ -161,7 +162,8 @@ static int xe_migrate_prepare_vm(struct xe_gt *gt, struct xe_migrate *m,
 	u32 num_entries = NUM_PT_SLOTS, num_level = vm->pt_root[id]->level;
 	u32 map_ofs, level, i;
 	struct xe_device *xe = gt_to_xe(m->gt);
-	struct xe_bo *bo, *batch = gt->kernel_bb_pool->bo;
+	struct xe_tile *tile = gt_to_tile(m->gt);
+	struct xe_bo *bo, *batch = tile->mem.kernel_bb_pool->bo;
 	u64 entry;
 	int ret;
 
@@ -175,10 +177,10 @@ static int xe_migrate_prepare_vm(struct xe_gt *gt, struct xe_migrate *m,
 	/* Need to be sure everything fits in the first PT, or create more */
 	XE_BUG_ON(m->batch_base_ofs + batch->size >= SZ_2M);
 
-	bo = xe_bo_create_pin_map(vm->xe, m->gt, vm,
+	bo = xe_bo_create_pin_map(vm->xe, tile, vm,
 				  num_entries * XE_PAGE_SIZE,
 				  ttm_bo_type_kernel,
-				  XE_BO_CREATE_VRAM_IF_DGFX(m->gt) |
+				  XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 				  XE_BO_CREATE_PINNED_BIT);
 	if (IS_ERR(bo))
 		return PTR_ERR(bo);
@@ -964,7 +966,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
 	return fence;
 }
 
-static void write_pgtable(struct xe_gt *gt, struct xe_bb *bb, u64 ppgtt_ofs,
+static void write_pgtable(struct xe_tile *tile, struct xe_bb *bb, u64 ppgtt_ofs,
 			  const struct xe_vm_pgtable_update *update,
 			  struct xe_migrate_pt_update *pt_update)
 {
@@ -1003,7 +1005,7 @@ static void write_pgtable(struct xe_gt *gt, struct xe_bb *bb, u64 ppgtt_ofs,
 			(chunk * 2 + 1);
 		bb->cs[bb->len++] = lower_32_bits(addr);
 		bb->cs[bb->len++] = upper_32_bits(addr);
-		ops->populate(pt_update, gt, NULL, bb->cs + bb->len, ofs, chunk,
+		ops->populate(pt_update, tile, NULL, bb->cs + bb->len, ofs, chunk,
 			      update);
 
 		bb->len += chunk * 2;
@@ -1061,7 +1063,7 @@ xe_migrate_update_pgtables_cpu(struct xe_migrate *m,
 	for (i = 0; i < num_updates; i++) {
 		const struct xe_vm_pgtable_update *update = &updates[i];
 
-		ops->populate(pt_update, m->gt, &update->pt_bo->vmap, NULL,
+		ops->populate(pt_update, gt_to_tile(m->gt), &update->pt_bo->vmap, NULL,
 			      update->ofs, update->qwords, update);
 	}
 
@@ -1129,6 +1131,7 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
 {
 	const struct xe_migrate_pt_update_ops *ops = pt_update->ops;
 	struct xe_gt *gt = m->gt;
+	struct xe_tile *tile = gt_to_tile(m->gt);
 	struct xe_device *xe = gt_to_xe(gt);
 	struct xe_sched_job *job;
 	struct dma_fence *fence;
@@ -1223,7 +1226,7 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
 		addr = xe_migrate_vm_addr(ppgtt_ofs, 0) +
 			(page_ofs / sizeof(u64)) * XE_PAGE_SIZE;
 		for (i = 0; i < num_updates; i++)
-			write_pgtable(m->gt, bb, addr + i * XE_PAGE_SIZE,
+			write_pgtable(tile, bb, addr + i * XE_PAGE_SIZE,
 				      &updates[i], pt_update);
 	} else {
 		/* phys pages, no preamble required */
@@ -1233,7 +1236,7 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
 		/* Preemption is enabled again by the ring ops. */
 		emit_arb_clear(bb);
 		for (i = 0; i < num_updates; i++)
-			write_pgtable(m->gt, bb, 0, &updates[i], pt_update);
+			write_pgtable(tile, bb, 0, &updates[i], pt_update);
 	}
 
 	if (!eng)
diff --git a/drivers/gpu/drm/xe/xe_migrate.h b/drivers/gpu/drm/xe/xe_migrate.h
index 1ff6e0a90de5..e07b2a8845c0 100644
--- a/drivers/gpu/drm/xe/xe_migrate.h
+++ b/drivers/gpu/drm/xe/xe_migrate.h
@@ -19,6 +19,7 @@ struct xe_migrate;
 struct xe_migrate_pt_update;
 struct xe_sync_entry;
 struct xe_pt;
+struct xe_tile;
 struct xe_vm;
 struct xe_vm_pgtable_update;
 struct xe_vma;
@@ -31,7 +32,7 @@ struct xe_migrate_pt_update_ops {
 	/**
 	 * @populate: Populate a command buffer or page-table with ptes.
 	 * @pt_update: Embeddable callback argument.
-	 * @gt: The gt for the current operation.
+	 * @tile: The tile for the current operation.
 	 * @map: struct iosys_map into the memory to be populated.
 	 * @pos: If @map is NULL, map into the memory to be populated.
 	 * @ofs: qword offset into @map, unused if @map is NULL.
@@ -43,7 +44,7 @@ struct xe_migrate_pt_update_ops {
 	 * page-tables with PTEs.
 	 */
 	void (*populate)(struct xe_migrate_pt_update *pt_update,
-			 struct xe_gt *gt, struct iosys_map *map,
+			 struct xe_tile *tile, struct iosys_map *map,
 			 void *pos, u32 ofs, u32 num_qwords,
 			 const struct xe_vm_pgtable_update *update);
 
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index ad42a21c0e22..ea68e6b38133 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -165,12 +165,10 @@ u64 gen8_pte_encode(struct xe_vma *vma, struct xe_bo *bo,
 	return __gen8_pte_encode(pte, cache, flags, pt_level);
 }
 
-static u64 __xe_pt_empty_pte(struct xe_gt *gt, struct xe_vm *vm,
+static u64 __xe_pt_empty_pte(struct xe_tile *tile, struct xe_vm *vm,
 			     unsigned int level)
 {
-	u8 id = gt->info.id;
-
-	XE_BUG_ON(xe_gt_is_media_type(gt));
+	u8 id = tile->id;
 
 	if (!vm->scratch_bo[id])
 		return 0;
@@ -189,7 +187,7 @@ static u64 __xe_pt_empty_pte(struct xe_gt *gt, struct xe_vm *vm,
 /**
  * xe_pt_create() - Create a page-table.
  * @vm: The vm to create for.
- * @gt: The gt to create for.
+ * @tile: The tile to create for.
  * @level: The page-table level.
  *
  * Allocate and initialize a single struct xe_pt metadata structure. Also
@@ -201,7 +199,7 @@ static u64 __xe_pt_empty_pte(struct xe_gt *gt, struct xe_vm *vm,
  * Return: A valid struct xe_pt pointer on success, Pointer error code on
  * error.
  */
-struct xe_pt *xe_pt_create(struct xe_vm *vm, struct xe_gt *gt,
+struct xe_pt *xe_pt_create(struct xe_vm *vm, struct xe_tile *tile,
 			   unsigned int level)
 {
 	struct xe_pt *pt;
@@ -215,9 +213,9 @@ struct xe_pt *xe_pt_create(struct xe_vm *vm, struct xe_gt *gt,
 	if (!pt)
 		return ERR_PTR(-ENOMEM);
 
-	bo = xe_bo_create_pin_map(vm->xe, gt, vm, SZ_4K,
+	bo = xe_bo_create_pin_map(vm->xe, tile, vm, SZ_4K,
 				  ttm_bo_type_kernel,
-				  XE_BO_CREATE_VRAM_IF_DGFX(gt) |
+				  XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 				  XE_BO_CREATE_IGNORE_MIN_PAGE_SIZE_BIT |
 				  XE_BO_CREATE_PINNED_BIT);
 	if (IS_ERR(bo)) {
@@ -240,30 +238,28 @@ struct xe_pt *xe_pt_create(struct xe_vm *vm, struct xe_gt *gt,
 /**
  * xe_pt_populate_empty() - Populate a page-table bo with scratch- or zero
  * entries.
- * @gt: The gt the scratch pagetable of which to use.
+ * @tile: The tile the scratch pagetable of which to use.
  * @vm: The vm we populate for.
  * @pt: The pagetable the bo of which to initialize.
  *
- * Populate the page-table bo of @pt with entries pointing into the gt's
+ * Populate the page-table bo of @pt with entries pointing into the tile's
  * scratch page-table tree if any. Otherwise populate with zeros.
  */
-void xe_pt_populate_empty(struct xe_gt *gt, struct xe_vm *vm,
+void xe_pt_populate_empty(struct xe_tile *tile, struct xe_vm *vm,
 			  struct xe_pt *pt)
 {
 	struct iosys_map *map = &pt->bo->vmap;
 	u64 empty;
 	int i;
 
-	XE_BUG_ON(xe_gt_is_media_type(gt));
-
-	if (!vm->scratch_bo[gt->info.id]) {
+	if (!vm->scratch_bo[tile->id]) {
 		/*
 		 * FIXME: Some memory is allocated already allocated to zero?
 		 * Find out which memory that is and avoid this memset...
 		 */
 		xe_map_memset(vm->xe, map, 0, 0, SZ_4K);
 	} else {
-		empty = __xe_pt_empty_pte(gt, vm, pt->level);
+		empty = __xe_pt_empty_pte(tile, vm, pt->level);
 		for (i = 0; i < XE_PDES; i++)
 			xe_pt_write(vm->xe, map, i, empty);
 	}
@@ -317,9 +313,9 @@ void xe_pt_destroy(struct xe_pt *pt, u32 flags, struct llist_head *deferred)
 
 /**
  * xe_pt_create_scratch() - Setup a scratch memory pagetable tree for the
- * given gt and vm.
+ * given tile and vm.
  * @xe: xe device.
- * @gt: gt to set up for.
+ * @tile: tile to set up for.
  * @vm: vm to set up for.
  *
  * Sets up a pagetable tree with one page-table per level and a single
@@ -328,10 +324,10 @@ void xe_pt_destroy(struct xe_pt *pt, u32 flags, struct llist_head *deferred)
  *
  * Return: 0 on success, negative error code on error.
  */
-int xe_pt_create_scratch(struct xe_device *xe, struct xe_gt *gt,
+int xe_pt_create_scratch(struct xe_device *xe, struct xe_tile *tile,
 			 struct xe_vm *vm)
 {
-	u8 id = gt->info.id;
+	u8 id = tile->id;
 	unsigned int flags;
 	int i;
 
@@ -344,9 +340,9 @@ int xe_pt_create_scratch(struct xe_device *xe, struct xe_gt *gt,
 	if (vm->flags & XE_VM_FLAGS_64K)
 		flags |= XE_BO_CREATE_SYSTEM_BIT;
 	else
-		flags |= XE_BO_CREATE_VRAM_IF_DGFX(gt);
+		flags |= XE_BO_CREATE_VRAM_IF_DGFX(tile);
 
-	vm->scratch_bo[id] = xe_bo_create_pin_map(xe, gt, vm, SZ_4K,
+	vm->scratch_bo[id] = xe_bo_create_pin_map(xe, tile, vm, SZ_4K,
 						  ttm_bo_type_kernel,
 						  flags);
 	if (IS_ERR(vm->scratch_bo[id]))
@@ -356,11 +352,11 @@ int xe_pt_create_scratch(struct xe_device *xe, struct xe_gt *gt,
 		      vm->scratch_bo[id]->size);
 
 	for (i = 0; i < vm->pt_root[id]->level; i++) {
-		vm->scratch_pt[id][i] = xe_pt_create(vm, gt, i);
+		vm->scratch_pt[id][i] = xe_pt_create(vm, tile, i);
 		if (IS_ERR(vm->scratch_pt[id][i]))
 			return PTR_ERR(vm->scratch_pt[id][i]);
 
-		xe_pt_populate_empty(gt, vm, vm->scratch_pt[id][i]);
+		xe_pt_populate_empty(tile, vm, vm->scratch_pt[id][i]);
 	}
 
 	return 0;
@@ -409,8 +405,8 @@ struct xe_pt_stage_bind_walk {
 	/* Input parameters for the walk */
 	/** @vm: The vm we're building for. */
 	struct xe_vm *vm;
-	/** @gt: The gt we're building for. */
-	struct xe_gt *gt;
+	/** @tile: The tile we're building for. */
+	struct xe_tile *tile;
 	/** @cache: Desired cache level for the ptes */
 	enum xe_cache_level cache;
 	/** @default_pte: PTE flag only template. No address is associated */
@@ -678,7 +674,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
 	if (covers || !*child) {
 		u64 flags = 0;
 
-		xe_child = xe_pt_create(xe_walk->vm, xe_walk->gt, level - 1);
+		xe_child = xe_pt_create(xe_walk->vm, xe_walk->tile, level - 1);
 		if (IS_ERR(xe_child))
 			return PTR_ERR(xe_child);
 
@@ -686,7 +682,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
 			       round_down(addr, 1ull << walk->shifts[level]));
 
 		if (!covers)
-			xe_pt_populate_empty(xe_walk->gt, xe_walk->vm, xe_child);
+			xe_pt_populate_empty(xe_walk->tile, xe_walk->vm, xe_child);
 
 		*child = &xe_child->base;
 
@@ -695,7 +691,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
 		 * TODO: Suballocate the pt bo to avoid wasting a lot of
 		 * memory.
 		 */
-		if (GRAPHICS_VERx100(gt_to_xe(xe_walk->gt)) >= 1250 && level == 1 &&
+		if (GRAPHICS_VERx100(tile_to_xe(xe_walk->tile)) >= 1250 && level == 1 &&
 		    covers && xe_pt_scan_64K(addr, next, xe_walk)) {
 			walk->shifts = xe_compact_pt_shifts;
 			flags |= XE_PDE_64K;
@@ -718,7 +714,7 @@ static const struct xe_pt_walk_ops xe_pt_stage_bind_ops = {
 /**
  * xe_pt_stage_bind() - Build a disconnected page-table tree for a given address
  * range.
- * @gt: The gt we're building for.
+ * @tile: The tile we're building for.
  * @vma: The vma indicating the address range.
  * @entries: Storage for the update entries used for connecting the tree to
  * the main tree at commit time.
@@ -734,7 +730,7 @@ static const struct xe_pt_walk_ops xe_pt_stage_bind_ops = {
  * Return 0 on success, negative error code on error.
  */
 static int
-xe_pt_stage_bind(struct xe_gt *gt, struct xe_vma *vma,
+xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
 		 struct xe_vm_pgtable_update *entries, u32 *num_entries)
 {
 	struct xe_bo *bo = vma->bo;
@@ -747,14 +743,14 @@ xe_pt_stage_bind(struct xe_gt *gt, struct xe_vma *vma,
 			.max_level = XE_PT_HIGHEST_LEVEL,
 		},
 		.vm = vma->vm,
-		.gt = gt,
+		.tile = tile,
 		.curs = &curs,
 		.va_curs_start = vma->start,
 		.pte_flags = vma->pte_flags,
 		.wupd.entries = entries,
 		.needs_64K = (vma->vm->flags & XE_VM_FLAGS_64K) && is_vram,
 	};
-	struct xe_pt *pt = vma->vm->pt_root[gt->info.id];
+	struct xe_pt *pt = vma->vm->pt_root[tile->id];
 	int ret;
 
 	if (is_vram) {
@@ -764,7 +760,7 @@ xe_pt_stage_bind(struct xe_gt *gt, struct xe_vma *vma,
 		if (vma && vma->use_atomic_access_pte_bit)
 			xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE;
 		xe_walk.dma_offset = bo_tile->mem.vram.io_start -
-			gt_to_xe(gt)->mem.vram.io_start;
+			tile_to_xe(tile)->mem.vram.io_start;
 		xe_walk.cache = XE_CACHE_WB;
 	} else {
 		if (!xe_vma_is_userptr(vma) && bo->flags & XE_BO_SCANOUT_BIT)
@@ -851,8 +847,8 @@ struct xe_pt_zap_ptes_walk {
 	struct xe_pt_walk base;
 
 	/* Input parameters for the walk */
-	/** @gt: The gt we're building for */
-	struct xe_gt *gt;
+	/** @tile: The tile we're building for */
+	struct xe_tile *tile;
 
 	/* Output */
 	/** @needs_invalidate: Whether we need to invalidate TLB*/
@@ -880,7 +876,7 @@ static int xe_pt_zap_ptes_entry(struct xe_ptw *parent, pgoff_t offset,
 	 */
 	if (xe_pt_nonshared_offsets(addr, next, --level, walk, action, &offset,
 				    &end_offset)) {
-		xe_map_memset(gt_to_xe(xe_walk->gt), &xe_child->bo->vmap,
+		xe_map_memset(tile_to_xe(xe_walk->tile), &xe_child->bo->vmap,
 			      offset * sizeof(u64), 0,
 			      (end_offset - offset) * sizeof(u64));
 		xe_walk->needs_invalidate = true;
@@ -895,7 +891,7 @@ static const struct xe_pt_walk_ops xe_pt_zap_ptes_ops = {
 
 /**
  * xe_pt_zap_ptes() - Zap (zero) gpu ptes of an address range
- * @gt: The gt we're zapping for.
+ * @tile: The tile we're zapping for.
  * @vma: GPU VMA detailing address range.
  *
  * Eviction and Userptr invalidation needs to be able to zap the
@@ -909,7 +905,7 @@ static const struct xe_pt_walk_ops xe_pt_zap_ptes_ops = {
  * Return: Whether ptes were actually updated and a TLB invalidation is
  * required.
  */
-bool xe_pt_zap_ptes(struct xe_gt *gt, struct xe_vma *vma)
+bool xe_pt_zap_ptes(struct xe_tile *tile, struct xe_vma *vma)
 {
 	struct xe_pt_zap_ptes_walk xe_walk = {
 		.base = {
@@ -917,11 +913,11 @@ bool xe_pt_zap_ptes(struct xe_gt *gt, struct xe_vma *vma)
 			.shifts = xe_normal_pt_shifts,
 			.max_level = XE_PT_HIGHEST_LEVEL,
 		},
-		.gt = gt,
+		.tile = tile,
 	};
-	struct xe_pt *pt = vma->vm->pt_root[gt->info.id];
+	struct xe_pt *pt = vma->vm->pt_root[tile->id];
 
-	if (!(vma->gt_present & BIT(gt->info.id)))
+	if (!(vma->tile_present & BIT(tile->id)))
 		return false;
 
 	(void)xe_pt_walk_shared(&pt->base, pt->level, vma->start, vma->end + 1,
@@ -931,7 +927,7 @@ bool xe_pt_zap_ptes(struct xe_gt *gt, struct xe_vma *vma)
 }
 
 static void
-xe_vm_populate_pgtable(struct xe_migrate_pt_update *pt_update, struct xe_gt *gt,
+xe_vm_populate_pgtable(struct xe_migrate_pt_update *pt_update, struct xe_tile *tile,
 		       struct iosys_map *map, void *data,
 		       u32 qword_ofs, u32 num_qwords,
 		       const struct xe_vm_pgtable_update *update)
@@ -940,11 +936,9 @@ xe_vm_populate_pgtable(struct xe_migrate_pt_update *pt_update, struct xe_gt *gt,
 	u64 *ptr = data;
 	u32 i;
 
-	XE_BUG_ON(xe_gt_is_media_type(gt));
-
 	for (i = 0; i < num_qwords; i++) {
 		if (map)
-			xe_map_wr(gt_to_xe(gt), map, (qword_ofs + i) *
+			xe_map_wr(tile_to_xe(tile), map, (qword_ofs + i) *
 				  sizeof(u64), u64, ptes[i].pte);
 		else
 			ptr[i] = ptes[i].pte;
@@ -1018,14 +1012,14 @@ static void xe_pt_commit_bind(struct xe_vma *vma,
 }
 
 static int
-xe_pt_prepare_bind(struct xe_gt *gt, struct xe_vma *vma,
+xe_pt_prepare_bind(struct xe_tile *tile, struct xe_vma *vma,
 		   struct xe_vm_pgtable_update *entries, u32 *num_entries,
 		   bool rebind)
 {
 	int err;
 
 	*num_entries = 0;
-	err = xe_pt_stage_bind(gt, vma, entries, num_entries);
+	err = xe_pt_stage_bind(tile, vma, entries, num_entries);
 	if (!err)
 		BUG_ON(!*num_entries);
 	else /* abort! */
@@ -1252,7 +1246,7 @@ static int invalidation_fence_init(struct xe_gt *gt,
 /**
  * __xe_pt_bind_vma() - Build and connect a page-table tree for the vma
  * address range.
- * @gt: The gt to bind for.
+ * @tile: The tile to bind for.
  * @vma: The vma to bind.
  * @e: The engine with which to do pipelined page-table updates.
  * @syncs: Entries to sync on before binding the built tree to the live vm tree.
@@ -1272,7 +1266,7 @@ static int invalidation_fence_init(struct xe_gt *gt,
  * on success, an error pointer on error.
  */
 struct dma_fence *
-__xe_pt_bind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
+__xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_engine *e,
 		 struct xe_sync_entry *syncs, u32 num_syncs,
 		 bool rebind)
 {
@@ -1293,18 +1287,17 @@ __xe_pt_bind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 	bind_pt_update.locked = false;
 	xe_bo_assert_held(vma->bo);
 	xe_vm_assert_held(vm);
-	XE_BUG_ON(xe_gt_is_media_type(gt));
 
 	vm_dbg(&vma->vm->xe->drm,
 	       "Preparing bind, with range [%llx...%llx) engine %p.\n",
 	       vma->start, vma->end, e);
 
-	err = xe_pt_prepare_bind(gt, vma, entries, &num_entries, rebind);
+	err = xe_pt_prepare_bind(tile, vma, entries, &num_entries, rebind);
 	if (err)
 		goto err;
 	XE_BUG_ON(num_entries > ARRAY_SIZE(entries));
 
-	xe_vm_dbg_print_entries(gt_to_xe(gt), entries, num_entries);
+	xe_vm_dbg_print_entries(tile_to_xe(tile), entries, num_entries);
 
 	if (rebind && !xe_vm_no_dma_fences(vma->vm)) {
 		ifence = kzalloc(sizeof(*ifence), GFP_KERNEL);
@@ -1312,9 +1305,9 @@ __xe_pt_bind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 			return ERR_PTR(-ENOMEM);
 	}
 
-	fence = xe_migrate_update_pgtables(gt->migrate,
+	fence = xe_migrate_update_pgtables(tile->primary_gt.migrate,
 					   vm, vma->bo,
-					   e ? e : vm->eng[gt->info.id],
+					   e ? e : vm->eng[tile->id],
 					   entries, num_entries,
 					   syncs, num_syncs,
 					   &bind_pt_update.base);
@@ -1323,7 +1316,7 @@ __xe_pt_bind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 
 		/* TLB invalidation must be done before signaling rebind */
 		if (rebind && !xe_vm_no_dma_fences(vma->vm)) {
-			int err = invalidation_fence_init(gt, ifence, fence,
+			int err = invalidation_fence_init(&tile->primary_gt, ifence, fence,
 							  vma);
 			if (err) {
 				dma_fence_put(fence);
@@ -1346,7 +1339,7 @@ __xe_pt_bind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 				  bind_pt_update.locked ? &deferred : NULL);
 
 		/* This vma is live (again?) now */
-		vma->gt_present |= BIT(gt->info.id);
+		vma->tile_present |= BIT(tile->id);
 
 		if (bind_pt_update.locked) {
 			vma->userptr.initial_bind = true;
@@ -1375,8 +1368,8 @@ struct xe_pt_stage_unbind_walk {
 	struct xe_pt_walk base;
 
 	/* Input parameters for the walk */
-	/** @gt: The gt we're unbinding from. */
-	struct xe_gt *gt;
+	/** @tile: The tile we're unbinding from. */
+	struct xe_tile *tile;
 
 	/**
 	 * @modified_start: Walk range start, modified to include any
@@ -1481,7 +1474,7 @@ static const struct xe_pt_walk_ops xe_pt_stage_unbind_ops = {
 /**
  * xe_pt_stage_unbind() - Build page-table update structures for an unbind
  * operation
- * @gt: The gt we're unbinding for.
+ * @tile: The tile we're unbinding for.
  * @vma: The vma we're unbinding.
  * @entries: Caller-provided storage for the update structures.
  *
@@ -1492,7 +1485,7 @@ static const struct xe_pt_walk_ops xe_pt_stage_unbind_ops = {
  *
  * Return: The number of entries used.
  */
-static unsigned int xe_pt_stage_unbind(struct xe_gt *gt, struct xe_vma *vma,
+static unsigned int xe_pt_stage_unbind(struct xe_tile *tile, struct xe_vma *vma,
 				       struct xe_vm_pgtable_update *entries)
 {
 	struct xe_pt_stage_unbind_walk xe_walk = {
@@ -1501,12 +1494,12 @@ static unsigned int xe_pt_stage_unbind(struct xe_gt *gt, struct xe_vma *vma,
 			.shifts = xe_normal_pt_shifts,
 			.max_level = XE_PT_HIGHEST_LEVEL,
 		},
-		.gt = gt,
+		.tile = tile,
 		.modified_start = vma->start,
 		.modified_end = vma->end + 1,
 		.wupd.entries = entries,
 	};
-	struct xe_pt *pt = vma->vm->pt_root[gt->info.id];
+	struct xe_pt *pt = vma->vm->pt_root[tile->id];
 
 	(void)xe_pt_walk_shared(&pt->base, pt->level, vma->start, vma->end + 1,
 				 &xe_walk.base);
@@ -1516,19 +1509,17 @@ static unsigned int xe_pt_stage_unbind(struct xe_gt *gt, struct xe_vma *vma,
 
 static void
 xe_migrate_clear_pgtable_callback(struct xe_migrate_pt_update *pt_update,
-				  struct xe_gt *gt, struct iosys_map *map,
+				  struct xe_tile *tile, struct iosys_map *map,
 				  void *ptr, u32 qword_ofs, u32 num_qwords,
 				  const struct xe_vm_pgtable_update *update)
 {
 	struct xe_vma *vma = pt_update->vma;
-	u64 empty = __xe_pt_empty_pte(gt, vma->vm, update->pt->level);
+	u64 empty = __xe_pt_empty_pte(tile, vma->vm, update->pt->level);
 	int i;
 
-	XE_BUG_ON(xe_gt_is_media_type(gt));
-
 	if (map && map->is_iomem)
 		for (i = 0; i < num_qwords; ++i)
-			xe_map_wr(gt_to_xe(gt), map, (qword_ofs + i) *
+			xe_map_wr(tile_to_xe(tile), map, (qword_ofs + i) *
 				  sizeof(u64), u64, empty);
 	else if (map)
 		memset64(map->vaddr + qword_ofs * sizeof(u64), empty,
@@ -1579,7 +1570,7 @@ static const struct xe_migrate_pt_update_ops userptr_unbind_ops = {
 /**
  * __xe_pt_unbind_vma() - Disconnect and free a page-table tree for the vma
  * address range.
- * @gt: The gt to unbind for.
+ * @tile: The tile to unbind for.
  * @vma: The vma to unbind.
  * @e: The engine with which to do pipelined page-table updates.
  * @syncs: Entries to sync on before disconnecting the tree to be destroyed.
@@ -1597,7 +1588,7 @@ static const struct xe_migrate_pt_update_ops userptr_unbind_ops = {
  * on success, an error pointer on error.
  */
 struct dma_fence *
-__xe_pt_unbind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
+__xe_pt_unbind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_engine *e,
 		   struct xe_sync_entry *syncs, u32 num_syncs)
 {
 	struct xe_vm_pgtable_update entries[XE_VM_MAX_LEVEL * 2 + 1];
@@ -1616,16 +1607,15 @@ __xe_pt_unbind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 
 	xe_bo_assert_held(vma->bo);
 	xe_vm_assert_held(vm);
-	XE_BUG_ON(xe_gt_is_media_type(gt));
 
 	vm_dbg(&vma->vm->xe->drm,
 	       "Preparing unbind, with range [%llx...%llx) engine %p.\n",
 	       vma->start, vma->end, e);
 
-	num_entries = xe_pt_stage_unbind(gt, vma, entries);
+	num_entries = xe_pt_stage_unbind(tile, vma, entries);
 	XE_BUG_ON(num_entries > ARRAY_SIZE(entries));
 
-	xe_vm_dbg_print_entries(gt_to_xe(gt), entries, num_entries);
+	xe_vm_dbg_print_entries(tile_to_xe(tile), entries, num_entries);
 
 	ifence = kzalloc(sizeof(*ifence), GFP_KERNEL);
 	if (!ifence)
@@ -1636,9 +1626,9 @@ __xe_pt_unbind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 	 * clear again here. The eviction may have updated pagetables at a
 	 * lower level, because it needs to be more conservative.
 	 */
-	fence = xe_migrate_update_pgtables(gt->migrate,
+	fence = xe_migrate_update_pgtables(tile->primary_gt.migrate,
 					   vm, NULL, e ? e :
-					   vm->eng[gt->info.id],
+					   vm->eng[tile->id],
 					   entries, num_entries,
 					   syncs, num_syncs,
 					   &unbind_pt_update.base);
@@ -1646,7 +1636,7 @@ __xe_pt_unbind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 		int err;
 
 		/* TLB invalidation must be done before signaling unbind */
-		err = invalidation_fence_init(gt, ifence, fence, vma);
+		err = invalidation_fence_init(&tile->primary_gt, ifence, fence, vma);
 		if (err) {
 			dma_fence_put(fence);
 			kfree(ifence);
@@ -1664,18 +1654,18 @@ __xe_pt_unbind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
 					   DMA_RESV_USAGE_BOOKKEEP);
 		xe_pt_commit_unbind(vma, entries, num_entries,
 				    unbind_pt_update.locked ? &deferred : NULL);
-		vma->gt_present &= ~BIT(gt->info.id);
+		vma->tile_present &= ~BIT(tile->id);
 	} else {
 		kfree(ifence);
 	}
 
-	if (!vma->gt_present)
+	if (!vma->tile_present)
 		list_del_init(&vma->rebind_link);
 
 	if (unbind_pt_update.locked) {
 		XE_WARN_ON(!xe_vma_is_userptr(vma));
 
-		if (!vma->gt_present) {
+		if (!vma->tile_present) {
 			spin_lock(&vm->userptr.invalidated_lock);
 			list_del_init(&vma->userptr.invalidate_link);
 			spin_unlock(&vm->userptr.invalidated_lock);
diff --git a/drivers/gpu/drm/xe/xe_pt.h b/drivers/gpu/drm/xe/xe_pt.h
index 1152043e5c63..10f334b9c004 100644
--- a/drivers/gpu/drm/xe/xe_pt.h
+++ b/drivers/gpu/drm/xe/xe_pt.h
@@ -13,8 +13,8 @@ struct dma_fence;
 struct xe_bo;
 struct xe_device;
 struct xe_engine;
-struct xe_gt;
 struct xe_sync_entry;
+struct xe_tile;
 struct xe_vm;
 struct xe_vma;
 
@@ -23,27 +23,27 @@ struct xe_vma;
 
 unsigned int xe_pt_shift(unsigned int level);
 
-struct xe_pt *xe_pt_create(struct xe_vm *vm, struct xe_gt *gt,
+struct xe_pt *xe_pt_create(struct xe_vm *vm, struct xe_tile *tile,
 			   unsigned int level);
 
-int xe_pt_create_scratch(struct xe_device *xe, struct xe_gt *gt,
+int xe_pt_create_scratch(struct xe_device *xe, struct xe_tile *tile,
 			 struct xe_vm *vm);
 
-void xe_pt_populate_empty(struct xe_gt *gt, struct xe_vm *vm,
+void xe_pt_populate_empty(struct xe_tile *tile, struct xe_vm *vm,
 			  struct xe_pt *pt);
 
 void xe_pt_destroy(struct xe_pt *pt, u32 flags, struct llist_head *deferred);
 
 struct dma_fence *
-__xe_pt_bind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
+__xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_engine *e,
 		 struct xe_sync_entry *syncs, u32 num_syncs,
 		 bool rebind);
 
 struct dma_fence *
-__xe_pt_unbind_vma(struct xe_gt *gt, struct xe_vma *vma, struct xe_engine *e,
+__xe_pt_unbind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_engine *e,
 		   struct xe_sync_entry *syncs, u32 num_syncs);
 
-bool xe_pt_zap_ptes(struct xe_gt *gt, struct xe_vma *vma);
+bool xe_pt_zap_ptes(struct xe_tile *tile, struct xe_vma *vma);
 
 u64 gen8_pde_encode(struct xe_bo *bo, u64 bo_offset,
 		    const enum xe_cache_level level);
diff --git a/drivers/gpu/drm/xe/xe_sa.c b/drivers/gpu/drm/xe/xe_sa.c
index c16f7c14ff52..fee71080bd31 100644
--- a/drivers/gpu/drm/xe/xe_sa.c
+++ b/drivers/gpu/drm/xe/xe_sa.c
@@ -11,7 +11,6 @@
 
 #include "xe_bo.h"
 #include "xe_device.h"
-#include "xe_gt.h"
 #include "xe_map.h"
 
 static void xe_sa_bo_manager_fini(struct drm_device *drm, void *arg)
@@ -33,14 +32,14 @@ static void xe_sa_bo_manager_fini(struct drm_device *drm, void *arg)
 	sa_manager->bo = NULL;
 }
 
-struct xe_sa_manager *xe_sa_bo_manager_init(struct xe_gt *gt, u32 size, u32 align)
+struct xe_sa_manager *xe_sa_bo_manager_init(struct xe_tile *tile, u32 size, u32 align)
 {
-	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_device *xe = tile_to_xe(tile);
 	u32 managed_size = size - SZ_4K;
 	struct xe_bo *bo;
 	int ret;
 
-	struct xe_sa_manager *sa_manager = drmm_kzalloc(&gt_to_xe(gt)->drm,
+	struct xe_sa_manager *sa_manager = drmm_kzalloc(&tile_to_xe(tile)->drm,
 							sizeof(*sa_manager),
 							GFP_KERNEL);
 	if (!sa_manager)
@@ -48,8 +47,8 @@ struct xe_sa_manager *xe_sa_bo_manager_init(struct xe_gt *gt, u32 size, u32 alig
 
 	sa_manager->bo = NULL;
 
-	bo = xe_bo_create_pin_map(xe, gt, NULL, size, ttm_bo_type_kernel,
-				  XE_BO_CREATE_VRAM_IF_DGFX(gt) |
+	bo = xe_bo_create_pin_map(xe, tile, NULL, size, ttm_bo_type_kernel,
+				  XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 				  XE_BO_CREATE_GGTT_BIT);
 	if (IS_ERR(bo)) {
 		drm_err(&xe->drm, "failed to allocate bo for sa manager: %ld\n",
@@ -90,7 +89,7 @@ struct drm_suballoc *xe_sa_bo_new(struct xe_sa_manager *sa_manager,
 void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo)
 {
 	struct xe_sa_manager *sa_manager = to_xe_sa_manager(sa_bo->manager);
-	struct xe_device *xe = gt_to_xe(sa_manager->bo->gt);
+	struct xe_device *xe = tile_to_xe(sa_manager->bo->tile);
 
 	if (!sa_manager->bo->vmap.is_iomem)
 		return;
diff --git a/drivers/gpu/drm/xe/xe_sa.h b/drivers/gpu/drm/xe/xe_sa.h
index 3063fb34c720..4e96483057d7 100644
--- a/drivers/gpu/drm/xe/xe_sa.h
+++ b/drivers/gpu/drm/xe/xe_sa.h
@@ -9,9 +9,9 @@
 
 struct dma_fence;
 struct xe_bo;
-struct xe_gt;
+struct xe_tile;
 
-struct xe_sa_manager *xe_sa_bo_manager_init(struct xe_gt *gt, u32 size, u32 align);
+struct xe_sa_manager *xe_sa_bo_manager_init(struct xe_tile *tile, u32 size, u32 align);
 
 struct drm_suballoc *xe_sa_bo_new(struct xe_sa_manager *sa_manager,
 				  u32 size);
diff --git a/drivers/gpu/drm/xe/xe_tile.c b/drivers/gpu/drm/xe/xe_tile.c
index 9553d252b56c..c322e7a7b677 100644
--- a/drivers/gpu/drm/xe/xe_tile.c
+++ b/drivers/gpu/drm/xe/xe_tile.c
@@ -7,6 +7,7 @@
 
 #include "xe_device.h"
 #include "xe_ggtt.h"
+#include "xe_sa.h"
 #include "xe_tile.h"
 #include "xe_ttm_vram_mgr.h"
 
@@ -67,6 +68,12 @@ int xe_tile_init_noalloc(struct xe_tile *tile)
 		goto err_mem_access;
 
 	err = xe_ggtt_init_noalloc(tile->mem.ggtt);
+	if (err)
+		goto err_mem_access;
+
+	tile->mem.kernel_bb_pool = xe_sa_bo_manager_init(tile, SZ_1M, 16);
+	if (IS_ERR(tile->mem.kernel_bb_pool))
+		err = PTR_ERR(tile->mem.kernel_bb_pool);
 
 err_mem_access:
 	xe_device_mem_access_put(tile_to_xe(tile));
diff --git a/drivers/gpu/drm/xe/xe_uc_fw.c b/drivers/gpu/drm/xe/xe_uc_fw.c
index 5c3a571d2a29..e862e57e1e16 100644
--- a/drivers/gpu/drm/xe/xe_uc_fw.c
+++ b/drivers/gpu/drm/xe/xe_uc_fw.c
@@ -320,6 +320,7 @@ int xe_uc_fw_init(struct xe_uc_fw *uc_fw)
 {
 	struct xe_device *xe = uc_fw_to_xe(uc_fw);
 	struct xe_gt *gt = uc_fw_to_gt(uc_fw);
+	struct xe_tile *tile = gt_to_tile(gt);
 	struct device *dev = xe->drm.dev;
 	const struct firmware *fw = NULL;
 	struct uc_css_header *css;
@@ -409,9 +410,9 @@ int xe_uc_fw_init(struct xe_uc_fw *uc_fw)
 	if (uc_fw->type == XE_UC_FW_TYPE_GUC)
 		guc_read_css_info(uc_fw, css);
 
-	obj = xe_bo_create_from_data(xe, gt, fw->data, fw->size,
+	obj = xe_bo_create_from_data(xe, tile, fw->data, fw->size,
 				     ttm_bo_type_kernel,
-				     XE_BO_CREATE_VRAM_IF_DGFX(gt) |
+				     XE_BO_CREATE_VRAM_IF_DGFX(tile) |
 				     XE_BO_CREATE_GGTT_BIT);
 	if (IS_ERR(obj)) {
 		drm_notice(&xe->drm, "%s firmware %s: failed to create / populate bo",
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index fe6abb6ed6ca..632f7538a6d5 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -463,7 +463,7 @@ int xe_vm_lock_dma_resv(struct xe_vm *vm, struct ww_acquire_ctx *ww,
 		xe_bo_assert_held(vma->bo);
 
 		list_del_init(&vma->notifier.rebind_link);
-		if (vma->gt_present && !vma->destroyed)
+		if (vma->tile_present && !vma->destroyed)
 			list_move_tail(&vma->rebind_link, &vm->rebind_list);
 	}
 	spin_unlock(&vm->notifier.list_lock);
@@ -701,7 +701,7 @@ static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
 	 * Tell exec and rebind worker they need to repin and rebind this
 	 * userptr.
 	 */
-	if (!xe_vm_in_fault_mode(vm) && !vma->destroyed && vma->gt_present) {
+	if (!xe_vm_in_fault_mode(vm) && !vma->destroyed && vma->tile_present) {
 		spin_lock(&vm->userptr.invalidated_lock);
 		list_move_tail(&vma->userptr.invalidate_link,
 			       &vm->userptr.invalidated);
@@ -819,7 +819,7 @@ struct dma_fence *xe_vm_rebind(struct xe_vm *vm, bool rebind_worker)
 
 	xe_vm_assert_held(vm);
 	list_for_each_entry_safe(vma, next, &vm->rebind_list, rebind_link) {
-		XE_WARN_ON(!vma->gt_present);
+		XE_WARN_ON(!vma->tile_present);
 
 		list_del_init(&vma->rebind_link);
 		dma_fence_put(fence);
@@ -840,10 +840,10 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 				    u64 bo_offset_or_userptr,
 				    u64 start, u64 end,
 				    bool read_only,
-				    u64 gt_mask)
+				    u64 tile_mask)
 {
 	struct xe_vma *vma;
-	struct xe_gt *gt;
+	struct xe_tile *tile;
 	u8 id;
 
 	XE_BUG_ON(start >= end);
@@ -868,12 +868,11 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 	if (read_only)
 		vma->pte_flags = XE_PTE_READ_ONLY;
 
-	if (gt_mask) {
-		vma->gt_mask = gt_mask;
+	if (tile_mask) {
+		vma->tile_mask = tile_mask;
 	} else {
-		for_each_gt(gt, vm->xe, id)
-			if (!xe_gt_is_media_type(gt))
-				vma->gt_mask |= 0x1 << id;
+		for_each_tile(tile, vm->xe, id)
+			vma->tile_mask |= 0x1 << id;
 	}
 
 	if (vm->xe->info.platform == XE_PVC)
@@ -1102,8 +1101,8 @@ static void vm_destroy_work_func(struct work_struct *w);
 struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 {
 	struct xe_vm *vm;
-	int err, i = 0, number_gts = 0;
-	struct xe_gt *gt;
+	int err, i = 0, number_tiles = 0;
+	struct xe_tile *tile;
 	u8 id;
 
 	vm = kzalloc(sizeof(*vm), GFP_KERNEL);
@@ -1155,15 +1154,12 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 	if (IS_DGFX(xe) && xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
 		vm->flags |= XE_VM_FLAGS_64K;
 
-	for_each_gt(gt, xe, id) {
-		if (xe_gt_is_media_type(gt))
-			continue;
-
+	for_each_tile(tile, xe, id) {
 		if (flags & XE_VM_FLAG_MIGRATION &&
-		    gt->info.id != XE_VM_FLAG_GT_ID(flags))
+		    tile->id != XE_VM_FLAG_GT_ID(flags))
 			continue;
 
-		vm->pt_root[id] = xe_pt_create(vm, gt, xe->info.vm_max_level);
+		vm->pt_root[id] = xe_pt_create(vm, tile, xe->info.vm_max_level);
 		if (IS_ERR(vm->pt_root[id])) {
 			err = PTR_ERR(vm->pt_root[id]);
 			vm->pt_root[id] = NULL;
@@ -1172,11 +1168,11 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 	}
 
 	if (flags & XE_VM_FLAG_SCRATCH_PAGE) {
-		for_each_gt(gt, xe, id) {
+		for_each_tile(tile, xe, id) {
 			if (!vm->pt_root[id])
 				continue;
 
-			err = xe_pt_create_scratch(xe, gt, vm);
+			err = xe_pt_create_scratch(xe, tile, vm);
 			if (err)
 				goto err_scratch_pt;
 		}
@@ -1193,17 +1189,18 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 	}
 
 	/* Fill pt_root after allocating scratch tables */
-	for_each_gt(gt, xe, id) {
+	for_each_tile(tile, xe, id) {
 		if (!vm->pt_root[id])
 			continue;
 
-		xe_pt_populate_empty(gt, vm, vm->pt_root[id]);
+		xe_pt_populate_empty(tile, vm, vm->pt_root[id]);
 	}
 	dma_resv_unlock(&vm->resv);
 
 	/* Kernel migration VM shouldn't have a circular loop.. */
 	if (!(flags & XE_VM_FLAG_MIGRATION)) {
-		for_each_gt(gt, xe, id) {
+		for_each_tile(tile, xe, id) {
+			struct xe_gt *gt = &tile->primary_gt;
 			struct xe_vm *migrate_vm;
 			struct xe_engine *eng;
 
@@ -1220,11 +1217,11 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 				return ERR_CAST(eng);
 			}
 			vm->eng[id] = eng;
-			number_gts++;
+			number_tiles++;
 		}
 	}
 
-	if (number_gts > 1)
+	if (number_tiles > 1)
 		vm->composite_fence_ctx = dma_fence_context_alloc(1);
 
 	mutex_lock(&xe->usm.lock);
@@ -1239,7 +1236,7 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 	return vm;
 
 err_scratch_pt:
-	for_each_gt(gt, xe, id) {
+	for_each_tile(tile, xe, id) {
 		if (!vm->pt_root[id])
 			continue;
 
@@ -1252,7 +1249,7 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 		xe_bo_put(vm->scratch_bo[id]);
 	}
 err_destroy_root:
-	for_each_gt(gt, xe, id) {
+	for_each_tile(tile, xe, id) {
 		if (vm->pt_root[id])
 			xe_pt_destroy(vm->pt_root[id], vm->flags, NULL);
 	}
@@ -1309,7 +1306,7 @@ void xe_vm_close_and_put(struct xe_vm *vm)
 	struct rb_root contested = RB_ROOT;
 	struct ww_acquire_ctx ww;
 	struct xe_device *xe = vm->xe;
-	struct xe_gt *gt;
+	struct xe_tile *tile;
 	u8 id;
 
 	XE_BUG_ON(vm->preempt.num_engines);
@@ -1320,7 +1317,7 @@ void xe_vm_close_and_put(struct xe_vm *vm)
 	if (xe_vm_in_compute_mode(vm))
 		flush_work(&vm->preempt.rebind_work);
 
-	for_each_gt(gt, xe, id) {
+	for_each_tile(tile, xe, id) {
 		if (vm->eng[id]) {
 			xe_engine_kill(vm->eng[id]);
 			xe_engine_put(vm->eng[id]);
@@ -1357,7 +1354,7 @@ void xe_vm_close_and_put(struct xe_vm *vm)
 	 * install a fence to resv. Hence it's safe to
 	 * destroy the pagetables immediately.
 	 */
-	for_each_gt(gt, xe, id) {
+	for_each_tile(tile, xe, id) {
 		if (vm->scratch_bo[id]) {
 			u32 i;
 
@@ -1407,7 +1404,7 @@ static void vm_destroy_work_func(struct work_struct *w)
 		container_of(w, struct xe_vm, destroy_work);
 	struct ww_acquire_ctx ww;
 	struct xe_device *xe = vm->xe;
-	struct xe_gt *gt;
+	struct xe_tile *tile;
 	u8 id;
 	void *lookup;
 
@@ -1432,7 +1429,7 @@ static void vm_destroy_work_func(struct work_struct *w)
 	 * can be moved to xe_vm_close_and_put.
 	 */
 	xe_vm_lock(vm, &ww, 0, false);
-	for_each_gt(gt, xe, id) {
+	for_each_tile(tile, xe, id) {
 		if (vm->pt_root[id]) {
 			xe_pt_destroy(vm->pt_root[id], vm->flags, NULL);
 			vm->pt_root[id] = NULL;
@@ -1468,11 +1465,9 @@ struct xe_vm *xe_vm_lookup(struct xe_file *xef, u32 id)
 	return vm;
 }
 
-u64 xe_vm_pdp4_descriptor(struct xe_vm *vm, struct xe_gt *full_gt)
+u64 xe_vm_pdp4_descriptor(struct xe_vm *vm, struct xe_tile *tile)
 {
-	XE_BUG_ON(xe_gt_is_media_type(full_gt));
-
-	return gen8_pde_encode(vm->pt_root[full_gt->info.id]->bo, 0,
+	return gen8_pde_encode(vm->pt_root[tile->id]->bo, 0,
 			       XE_CACHE_WB);
 }
 
@@ -1480,32 +1475,30 @@ static struct dma_fence *
 xe_vm_unbind_vma(struct xe_vma *vma, struct xe_engine *e,
 		 struct xe_sync_entry *syncs, u32 num_syncs)
 {
-	struct xe_gt *gt;
+	struct xe_tile *tile;
 	struct dma_fence *fence = NULL;
 	struct dma_fence **fences = NULL;
 	struct dma_fence_array *cf = NULL;
 	struct xe_vm *vm = vma->vm;
 	int cur_fence = 0, i;
-	int number_gts = hweight_long(vma->gt_present);
+	int number_tiles = hweight_long(vma->tile_present);
 	int err;
 	u8 id;
 
 	trace_xe_vma_unbind(vma);
 
-	if (number_gts > 1) {
-		fences = kmalloc_array(number_gts, sizeof(*fences),
+	if (number_tiles > 1) {
+		fences = kmalloc_array(number_tiles, sizeof(*fences),
 				       GFP_KERNEL);
 		if (!fences)
 			return ERR_PTR(-ENOMEM);
 	}
 
-	for_each_gt(gt, vm->xe, id) {
-		if (!(vma->gt_present & BIT(id)))
+	for_each_tile(tile, vm->xe, id) {
+		if (!(vma->tile_present & BIT(id)))
 			goto next;
 
-		XE_BUG_ON(xe_gt_is_media_type(gt));
-
-		fence = __xe_pt_unbind_vma(gt, vma, e, syncs, num_syncs);
+		fence = __xe_pt_unbind_vma(tile, vma, e, syncs, num_syncs);
 		if (IS_ERR(fence)) {
 			err = PTR_ERR(fence);
 			goto err_fences;
@@ -1520,7 +1513,7 @@ xe_vm_unbind_vma(struct xe_vma *vma, struct xe_engine *e,
 	}
 
 	if (fences) {
-		cf = dma_fence_array_create(number_gts, fences,
+		cf = dma_fence_array_create(number_tiles, fences,
 					    vm->composite_fence_ctx,
 					    vm->composite_fence_seqno++,
 					    false);
@@ -1552,32 +1545,31 @@ static struct dma_fence *
 xe_vm_bind_vma(struct xe_vma *vma, struct xe_engine *e,
 	       struct xe_sync_entry *syncs, u32 num_syncs)
 {
-	struct xe_gt *gt;
+	struct xe_tile *tile;
 	struct dma_fence *fence;
 	struct dma_fence **fences = NULL;
 	struct dma_fence_array *cf = NULL;
 	struct xe_vm *vm = vma->vm;
 	int cur_fence = 0, i;
-	int number_gts = hweight_long(vma->gt_mask);
+	int number_tiles = hweight_long(vma->tile_mask);
 	int err;
 	u8 id;
 
 	trace_xe_vma_bind(vma);
 
-	if (number_gts > 1) {
-		fences = kmalloc_array(number_gts, sizeof(*fences),
+	if (number_tiles > 1) {
+		fences = kmalloc_array(number_tiles, sizeof(*fences),
 				       GFP_KERNEL);
 		if (!fences)
 			return ERR_PTR(-ENOMEM);
 	}
 
-	for_each_gt(gt, vm->xe, id) {
-		if (!(vma->gt_mask & BIT(id)))
+	for_each_tile(tile, vm->xe, id) {
+		if (!(vma->tile_mask & BIT(id)))
 			goto next;
 
-		XE_BUG_ON(xe_gt_is_media_type(gt));
-		fence = __xe_pt_bind_vma(gt, vma, e, syncs, num_syncs,
-					 vma->gt_present & BIT(id));
+		fence = __xe_pt_bind_vma(tile, vma, e, syncs, num_syncs,
+					 vma->tile_present & BIT(id));
 		if (IS_ERR(fence)) {
 			err = PTR_ERR(fence);
 			goto err_fences;
@@ -1592,7 +1584,7 @@ xe_vm_bind_vma(struct xe_vma *vma, struct xe_engine *e,
 	}
 
 	if (fences) {
-		cf = dma_fence_array_create(number_gts, fences,
+		cf = dma_fence_array_create(number_tiles, fences,
 					    vm->composite_fence_ctx,
 					    vm->composite_fence_seqno++,
 					    false);
@@ -1980,7 +1972,7 @@ static int xe_vm_prefetch(struct xe_vm *vm, struct xe_vma *vma,
 			return err;
 	}
 
-	if (vma->gt_mask != (vma->gt_present & ~vma->usm.gt_invalidated)) {
+	if (vma->tile_mask != (vma->tile_present & ~vma->usm.tile_invalidated)) {
 		return xe_vm_bind(vm, vma, e, vma->bo, syncs, num_syncs,
 				  afence);
 	} else {
@@ -2616,7 +2608,7 @@ static struct xe_vma *vm_unbind_lookup_vmas(struct xe_vm *vm,
 					  first->start,
 					  lookup->start - 1,
 					  (first->pte_flags & XE_PTE_READ_ONLY),
-					  first->gt_mask);
+					  first->tile_mask);
 		if (first->bo)
 			xe_bo_unlock(first->bo, &ww);
 		if (!new_first) {
@@ -2647,7 +2639,7 @@ static struct xe_vma *vm_unbind_lookup_vmas(struct xe_vm *vm,
 					 last->start + chunk,
 					 last->end,
 					 (last->pte_flags & XE_PTE_READ_ONLY),
-					 last->gt_mask);
+					 last->tile_mask);
 		if (last->bo)
 			xe_bo_unlock(last->bo, &ww);
 		if (!new_last) {
@@ -2783,7 +2775,7 @@ static struct xe_vma *vm_bind_ioctl_lookup_vma(struct xe_vm *vm,
 					       struct xe_bo *bo,
 					       u64 bo_offset_or_userptr,
 					       u64 addr, u64 range, u32 op,
-					       u64 gt_mask, u32 region)
+					       u64 tile_mask, u32 region)
 {
 	struct ww_acquire_ctx ww;
 	struct xe_vma *vma, lookup;
@@ -2804,7 +2796,7 @@ static struct xe_vma *vm_bind_ioctl_lookup_vma(struct xe_vm *vm,
 		vma = xe_vma_create(vm, bo, bo_offset_or_userptr, addr,
 				    addr + range - 1,
 				    op & XE_VM_BIND_FLAG_READONLY,
-				    gt_mask);
+				    tile_mask);
 		xe_bo_unlock(bo, &ww);
 		if (!vma)
 			return ERR_PTR(-ENOMEM);
@@ -2844,7 +2836,7 @@ static struct xe_vma *vm_bind_ioctl_lookup_vma(struct xe_vm *vm,
 		vma = xe_vma_create(vm, NULL, bo_offset_or_userptr, addr,
 				    addr + range - 1,
 				    op & XE_VM_BIND_FLAG_READONLY,
-				    gt_mask);
+				    tile_mask);
 		if (!vma)
 			return ERR_PTR(-ENOMEM);
 
@@ -3072,11 +3064,11 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 			goto put_engine;
 		}
 
-		if (bind_ops[i].gt_mask) {
-			u64 valid_gts = BIT(xe->info.tile_count) - 1;
+		if (bind_ops[i].tile_mask) {
+			u64 valid_tiles = BIT(xe->info.tile_count) - 1;
 
-			if (XE_IOCTL_ERR(xe, bind_ops[i].gt_mask &
-					 ~valid_gts)) {
+			if (XE_IOCTL_ERR(xe, bind_ops[i].tile_mask &
+					 ~valid_tiles)) {
 				err = -EINVAL;
 				goto put_engine;
 			}
@@ -3167,11 +3159,11 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		u64 addr = bind_ops[i].addr;
 		u32 op = bind_ops[i].op;
 		u64 obj_offset = bind_ops[i].obj_offset;
-		u64 gt_mask = bind_ops[i].gt_mask;
+		u64 tile_mask = bind_ops[i].tile_mask;
 		u32 region = bind_ops[i].region;
 
 		vmas[i] = vm_bind_ioctl_lookup_vma(vm, bos[i], obj_offset,
-						   addr, range, op, gt_mask,
+						   addr, range, op, tile_mask,
 						   region);
 		if (IS_ERR(vmas[i])) {
 			err = PTR_ERR(vmas[i]);
@@ -3345,8 +3337,8 @@ void xe_vm_unlock(struct xe_vm *vm, struct ww_acquire_ctx *ww)
 int xe_vm_invalidate_vma(struct xe_vma *vma)
 {
 	struct xe_device *xe = vma->vm->xe;
-	struct xe_gt *gt;
-	u32 gt_needs_invalidate = 0;
+	struct xe_tile *tile;
+	u32 tile_needs_invalidate = 0;
 	int seqno[XE_MAX_TILES_PER_DEVICE];
 	u8 id;
 	int ret;
@@ -3368,25 +3360,29 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
 		}
 	}
 
-	for_each_gt(gt, xe, id) {
-		if (xe_pt_zap_ptes(gt, vma)) {
-			gt_needs_invalidate |= BIT(id);
+	for_each_tile(tile, xe, id) {
+		if (xe_pt_zap_ptes(tile, vma)) {
+			tile_needs_invalidate |= BIT(id);
 			xe_device_wmb(xe);
-			seqno[id] = xe_gt_tlb_invalidation_vma(gt, NULL, vma);
+			/*
+			 * FIXME: We potentially need to invalidate multiple
+			 * GTs within the tile
+			 */
+			seqno[id] = xe_gt_tlb_invalidation_vma(&tile->primary_gt, NULL, vma);
 			if (seqno[id] < 0)
 				return seqno[id];
 		}
 	}
 
-	for_each_gt(gt, xe, id) {
-		if (gt_needs_invalidate & BIT(id)) {
-			ret = xe_gt_tlb_invalidation_wait(gt, seqno[id]);
+	for_each_tile(tile, xe, id) {
+		if (tile_needs_invalidate & BIT(id)) {
+			ret = xe_gt_tlb_invalidation_wait(&tile->primary_gt, seqno[id]);
 			if (ret < 0)
 				return ret;
 		}
 	}
 
-	vma->usm.gt_invalidated = vma->gt_mask;
+	vma->usm.tile_invalidated = vma->tile_mask;
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index 748dc16ebed9..372f26153209 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -54,7 +54,7 @@ xe_vm_find_overlapping_vma(struct xe_vm *vm, const struct xe_vma *vma);
 
 #define xe_vm_assert_held(vm) dma_resv_assert_held(&(vm)->resv)
 
-u64 xe_vm_pdp4_descriptor(struct xe_vm *vm, struct xe_gt *full_gt);
+u64 xe_vm_pdp4_descriptor(struct xe_vm *vm, struct xe_tile *tile);
 
 int xe_vm_create_ioctl(struct drm_device *dev, void *data,
 		       struct drm_file *file);
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 203ba9d946b8..c45c5daeeaa7 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -37,17 +37,17 @@ struct xe_vma {
 	/** @bo_offset: offset into BO if not a userptr, unused for userptr */
 	u64 bo_offset;
 
-	/** @gt_mask: GT mask of where to create binding for this VMA */
-	u64 gt_mask;
+	/** @tile_mask: Tile mask of where to create binding for this VMA */
+	u64 tile_mask;
 
 	/**
-	 * @gt_present: GT mask of binding are present for this VMA.
+	 * @tile_present: Tile mask of bindings that are present for this VMA.
 	 * protected by vm->lock, vm->resv and for userptrs,
 	 * vm->userptr.notifier_lock for writing. Needs either for reading,
 	 * but if reading is done under the vm->lock only, it needs to be held
 	 * in write mode.
 	 */
-	u64 gt_present;
+	u64 tile_present;
 
 	/**
 	 * @destroyed: VMA is destroyed, in the sense that it shouldn't be
@@ -132,8 +132,8 @@ struct xe_vma {
 
 	/** @usm: unified shared memory state */
 	struct {
-		/** @gt_invalidated: VMA has been invalidated */
-		u64 gt_invalidated;
+		/** @tile_invalidated: VMA has been invalidated */
+		u64 tile_invalidated;
 	} usm;
 
 	struct {
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index b0b80aae3ee8..11a0ede7155e 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -398,10 +398,10 @@ struct drm_xe_vm_bind_op {
 	__u64 addr;
 
 	/**
-	 * @gt_mask: Mask for which GTs to create binds for, 0 == All GTs,
+	 * @tile_mask: Mask for which tiles to create binds for, 0 == All tiles,
 	 * only applies to creating new VMAs
 	 */
-	__u64 gt_mask;
+	__u64 tile_mask;
 
 	/** @op: Operation to perform (lower 16 bits) and flags (upper 16 bits) */
 	__u32 op;
-- 
2.40.0
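
For readers following the tile_mask handling in the xe_vm_bind_ioctl() and
xe_vm_bind_vma() hunks above, here is a minimal user-space sketch of the same
bit arithmetic. The helper names (valid_tile_mask, tile_mask_is_valid,
tile_mask_weight) are hypothetical and do not exist in the driver; they only
mirror the `BIT(xe->info.tile_count) - 1` check and the hweight_long() count
from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: one bit set for each tile present on the device. */
static uint64_t valid_tile_mask(unsigned int tile_count)
{
	/* Guard the shift so a count of 64 or more cannot be undefined. */
	return tile_count >= 64 ? UINT64_MAX
				: (UINT64_C(1) << tile_count) - 1;
}

/* A requested mask is acceptable when it names no tile beyond tile_count. */
static bool tile_mask_is_valid(uint64_t tile_mask, unsigned int tile_count)
{
	return (tile_mask & ~valid_tile_mask(tile_count)) == 0;
}

/* Stand-in for hweight_long(): how many tiles a binding targets. */
static unsigned int tile_mask_weight(uint64_t tile_mask)
{
	unsigned int n = 0;

	/* Clear the lowest set bit on each iteration. */
	for (; tile_mask; tile_mask &= tile_mask - 1)
		n++;
	return n;
}
```

A mask of 0 passes this check trivially; per the uapi comment in the patch it
means "all tiles", and the ioctl only validates non-zero masks.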


* [Intel-xe] [PATCH 08/26] drm/xe: Move migration from GT to tile
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (6 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 07/26] drm/xe: Memory allocations are tile-based, not GT-based Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-17  5:00   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 09/26] drm/xe: Clarify 'gt' retrieval for primary tile Matt Roper
                   ` (26 subsequent siblings)
  34 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

Migration primarily focuses on the memory associated with a tile, so it
makes more sense to track this at the tile level (especially since the
driver was already skipping migration operations on media GTs).

Note that the blitter engine used to perform the migration always lives
in the tile's primary GT today.  In theory that could change if media
GTs ever start including blitter engines, but we can extend the design
if/when that happens.
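
As a rough illustration of the ownership change described above, the sketch
below uses simplified stand-in structures, not the real driver definitions.
The `tile` back-pointer in struct xe_gt, the `is_media` field, and the
gt_migrate() helper are assumptions made for this example only; the point is
that every GT on a tile resolves to the single per-tile migrate context.

```c
#include <stddef.h>

/* Stand-in for the driver's migration context. */
struct xe_migrate {
	int dummy;
};

struct xe_tile;

/* Simplified GT: only what the example needs. */
struct xe_gt {
	struct xe_tile *tile;	/* assumed back-pointer to the owning tile */
	int is_media;		/* assumed flag, for illustration only */
};

/* Simplified tile: the migrate context now lives here, not in the GT. */
struct xe_tile {
	unsigned char id;
	struct xe_gt primary_gt;
	struct xe_migrate *migrate;
};

/* Modelled on gt_to_tile() as used in the patch; the body is an assumption. */
static struct xe_tile *gt_to_tile(struct xe_gt *gt)
{
	return gt->tile;
}

/* Hypothetical helper: any GT reaches migration through its tile. */
static struct xe_migrate *gt_migrate(struct xe_gt *gt)
{
	return gt_to_tile(gt)->migrate;
}
```

With this layout a media GT no longer needs its own copy of the pointer, which
is why the `else` branch in all_fw_domain_init() can go away.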

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c           |  6 +--
 drivers/gpu/drm/xe/xe_bo_evict.c     | 14 +++----
 drivers/gpu/drm/xe/xe_device_types.h |  3 ++
 drivers/gpu/drm/xe/xe_engine.c       |  2 +-
 drivers/gpu/drm/xe/xe_gt.c           | 28 ++++---------
 drivers/gpu/drm/xe/xe_gt.h           |  1 -
 drivers/gpu/drm/xe/xe_gt_pagefault.c |  2 +-
 drivers/gpu/drm/xe/xe_gt_types.h     |  3 --
 drivers/gpu/drm/xe/xe_migrate.c      | 61 +++++++++++++---------------
 drivers/gpu/drm/xe/xe_migrate.h      |  4 +-
 drivers/gpu/drm/xe/xe_pt.c           |  4 +-
 drivers/gpu/drm/xe/xe_tile.c         | 10 ++++-
 drivers/gpu/drm/xe/xe_tile.h         |  2 +
 drivers/gpu/drm/xe/xe_vm.c           |  2 +-
 drivers/gpu/drm/xe/xe_vm_types.h     |  2 +-
 15 files changed, 68 insertions(+), 76 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 9d613fc5d309..a596f2619e0f 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -637,7 +637,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 		tile = mem_type_to_tile(xe, old_mem->mem_type);
 
 	XE_BUG_ON(!tile);
-	XE_BUG_ON(!tile->primary_gt.migrate);
+	XE_BUG_ON(!tile->migrate);
 
 	trace_xe_bo_move(bo);
 	xe_device_mem_access_get(xe);
@@ -675,9 +675,9 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 		}
 	} else {
 		if (move_lacks_source)
-			fence = xe_migrate_clear(tile->primary_gt.migrate, bo, new_mem);
+			fence = xe_migrate_clear(tile->migrate, bo, new_mem);
 		else
-			fence = xe_migrate_copy(tile->primary_gt.migrate, bo, old_mem, new_mem);
+			fence = xe_migrate_copy(tile->migrate, bo, old_mem, new_mem);
 		if (IS_ERR(fence)) {
 			ret = PTR_ERR(fence);
 			xe_device_mem_access_put(xe);
diff --git a/drivers/gpu/drm/xe/xe_bo_evict.c b/drivers/gpu/drm/xe/xe_bo_evict.c
index 9226195bd560..f559a7f3eb3e 100644
--- a/drivers/gpu/drm/xe/xe_bo_evict.c
+++ b/drivers/gpu/drm/xe/xe_bo_evict.c
@@ -8,7 +8,7 @@
 #include "xe_bo.h"
 #include "xe_device.h"
 #include "xe_ggtt.h"
-#include "xe_gt.h"
+#include "xe_tile.h"
 
 /**
  * xe_bo_evict_all - evict all BOs from VRAM
@@ -29,7 +29,7 @@ int xe_bo_evict_all(struct xe_device *xe)
 	struct ttm_device *bdev = &xe->ttm;
 	struct ww_acquire_ctx ww;
 	struct xe_bo *bo;
-	struct xe_gt *gt;
+	struct xe_tile *tile;
 	struct list_head still_in_list;
 	u32 mem_type;
 	u8 id;
@@ -83,8 +83,8 @@ int xe_bo_evict_all(struct xe_device *xe)
 	 * Wait for all user BO to be evicted as those evictions depend on the
 	 * memory moved below.
 	 */
-	for_each_gt(gt, xe, id)
-		xe_gt_migrate_wait(gt);
+	for_each_tile(tile, xe, id)
+		xe_tile_migrate_wait(tile);
 
 	spin_lock(&xe->pinned.lock);
 	for (;;) {
@@ -186,7 +186,7 @@ int xe_bo_restore_user(struct xe_device *xe)
 {
 	struct ww_acquire_ctx ww;
 	struct xe_bo *bo;
-	struct xe_gt *gt;
+	struct xe_tile *tile;
 	struct list_head still_in_list;
 	u8 id;
 	int ret;
@@ -224,8 +224,8 @@ int xe_bo_restore_user(struct xe_device *xe)
 	spin_unlock(&xe->pinned.lock);
 
 	/* Wait for validate to complete */
-	for_each_gt(gt, xe, id)
-		xe_gt_migrate_wait(gt);
+	for_each_tile(tile, xe, id)
+		xe_tile_migrate_wait(tile);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index c6365b6f14ba..fa76750a9a5f 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -139,6 +139,9 @@ struct xe_tile {
 		 */
 		struct xe_sa_manager *kernel_bb_pool;
 	} mem;
+
+	/** @migrate: Migration helper for vram blits and clearing */
+	struct xe_migrate *migrate;
 };
 
 /**
diff --git a/drivers/gpu/drm/xe/xe_engine.c b/drivers/gpu/drm/xe/xe_engine.c
index 094ec17d3004..2caa368fedda 100644
--- a/drivers/gpu/drm/xe/xe_engine.c
+++ b/drivers/gpu/drm/xe/xe_engine.c
@@ -557,7 +557,7 @@ int xe_engine_create_ioctl(struct drm_device *dev, void *data,
 			if (XE_IOCTL_ERR(xe, !hwe))
 				return -EINVAL;
 
-			migrate_vm = xe_migrate_get_vm(gt->migrate);
+			migrate_vm = xe_migrate_get_vm(gt_to_tile(gt)->migrate);
 			new = xe_engine_create(xe, migrate_vm, logical_mask,
 					       args->width, hwe,
 					       ENGINE_FLAG_PERSISTENT |
diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index d769bc93d15c..297ee32ad928 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -42,16 +42,6 @@
 #include "xe_wa.h"
 #include "xe_wopcm.h"
 
-struct xe_gt *xe_find_full_gt(struct xe_gt *gt)
-{
-	/*
-	 * FIXME: Media GTs are disabled at the moment.  Once re-enabled,
-	 * the proper handling here is to return the primary GT from the
-	 * parameter GT's tile.
-	 */
-	return gt;
-}
-
 int xe_gt_alloc(struct xe_device *xe, struct xe_gt *gt)
 {
 	XE_BUG_ON(gt->info.type == XE_GT_TYPE_UNINITIALIZED);
@@ -169,6 +159,7 @@ static int emit_wa_job(struct xe_gt *gt, struct xe_engine *e)
 int xe_gt_record_default_lrcs(struct xe_gt *gt)
 {
 	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_tile *tile = gt_to_tile(gt);
 	struct xe_hw_engine *hwe;
 	enum xe_hw_engine_id id;
 	int err = 0;
@@ -192,7 +183,7 @@ int xe_gt_record_default_lrcs(struct xe_gt *gt)
 		if (!default_lrc)
 			return -ENOMEM;
 
-		vm = xe_migrate_get_vm(gt->migrate);
+		vm = xe_migrate_get_vm(tile->migrate);
 		e = xe_engine_create(xe, vm, BIT(hwe->logical_instance), 1,
 				     hwe, ENGINE_FLAG_WA);
 		if (IS_ERR(e)) {
@@ -377,13 +368,13 @@ static int all_fw_domain_init(struct xe_gt *gt)
 	}
 
 	if (!xe_gt_is_media_type(gt)) {
-		gt->migrate = xe_migrate_init(gt);
-		if (IS_ERR(gt->migrate)) {
-			err = PTR_ERR(gt->migrate);
+		struct xe_tile *tile = gt_to_tile(gt);
+
+		tile->migrate = xe_migrate_init(tile);
+		if (IS_ERR(tile->migrate)) {
+			err = PTR_ERR(tile->migrate);
 			goto err_force_wake;
 		}
-	} else {
-		gt->migrate = xe_find_full_gt(gt)->migrate;
 	}
 
 	err = xe_uc_init_hw(&gt->uc);
@@ -644,11 +635,6 @@ int xe_gt_resume(struct xe_gt *gt)
 	return err;
 }
 
-void xe_gt_migrate_wait(struct xe_gt *gt)
-{
-	xe_migrate_wait(gt->migrate);
-}
-
 struct xe_hw_engine *xe_gt_hw_engine(struct xe_gt *gt,
 				     enum xe_engine_class class,
 				     u16 instance, bool logical)
diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
index f4e98f499b36..c8abbeb0fb96 100644
--- a/drivers/gpu/drm/xe/xe_gt.h
+++ b/drivers/gpu/drm/xe/xe_gt.h
@@ -25,7 +25,6 @@ void xe_gt_suspend_prepare(struct xe_gt *gt);
 int xe_gt_suspend(struct xe_gt *gt);
 int xe_gt_resume(struct xe_gt *gt);
 void xe_gt_reset_async(struct xe_gt *gt);
-void xe_gt_migrate_wait(struct xe_gt *gt);
 void xe_gt_sanitize(struct xe_gt *gt);
 
 struct xe_gt *xe_find_full_gt(struct xe_gt *gt);
diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
index 1c2b23ae89cf..73db7f7c0381 100644
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
@@ -208,7 +208,7 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 
 	/* Bind VMA only to the GT that has faulted */
 	trace_xe_vma_pf_bind(vma);
-	fence = __xe_pt_bind_vma(tile, vma, xe_gt_migrate_engine(gt), NULL, 0,
+	fence = __xe_pt_bind_vma(tile, vma, xe_tile_migrate_engine(tile), NULL, 0,
 				 vma->tile_present & BIT(tile->id));
 	if (IS_ERR(fence)) {
 		ret = PTR_ERR(fence);
diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
index 6e239ce738c1..8a5f9122ba80 100644
--- a/drivers/gpu/drm/xe/xe_gt_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_types.h
@@ -278,9 +278,6 @@ struct xe_gt {
 	/** @hw_engines: hardware engines on the GT */
 	struct xe_hw_engine hw_engines[XE_NUM_HW_ENGINES];
 
-	/** @migrate: Migration helper for vram blits and clearing */
-	struct xe_migrate *migrate;
-
 	/** @pcode: GT's PCODE */
 	struct {
 		/** @lock: protecting GT's PCODE mailbox data */
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index 031a0bde5585..c3a37109bc64 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -36,8 +36,8 @@
 struct xe_migrate {
 	/** @eng: Default engine used for migration */
 	struct xe_engine *eng;
-	/** @gt: Backpointer to the gt this struct xe_migrate belongs to. */
-	struct xe_gt *gt;
+	/** @tile: Backpointer to the tile this struct xe_migrate belongs to. */
+	struct xe_tile *tile;
 	/** @job_mutex: Timeline mutex for @eng. */
 	struct mutex job_mutex;
 	/** @pt_bo: Page-table buffer object. */
@@ -70,17 +70,17 @@ struct xe_migrate {
 #define NUM_PT_PER_BLIT (MAX_PREEMPTDISABLE_TRANSFER / SZ_2M)
 
 /**
- * xe_gt_migrate_engine() - Get this gt's migrate engine.
- * @gt: The gt.
+ * xe_tile_migrate_engine() - Get this tile's migrate engine.
+ * @tile: The tile.
  *
- * Returns the default migrate engine of this gt.
+ * Returns the default migrate engine of this tile.
  * TODO: Perhaps this function is slightly misplaced, and even unneeded?
  *
  * Return: The default migrate engine
  */
-struct xe_engine *xe_gt_migrate_engine(struct xe_gt *gt)
+struct xe_engine *xe_tile_migrate_engine(struct xe_tile *tile)
 {
-	return gt->migrate->eng;
+	return tile->migrate->eng;
 }
 
 static void xe_migrate_fini(struct drm_device *dev, void *arg)
@@ -128,8 +128,7 @@ static u64 xe_migrate_vram_ofs(u64 addr)
  */
 static int xe_migrate_create_cleared_bo(struct xe_migrate *m, struct xe_vm *vm)
 {
-	struct xe_gt *gt = m->gt;
-	struct xe_tile *tile = gt_to_tile(gt);
+	struct xe_tile *tile = m->tile;
 	struct xe_device *xe = vm->xe;
 	size_t cleared_size;
 	u64 vram_addr;
@@ -155,14 +154,13 @@ static int xe_migrate_create_cleared_bo(struct xe_migrate *m, struct xe_vm *vm)
 	return 0;
 }
 
-static int xe_migrate_prepare_vm(struct xe_gt *gt, struct xe_migrate *m,
+static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
 				 struct xe_vm *vm)
 {
-	u8 id = gt->info.id;
+	struct xe_device *xe = tile_to_xe(tile);
+	u8 id = tile->id;
 	u32 num_entries = NUM_PT_SLOTS, num_level = vm->pt_root[id]->level;
 	u32 map_ofs, level, i;
-	struct xe_device *xe = gt_to_xe(m->gt);
-	struct xe_tile *tile = gt_to_tile(m->gt);
 	struct xe_bo *bo, *batch = tile->mem.kernel_bb_pool->bo;
 	u64 entry;
 	int ret;
@@ -231,7 +229,7 @@ static int xe_migrate_prepare_vm(struct xe_gt *gt, struct xe_migrate *m,
 		m->batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
 
 		if (xe->info.supports_usm) {
-			batch = gt->usm.bb_pool->bo;
+			batch = tile->primary_gt.usm.bb_pool->bo;
 			batch_addr = xe_bo_addr(batch, 0, XE_PAGE_SIZE,
 						&is_vram);
 			m->usm_batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
@@ -308,34 +306,33 @@ static int xe_migrate_prepare_vm(struct xe_gt *gt, struct xe_migrate *m,
 
 /**
  * xe_migrate_init() - Initialize a migrate context
- * @gt: Back-pointer to the gt we're initializing for.
+ * @tile: Back-pointer to the tile we're initializing for.
  *
  * Return: Pointer to a migrate context on success. Error pointer on error.
  */
-struct xe_migrate *xe_migrate_init(struct xe_gt *gt)
+struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
 {
-	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_device *xe = tile_to_xe(tile);
+	struct xe_gt *primary_gt = &tile->primary_gt;
 	struct xe_migrate *m;
 	struct xe_vm *vm;
 	struct ww_acquire_ctx ww;
 	int err;
 
-	XE_BUG_ON(xe_gt_is_media_type(gt));
-
 	m = drmm_kzalloc(&xe->drm, sizeof(*m), GFP_KERNEL);
 	if (!m)
 		return ERR_PTR(-ENOMEM);
 
-	m->gt = gt;
+	m->tile = tile;
 
 	/* Special layout, prepared below.. */
 	vm = xe_vm_create(xe, XE_VM_FLAG_MIGRATION |
-			  XE_VM_FLAG_SET_GT_ID(gt));
+			  XE_VM_FLAG_SET_TILE_ID(tile));
 	if (IS_ERR(vm))
 		return ERR_CAST(vm);
 
 	xe_vm_lock(vm, &ww, 0, false);
-	err = xe_migrate_prepare_vm(gt, m, vm);
+	err = xe_migrate_prepare_vm(tile, m, vm);
 	xe_vm_unlock(vm, &ww);
 	if (err) {
 		xe_vm_close_and_put(vm);
@@ -343,9 +340,9 @@ struct xe_migrate *xe_migrate_init(struct xe_gt *gt)
 	}
 
 	if (xe->info.supports_usm) {
-		struct xe_hw_engine *hwe = xe_gt_hw_engine(gt,
+		struct xe_hw_engine *hwe = xe_gt_hw_engine(primary_gt,
 							   XE_ENGINE_CLASS_COPY,
-							   gt->usm.reserved_bcs_instance,
+							   primary_gt->usm.reserved_bcs_instance,
 							   false);
 		if (!hwe)
 			return ERR_PTR(-EINVAL);
@@ -354,7 +351,7 @@ struct xe_migrate *xe_migrate_init(struct xe_gt *gt)
 					  BIT(hwe->logical_instance), 1,
 					  hwe, ENGINE_FLAG_KERNEL);
 	} else {
-		m->eng = xe_engine_create_class(xe, gt, vm,
+		m->eng = xe_engine_create_class(xe, primary_gt, vm,
 						XE_ENGINE_CLASS_COPY,
 						ENGINE_FLAG_KERNEL);
 	}
@@ -549,7 +546,7 @@ static u32 xe_migrate_ccs_copy(struct xe_migrate *m,
 			       u64 dst_ofs, bool dst_is_vram, u32 dst_size,
 			       u64 ccs_ofs, bool copy_ccs)
 {
-	struct xe_gt *gt = m->gt;
+	struct xe_gt *gt = &m->tile->primary_gt;
 	u32 flush_flags = 0;
 
 	if (xe_device_has_flat_ccs(gt_to_xe(gt)) && !copy_ccs && dst_is_vram) {
@@ -604,7 +601,7 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m,
 				  struct ttm_resource *src,
 				  struct ttm_resource *dst)
 {
-	struct xe_gt *gt = m->gt;
+	struct xe_gt *gt = &m->tile->primary_gt;
 	struct xe_device *xe = gt_to_xe(gt);
 	struct dma_fence *fence = NULL;
 	u64 size = bo->size;
@@ -856,7 +853,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
 				   struct ttm_resource *dst)
 {
 	bool clear_vram = mem_type_is_vram(dst->mem_type);
-	struct xe_gt *gt = m->gt;
+	struct xe_gt *gt = &m->tile->primary_gt;
 	struct xe_device *xe = gt_to_xe(gt);
 	struct dma_fence *fence = NULL;
 	u64 size = bo->size;
@@ -1063,7 +1060,7 @@ xe_migrate_update_pgtables_cpu(struct xe_migrate *m,
 	for (i = 0; i < num_updates; i++) {
 		const struct xe_vm_pgtable_update *update = &updates[i];
 
-		ops->populate(pt_update, gt_to_tile(m->gt), &update->pt_bo->vmap, NULL,
+		ops->populate(pt_update, m->tile, &update->pt_bo->vmap, NULL,
 			      update->ofs, update->qwords, update);
 	}
 
@@ -1130,9 +1127,9 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
 			   struct xe_migrate_pt_update *pt_update)
 {
 	const struct xe_migrate_pt_update_ops *ops = pt_update->ops;
-	struct xe_gt *gt = m->gt;
-	struct xe_tile *tile = gt_to_tile(m->gt);
-	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_tile *tile = m->tile;
+	struct xe_gt *gt = &tile->primary_gt;
+	struct xe_device *xe = tile_to_xe(tile);
 	struct xe_sched_job *job;
 	struct dma_fence *fence;
 	struct drm_suballoc *sa_bo = NULL;
diff --git a/drivers/gpu/drm/xe/xe_migrate.h b/drivers/gpu/drm/xe/xe_migrate.h
index e07b2a8845c0..df68a8976194 100644
--- a/drivers/gpu/drm/xe/xe_migrate.h
+++ b/drivers/gpu/drm/xe/xe_migrate.h
@@ -71,7 +71,7 @@ struct xe_migrate_pt_update {
 	struct xe_vma *vma;
 };
 
-struct xe_migrate *xe_migrate_init(struct xe_gt *gt);
+struct xe_migrate *xe_migrate_init(struct xe_tile *tile);
 
 struct dma_fence *xe_migrate_copy(struct xe_migrate *m,
 				  struct xe_bo *bo,
@@ -96,5 +96,5 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
 
 void xe_migrate_wait(struct xe_migrate *m);
 
-struct xe_engine *xe_gt_migrate_engine(struct xe_gt *gt);
+struct xe_engine *xe_tile_migrate_engine(struct xe_tile *tile);
 #endif
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index ea68e6b38133..a606cd1a7e3a 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -1305,7 +1305,7 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_engine *e,
 			return ERR_PTR(-ENOMEM);
 	}
 
-	fence = xe_migrate_update_pgtables(tile->primary_gt.migrate,
+	fence = xe_migrate_update_pgtables(tile->migrate,
 					   vm, vma->bo,
 					   e ? e : vm->eng[tile->id],
 					   entries, num_entries,
@@ -1626,7 +1626,7 @@ __xe_pt_unbind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_engine *e
 	 * clear again here. The eviction may have updated pagetables at a
 	 * lower level, because it needs to be more conservative.
 	 */
-	fence = xe_migrate_update_pgtables(tile->primary_gt.migrate,
+	fence = xe_migrate_update_pgtables(tile->migrate,
 					   vm, NULL, e ? e :
 					   vm->eng[tile->id],
 					   entries, num_entries,
diff --git a/drivers/gpu/drm/xe/xe_tile.c b/drivers/gpu/drm/xe/xe_tile.c
index c322e7a7b677..996d52b28562 100644
--- a/drivers/gpu/drm/xe/xe_tile.c
+++ b/drivers/gpu/drm/xe/xe_tile.c
@@ -7,6 +7,7 @@
 
 #include "xe_device.h"
 #include "xe_ggtt.h"
+#include "xe_migrate.h"
 #include "xe_sa.h"
 #include "xe_tile.h"
 #include "xe_ttm_vram_mgr.h"
@@ -72,10 +73,17 @@ int xe_tile_init_noalloc(struct xe_tile *tile)
 		goto err_mem_access;
 
 	tile->mem.kernel_bb_pool = xe_sa_bo_manager_init(tile, SZ_1M, 16);
-	if (IS_ERR(tile->mem.kernel_bb_pool))
+	if (IS_ERR(tile->mem.kernel_bb_pool)) {
 		err = PTR_ERR(tile->mem.kernel_bb_pool);
+		goto err_mem_access;
+	}
 
 err_mem_access:
 	xe_device_mem_access_put(tile_to_xe(tile));
 	return err;
 }
+
+void xe_tile_migrate_wait(struct xe_tile *tile)
+{
+	xe_migrate_wait(tile->migrate);
+}
diff --git a/drivers/gpu/drm/xe/xe_tile.h b/drivers/gpu/drm/xe/xe_tile.h
index 49b64d83ce91..80c13ebfe3bf 100644
--- a/drivers/gpu/drm/xe/xe_tile.h
+++ b/drivers/gpu/drm/xe/xe_tile.h
@@ -11,4 +11,6 @@ struct xe_tile;
 int xe_tile_alloc(struct xe_tile *tile);
 int xe_tile_init_noalloc(struct xe_tile *tile);
 
+void xe_tile_migrate_wait(struct xe_tile *tile);
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 632f7538a6d5..6beb73b40dca 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -1207,7 +1207,7 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 			if (!vm->pt_root[id])
 				continue;
 
-			migrate_vm = xe_migrate_get_vm(gt->migrate);
+			migrate_vm = xe_migrate_get_vm(tile->migrate);
 			eng = xe_engine_create_class(xe, gt, migrate_vm,
 						     XE_ENGINE_CLASS_COPY,
 						     ENGINE_FLAG_VM);
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index c45c5daeeaa7..76af6ac0fa84 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -179,7 +179,7 @@ struct xe_vm {
 #define XE_VM_FLAG_SCRATCH_PAGE		BIT(4)
 #define XE_VM_FLAG_FAULT_MODE		BIT(5)
 #define XE_VM_FLAG_GT_ID(flags)		(((flags) >> 6) & 0x3)
-#define XE_VM_FLAG_SET_GT_ID(gt)	((gt)->info.id << 6)
+#define XE_VM_FLAG_SET_TILE_ID(tile)	((tile)->id << 6)
 	unsigned long flags;
 
 	/** @composite_fence_ctx: context composite fence */
-- 
2.40.0


* [Intel-xe] [PATCH 09/26] drm/xe: Clarify 'gt' retrieval for primary tile
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (7 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 08/26] drm/xe: Move migration from GT to tile Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-17  5:07   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 10/26] drm/xe: Drop vram_id Matt Roper
                   ` (25 subsequent siblings)
  34 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

There are a bunch of places in the driver where we need to perform
non-GT MMIO against the platform's primary tile (display code, top-level
interrupt enable/disable, driver initialization, etc.).  Rename
'to_gt()' to 'xe_primary_mmio_gt()' to clarify that we're trying to get
a primary MMIO handle for these top-level operations.

In the future we need to move away from xe_gt as the target for MMIO
operations (most of which are completely unrelated to GT).
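
A minimal sketch of what the renamed helper resolves to, using simplified
stand-in structures. The `tiles[]` array layout, the placeholder value of
XE_MAX_TILES_PER_DEVICE, and the body of xe_device_get_root_tile() are
assumptions for illustration; only the body of xe_primary_mmio_gt() is taken
from the patch.

```c
/* Placeholder value; the real constant is defined by the driver. */
#define XE_MAX_TILES_PER_DEVICE 2

/* Simplified stand-ins for the driver structures. */
struct xe_gt {
	int id;
};

struct xe_tile {
	struct xe_gt primary_gt;
};

struct xe_device {
	struct xe_tile tiles[XE_MAX_TILES_PER_DEVICE];
};

/* Assumed implementation: the root tile is the first tile. */
static inline struct xe_tile *xe_device_get_root_tile(struct xe_device *xe)
{
	return &xe->tiles[0];
}

/* Same body as in the patch: the primary GT of the root tile. */
static inline struct xe_gt *xe_primary_mmio_gt(struct xe_device *xe)
{
	return &xe_device_get_root_tile(xe)->primary_gt;
}
```

Callers such as the top-level interrupt handlers then use the returned GT
purely as an MMIO handle, which is what the new name is meant to convey.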

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/compat-i915-headers/intel_uncore.h | 2 +-
 drivers/gpu/drm/xe/xe_device.c                        | 2 +-
 drivers/gpu/drm/xe/xe_device.h                        | 9 +++++++--
 drivers/gpu/drm/xe/xe_irq.c                           | 6 +++---
 drivers/gpu/drm/xe/xe_mmio.c                          | 8 ++++----
 drivers/gpu/drm/xe/xe_query.c                         | 2 +-
 drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c                | 4 ++--
 7 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/xe/compat-i915-headers/intel_uncore.h b/drivers/gpu/drm/xe/compat-i915-headers/intel_uncore.h
index 14f195fe275d..6eff72311773 100644
--- a/drivers/gpu/drm/xe/compat-i915-headers/intel_uncore.h
+++ b/drivers/gpu/drm/xe/compat-i915-headers/intel_uncore.h
@@ -14,7 +14,7 @@ static inline struct xe_gt *__fake_uncore_to_gt(struct fake_uncore *uncore)
 {
 	struct xe_device *xe = container_of(uncore, struct xe_device, uncore);
 
-	return to_gt(xe);
+	return xe_primary_mmio_gt(xe);
 }
 
 static inline u32 intel_uncore_read(struct fake_uncore *uncore,
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 038074a90584..c93c8895862f 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -397,7 +397,7 @@ static void device_kill_persistent_engines(struct xe_device *xe,
 
 void xe_device_wmb(struct xe_device *xe)
 {
-	struct xe_gt *gt = xe_device_get_gt(xe, 0);
+	struct xe_gt *gt = xe_primary_mmio_gt(xe);
 
 	wmb();
 	if (IS_DGFX(xe))
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 745dbb16d417..fc2655484dfd 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -66,9 +66,14 @@ static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
 }
 
 /*
- * FIXME: Placeholder until multi-gt lands. Once that lands, kill this function.
+ * Provide a GT structure suitable for performing non-GT MMIO operations against
+ * the primary tile.  Primarily intended for early tile initialization, display
+ * handling, top-most interrupt enable/disable, etc.
+ *
+ * FIXME: Fix the driver design so that 'gt' isn't the target of all MMIO
+ * operations.
  */
-static inline struct xe_gt *to_gt(struct xe_device *xe)
+static inline struct xe_gt *xe_primary_mmio_gt(struct xe_device *xe)
 {
 	return &xe_device_get_root_tile(xe)->primary_gt;
 }
diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
index 5be31855d789..806121009102 100644
--- a/drivers/gpu/drm/xe/xe_irq.c
+++ b/drivers/gpu/drm/xe/xe_irq.c
@@ -285,7 +285,7 @@ static void gt_irq_handler(struct xe_device *xe, struct xe_gt *gt,
 static irqreturn_t xelp_irq_handler(int irq, void *arg)
 {
 	struct xe_device *xe = arg;
-	struct xe_gt *gt = xe_device_get_gt(xe, 0);	/* Only 1 GT here */
+	struct xe_gt *gt = xe_primary_mmio_gt(xe);
 	u32 master_ctl, gu_misc_iir;
 	long unsigned int intr_dw[2];
 	u32 identity[32];
@@ -311,7 +311,7 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
 
 static u32 dg1_intr_disable(struct xe_device *xe)
 {
-	struct xe_gt *gt = xe_device_get_gt(xe, 0);
+	struct xe_gt *gt = xe_primary_mmio_gt(xe);
 	u32 val;
 
 	/* First disable interrupts */
@@ -329,7 +329,7 @@ static u32 dg1_intr_disable(struct xe_device *xe)
 
 static void dg1_intr_enable(struct xe_device *xe, bool stall)
 {
-	struct xe_gt *gt = xe_device_get_gt(xe, 0);
+	struct xe_gt *gt = xe_primary_mmio_gt(xe);
 
 	xe_mmio_write32(gt, DG1_MSTR_TILE_INTR, DG1_MSTR_IRQ);
 	if (stall)
diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
index 17b3a9880409..90423c8c7e76 100644
--- a/drivers/gpu/drm/xe/xe_mmio.c
+++ b/drivers/gpu/drm/xe/xe_mmio.c
@@ -150,7 +150,7 @@ static bool xe_pci_resource_valid(struct pci_dev *pdev, int bar)
 
 int xe_mmio_total_vram_size(struct xe_device *xe, u64 *vram_size, u64 *usable_size)
 {
-	struct xe_gt *gt = xe_device_get_gt(xe, 0);
+	struct xe_gt *gt = xe_primary_mmio_gt(xe);
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
 	int err;
 	u32 reg_val;
@@ -287,7 +287,7 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 
 static void xe_mmio_probe_tiles(struct xe_device *xe)
 {
-	struct xe_gt *gt = xe_device_get_gt(xe, 0);
+	struct xe_gt *gt = xe_primary_mmio_gt(xe);
 	u32 mtcfg;
 	u8 adj_tile_count;
 	u8 id;
@@ -339,7 +339,7 @@ static void mmio_fini(struct drm_device *drm, void *arg)
 int xe_mmio_init(struct xe_device *xe)
 {
 	struct xe_tile *root_tile = xe_device_get_root_tile(xe);
-	struct xe_gt *gt = xe_device_get_gt(xe, 0);
+	struct xe_gt *gt = xe_primary_mmio_gt(xe);
 	const int mmio_bar = 0;
 	int err;
 
@@ -398,7 +398,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
 		  struct drm_file *file)
 {
 	struct xe_device *xe = to_xe_device(dev);
-	struct xe_gt *gt = xe_device_get_gt(xe, 0);
+	struct xe_gt *gt = xe_primary_mmio_gt(xe);
 	struct drm_xe_mmio *args = data;
 	unsigned int bits_flag, bytes;
 	struct xe_reg reg;
diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index c81652d7f4ec..4d8473328962 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -259,7 +259,7 @@ static int query_gts(struct xe_device *xe, struct drm_xe_device_query *query)
 static int query_hwconfig(struct xe_device *xe,
 			  struct drm_xe_device_query *query)
 {
-	struct xe_gt *gt = xe_device_get_gt(xe, 0);
+	struct xe_gt *gt = xe_primary_mmio_gt(xe);
 	size_t size = xe_guc_hwconfig_size(&gt->uc.guc);
 	void __user *query_ptr = u64_to_user_ptr(query->data);
 	void *hwconfig;
diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
index a3855870321f..6d12e5dd9981 100644
--- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
@@ -54,7 +54,7 @@ bool xe_ttm_stolen_cpu_access_needs_ggtt(struct xe_device *xe)
 static s64 detect_bar2_dgfx(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr)
 {
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
-	struct xe_gt *gt = to_gt(xe);
+	struct xe_gt *gt = xe_primary_mmio_gt(xe);
 	u64 vram_size, stolen_size;
 	int err;
 
@@ -88,7 +88,7 @@ static u32 detect_bar2_integrated(struct xe_device *xe, struct xe_ttm_stolen_mgr
 	u32 stolen_size;
 	u32 ggc, gms;
 
-	ggc = xe_mmio_read32(to_gt(xe), GGC);
+	ggc = xe_mmio_read32(xe_primary_mmio_gt(xe), GGC);
 
 	/* check GGMS, should be fixed 0x3 (8MB) */
 	if (drm_WARN_ON(&xe->drm, (ggc & GGMS_MASK) != GGMS_MASK))
-- 
2.40.0



* [Intel-xe] [PATCH 10/26] drm/xe: Drop vram_id
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (8 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 09/26] drm/xe: Clarify 'gt' retrieval for primary tile Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-17  5:09   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 11/26] drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE Matt Roper
                   ` (24 subsequent siblings)
  34 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

The VRAM ID is always the tile ID; there's no need to track it
separately within a GT.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/tests/xe_bo.c | 6 +++---
 drivers/gpu/drm/xe/xe_pci.c      | 2 --
 2 files changed, 3 insertions(+), 5 deletions(-)
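
A minimal userspace sketch of the replacement expression, not the driver
code itself: the VRAM placement flag is derived from the ID of the tile
that owns the GT. The mock_* types and the value of
MOCK_BO_CREATE_VRAM0_BIT are placeholders assumed for illustration; the
real flag lives in the driver's BO header.

```c
#include <stdint.h>

/* Placeholder value (assumption); the real flag is defined by the driver. */
#define MOCK_BO_CREATE_VRAM0_BIT (1u << 1)

/* Hypothetical stand-ins for struct xe_tile and struct xe_gt. */
struct mock_tile {
	uint8_t id;
};

struct mock_gt {
	struct mock_tile *tile;
};

/* Mirrors the role of gt_to_tile(): return the tile that owns this GT. */
static inline struct mock_tile *mock_gt_to_tile(const struct mock_gt *gt)
{
	return gt->tile;
}

/* The VRAM flag is the VRAM0 bit shifted by the owning tile's ID. */
static inline uint32_t mock_vram_bit_for_gt(const struct mock_gt *gt)
{
	return MOCK_BO_CREATE_VRAM0_BIT << mock_gt_to_tile(gt)->id;
}
```

With this shape, a GT on tile 1 selects the bit one position above the
VRAM0 bit without any per-GT vram_id field.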

diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c
index bee5a2031153..e4d1d17b1d3c 100644
--- a/drivers/gpu/drm/xe/tests/xe_bo.c
+++ b/drivers/gpu/drm/xe/tests/xe_bo.c
@@ -115,9 +115,9 @@ static void ccs_test_run_gt(struct xe_device *xe, struct xe_gt *gt,
 	int ret;
 
 	/* TODO: Sanity check */
-	vram_bit = XE_BO_CREATE_VRAM0_BIT << gt->info.vram_id;
+	vram_bit = XE_BO_CREATE_VRAM0_BIT << gt_to_tile(gt)->id;
 	kunit_info(test, "Testing gt id %u vram id %u\n", gt->info.id,
-		   gt->info.vram_id);
+		   gt_to_tile(gt)->id);
 
 	bo = xe_bo_create_locked(xe, NULL, NULL, SZ_1M, ttm_bo_type_device,
 				 vram_bit);
@@ -179,7 +179,7 @@ static int evict_test_run_gt(struct xe_device *xe, struct xe_gt *gt, struct kuni
 	int err, i;
 
 	kunit_info(test, "Testing device %s gt id %u vram id %u\n",
-		   dev_name(xe->drm.dev), gt->info.id, gt->info.vram_id);
+		   dev_name(xe->drm.dev), gt->info.id, gt_to_tile(gt)->id);
 
 	for (i = 0; i < 2; ++i) {
 		xe_vm_lock(vm, &ww, 0, false);
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index be7c41024838..0f3508c72c79 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -34,7 +34,6 @@ struct xe_subplatform_desc {
 
 struct xe_gt_desc {
 	enum xe_gt_type type;
-	u8 vram_id;
 	u32 mmio_adj_limit;
 	u32 mmio_adj_offset;
 };
@@ -258,7 +257,6 @@ static const struct xe_device_desc dg2_desc = {
 static const struct xe_gt_desc pvc_gts[] = {
 	{
 		.type = XE_GT_TYPE_REMOTE,
-		.vram_id = 1,
 		.mmio_adj_limit = 0,
 		.mmio_adj_offset = 0,
 	},
-- 
2.40.0



* [Intel-xe] [PATCH 11/26] drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (9 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 10/26] drm/xe: Drop vram_id Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-17  5:14   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 12/26] drm/xe: Allocate GT dynamically Matt Roper
                   ` (23 subsequent siblings)
  34 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

Now that tiles and GTs are handled separately, extra_gts[] doesn't
really provide any useful information that we can't just infer directly.
The primary GT of the root tile and the remote tiles behave the same way
and don't need independent handling.

When we re-add support for media GTs in a future patch, the presence of
media can be determined from MEDIA_VER() (i.e., >= 13) and media's GSI
offset handling is expected to remain constant for all foreseeable future
platforms, so it won't need to be provided in a definition structure
either.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_gt_types.h |  1 -
 drivers/gpu/drm/xe/xe_pci.c      | 37 ++++++--------------------------
 2 files changed, 7 insertions(+), 31 deletions(-)
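
As a rough illustration of the inference described above, the sketch
below is a standalone userspace mock (not driver code) of how the primary
GT's engine mask can be derived from the media version alone. The mock_*
helper names and STANDALONE_MEDIA_VER are assumptions made for this
example; only the ">= 13" threshold comes from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Media version at which media moves to its own GT, per the message above. */
#define STANDALONE_MEDIA_VER 13

/* A separate media GT is expected only from this media version onward. */
static inline bool mock_has_media_gt(unsigned int media_ver)
{
	return media_ver >= STANDALONE_MEDIA_VER;
}

/*
 * Engine mask for the primary GT: on older media versions the media
 * engines live in the primary GT, so their mask is folded in.  Pass a
 * media_mask of 0 when the platform has no media description.
 */
static inline uint64_t mock_primary_engine_mask(uint64_t graphics_mask,
						uint64_t media_mask,
						unsigned int media_ver)
{
	if (!mock_has_media_gt(media_ver))
		return graphics_mask | media_mask;

	return graphics_mask;
}
```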

diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
index 8a5f9122ba80..5e0bfb21ae1c 100644
--- a/drivers/gpu/drm/xe/xe_gt_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_types.h
@@ -20,7 +20,6 @@ struct xe_ring_ops;
 enum xe_gt_type {
 	XE_GT_TYPE_UNINITIALIZED,
 	XE_GT_TYPE_MAIN,
-	XE_GT_TYPE_REMOTE,
 	XE_GT_TYPE_MEDIA,
 };
 
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index 0f3508c72c79..bfdc9563e54f 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -46,7 +46,6 @@ struct xe_device_desc {
 
 	const char *platform_name;
 	const struct xe_subplatform_desc *subplatforms;
-	const struct xe_gt_desc *extra_gts;
 
 	enum xe_platform platform;
 
@@ -254,20 +253,11 @@ static const struct xe_device_desc dg2_desc = {
 	DG2_FEATURES,
 };
 
-static const struct xe_gt_desc pvc_gts[] = {
-	{
-		.type = XE_GT_TYPE_REMOTE,
-		.mmio_adj_limit = 0,
-		.mmio_adj_offset = 0,
-	},
-};
-
 static const struct xe_device_desc pvc_desc = {
 	.graphics = &graphics_xehpc,
 	DGFX_FEATURES,
 	PLATFORM(XE_PVC),
 	.require_force_probe = true,
-	.extra_gts = pvc_gts,
 };
 
 static const struct xe_device_desc mtl_desc = {
@@ -531,26 +521,13 @@ static int xe_info_init(struct xe_device *xe,
 		gt->info.id = id;
 		gt->tile = tile;
 
-		gt->info.id = id;
-		if (id == 0) {
-			gt->info.type = XE_GT_TYPE_MAIN;
-
-			gt->info.__engine_mask = graphics_desc->hw_engine_mask;
-			if (MEDIA_VER(xe) < 13 && media_desc)
-				gt->info.__engine_mask |= media_desc->hw_engine_mask;
-
-			gt->mmio.adj_limit = 0;
-			gt->mmio.adj_offset = 0;
-		} else {
-			gt->info.type = desc->extra_gts[id - 1].type;
-			gt->info.__engine_mask = xe_gt_is_media_type(gt) ?
-				media_desc->hw_engine_mask :
-				graphics_desc->hw_engine_mask;
-			gt->mmio.adj_limit =
-				desc->extra_gts[id - 1].mmio_adj_limit;
-			gt->mmio.adj_offset =
-				desc->extra_gts[id - 1].mmio_adj_offset;
-		}
+		gt->info.id = id;	/* FIXME: Determine sensible numbering */
+		gt->info.type = XE_GT_TYPE_MAIN;
+		gt->info.__engine_mask = graphics_desc->hw_engine_mask;
+		if (MEDIA_VER(xe) < 13 && media_desc)
+			gt->info.__engine_mask |= media_desc->hw_engine_mask;
+
+		/* TODO: Init media GT, if present */
 	}
 
 	return 0;
-- 
2.40.0



* [Intel-xe] [PATCH 12/26] drm/xe: Allocate GT dynamically
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (10 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 11/26] drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-17  5:23   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 13/26] drm/xe: Add media GT to tile Matt Roper
                   ` (22 subsequent siblings)
  34 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

In preparation for re-adding media GT support, switch the primary GT
within the tile to a dynamic allocation.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c       |  4 ----
 drivers/gpu/drm/xe/xe_device.h       |  8 ++++++--
 drivers/gpu/drm/xe/xe_device_types.h |  2 +-
 drivers/gpu/drm/xe/xe_ggtt.c         |  2 +-
 drivers/gpu/drm/xe/xe_gt.c           | 11 ++++++++---
 drivers/gpu/drm/xe/xe_gt.h           |  2 +-
 drivers/gpu/drm/xe/xe_migrate.c      | 12 ++++++------
 drivers/gpu/drm/xe/xe_pci.c          |  7 +++++--
 drivers/gpu/drm/xe/xe_pt.c           |  4 ++--
 drivers/gpu/drm/xe/xe_vm.c           |  6 +++---
 10 files changed, 33 insertions(+), 25 deletions(-)
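
The calling convention this patch moves to (the allocator returns a
pointer, with errors encoded in the pointer value) can be sketched in
userspace as follows. This is a mock under stated assumptions: the mock_*
helpers imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() and calloc()
stands in for the managed allocation. Note that an allocator reporting
failure with NULL has to be checked with a NULL test and converted, since
NULL is not an error pointer.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MOCK_MAX_ERRNO 4095

/* Userspace imitations of ERR_PTR(), PTR_ERR() and IS_ERR(). */
static inline void *mock_err_ptr(long err)
{
	return (void *)err;
}

static inline long mock_ptr_err(const void *ptr)
{
	return (long)ptr;
}

static inline int mock_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MOCK_MAX_ERRNO;
}

struct mock_gt;

/* Hypothetical stand-ins for struct xe_tile and struct xe_gt. */
struct mock_tile {
	int id;
	struct mock_gt *primary_gt;
};

struct mock_gt {
	struct mock_tile *tile;
};

/* Allocate a GT for a tile; NULL from the allocator becomes -ENOMEM. */
static struct mock_gt *mock_gt_alloc(struct mock_tile *tile)
{
	struct mock_gt *gt = calloc(1, sizeof(*gt));

	if (!gt)
		return mock_err_ptr(-ENOMEM);

	gt->tile = tile;
	return gt;
}

/* Caller side: store the pointer, then translate an error pointer to errno. */
static int mock_tile_init(struct mock_tile *tile)
{
	tile->primary_gt = mock_gt_alloc(tile);
	if (mock_is_err(tile->primary_gt))
		return (int)mock_ptr_err(tile->primary_gt);

	return 0;
}
```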

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index c93c8895862f..b6fecee68cc6 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -254,10 +254,6 @@ int xe_device_probe(struct xe_device *xe)
 		err = xe_tile_alloc(tile);
 		if (err)
 			return err;
-
-		err = xe_gt_alloc(xe, &tile->primary_gt);
-		if (err)
-			return err;
 	}
 
 	err = xe_mmio_init(xe);
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index fc2655484dfd..370b9ccb875b 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -58,7 +58,11 @@ static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
 	struct xe_gt *gt;
 
 	XE_BUG_ON(gt_id > XE_MAX_TILES_PER_DEVICE);
-	gt = &xe->tiles[gt_id].primary_gt;
+
+	gt = xe->tiles[gt_id].primary_gt;
+	if (drm_WARN_ON(&xe->drm, !gt))
+		return NULL;
+
 	XE_BUG_ON(gt->info.id != gt_id);
 	XE_BUG_ON(gt->info.type == XE_GT_TYPE_UNINITIALIZED);
 
@@ -75,7 +79,7 @@ static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
  */
 static inline struct xe_gt *xe_primary_mmio_gt(struct xe_device *xe)
 {
-	return &xe_device_get_root_tile(xe)->primary_gt;
+	return xe_device_get_root_tile(xe)->primary_gt;
 }
 
 static inline bool xe_device_guc_submission_enabled(struct xe_device *xe)
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index fa76750a9a5f..1033f233f6ab 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -79,7 +79,7 @@ struct xe_tile {
 	/**
 	 * @primary_gt: Primary GT
 	 */
-	struct xe_gt primary_gt;
+	struct xe_gt *primary_gt;
 
 	/* TODO: Add media GT here */
 
diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index b11f22b68bb8..7c87623ef5c5 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -194,7 +194,7 @@ void xe_ggtt_invalidate(struct xe_ggtt *ggtt)
 	 * TODO: Loop over each GT in tile once media GT support is
 	 * re-added
 	 */
-	struct xe_gt *gt = &ggtt->tile->primary_gt;
+	struct xe_gt *gt = ggtt->tile->primary_gt;
 
 	/* TODO: vfunc for GuC vs. non-GuC */
 
diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 297ee32ad928..20663cd0ddaf 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -42,13 +42,18 @@
 #include "xe_wa.h"
 #include "xe_wopcm.h"
 
-int xe_gt_alloc(struct xe_device *xe, struct xe_gt *gt)
+struct xe_gt *xe_gt_alloc(struct xe_tile *tile)
 {
-	XE_BUG_ON(gt->info.type == XE_GT_TYPE_UNINITIALIZED);
+	struct xe_gt *gt;
 
+	gt = drmm_kzalloc(&tile_to_xe(tile)->drm, sizeof(*gt), GFP_KERNEL);
+	if (!gt)
+		return ERR_PTR(-ENOMEM);
+
+	gt->tile = tile;
 	gt->ordered_wq = alloc_ordered_workqueue("gt-ordered-wq", 0);
 
-	return 0;
+	return gt;
 }
 
 void xe_gt_sanitize(struct xe_gt *gt)
diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
index c8abbeb0fb96..abcefd8cde78 100644
--- a/drivers/gpu/drm/xe/xe_gt.h
+++ b/drivers/gpu/drm/xe/xe_gt.h
@@ -16,7 +16,7 @@
 	     for_each_if (((hwe__) = (gt__)->hw_engines + (id__)) && \
 			  xe_hw_engine_is_valid((hwe__)))
 
-int xe_gt_alloc(struct xe_device *xe, struct xe_gt *gt);
+struct xe_gt *xe_gt_alloc(struct xe_tile *tile);
 int xe_gt_init_early(struct xe_gt *gt);
 int xe_gt_init_noalloc(struct xe_gt *gt);
 int xe_gt_init(struct xe_gt *gt);
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index c3a37109bc64..8a4fd80a7fde 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -229,7 +229,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
 		m->batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
 
 		if (xe->info.supports_usm) {
-			batch = tile->primary_gt.usm.bb_pool->bo;
+			batch = tile->primary_gt->usm.bb_pool->bo;
 			batch_addr = xe_bo_addr(batch, 0, XE_PAGE_SIZE,
 						&is_vram);
 			m->usm_batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
@@ -313,7 +313,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
 struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
 {
 	struct xe_device *xe = tile_to_xe(tile);
-	struct xe_gt *primary_gt = &tile->primary_gt;
+	struct xe_gt *primary_gt = tile->primary_gt;
 	struct xe_migrate *m;
 	struct xe_vm *vm;
 	struct ww_acquire_ctx ww;
@@ -546,7 +546,7 @@ static u32 xe_migrate_ccs_copy(struct xe_migrate *m,
 			       u64 dst_ofs, bool dst_is_vram, u32 dst_size,
 			       u64 ccs_ofs, bool copy_ccs)
 {
-	struct xe_gt *gt = &m->tile->primary_gt;
+	struct xe_gt *gt = m->tile->primary_gt;
 	u32 flush_flags = 0;
 
 	if (xe_device_has_flat_ccs(gt_to_xe(gt)) && !copy_ccs && dst_is_vram) {
@@ -601,7 +601,7 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m,
 				  struct ttm_resource *src,
 				  struct ttm_resource *dst)
 {
-	struct xe_gt *gt = &m->tile->primary_gt;
+	struct xe_gt *gt = m->tile->primary_gt;
 	struct xe_device *xe = gt_to_xe(gt);
 	struct dma_fence *fence = NULL;
 	u64 size = bo->size;
@@ -853,7 +853,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
 				   struct ttm_resource *dst)
 {
 	bool clear_vram = mem_type_is_vram(dst->mem_type);
-	struct xe_gt *gt = &m->tile->primary_gt;
+	struct xe_gt *gt = m->tile->primary_gt;
 	struct xe_device *xe = gt_to_xe(gt);
 	struct dma_fence *fence = NULL;
 	u64 size = bo->size;
@@ -1128,7 +1128,7 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
 {
 	const struct xe_migrate_pt_update_ops *ops = pt_update->ops;
 	struct xe_tile *tile = m->tile;
-	struct xe_gt *gt = &tile->primary_gt;
+	struct xe_gt *gt = tile->primary_gt;
 	struct xe_device *xe = tile_to_xe(tile);
 	struct xe_sched_job *job;
 	struct dma_fence *fence;
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index bfdc9563e54f..7d5e65d34f39 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -517,9 +517,12 @@ static int xe_info_init(struct xe_device *xe,
 		tile->xe = xe;
 		tile->id = id;
 
-		gt = &tile->primary_gt;
+		tile->primary_gt = xe_gt_alloc(tile);
+		if (IS_ERR(tile->primary_gt))
+			return PTR_ERR(tile->primary_gt);
+
+		gt = tile->primary_gt;
 		gt->info.id = id;
-		gt->tile = tile;
 
 		gt->info.id = id;	/* FIXME: Determine sensible numbering */
 		gt->info.type = XE_GT_TYPE_MAIN;
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index a606cd1a7e3a..60e4a97c78fb 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -1316,7 +1316,7 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_engine *e,
 
 		/* TLB invalidation must be done before signaling rebind */
 		if (rebind && !xe_vm_no_dma_fences(vma->vm)) {
-			int err = invalidation_fence_init(&tile->primary_gt, ifence, fence,
+			int err = invalidation_fence_init(tile->primary_gt, ifence, fence,
 							  vma);
 			if (err) {
 				dma_fence_put(fence);
@@ -1636,7 +1636,7 @@ __xe_pt_unbind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_engine *e
 		int err;
 
 		/* TLB invalidation must be done before signaling unbind */
-		err = invalidation_fence_init(&tile->primary_gt, ifence, fence, vma);
+		err = invalidation_fence_init(tile->primary_gt, ifence, fence, vma);
 		if (err) {
 			dma_fence_put(fence);
 			kfree(ifence);
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 6beb73b40dca..cbbc809ae4e4 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -1200,7 +1200,7 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
 	/* Kernel migration VM shouldn't have a circular loop.. */
 	if (!(flags & XE_VM_FLAG_MIGRATION)) {
 		for_each_tile(tile, xe, id) {
-			struct xe_gt *gt = &tile->primary_gt;
+			struct xe_gt *gt = tile->primary_gt;
 			struct xe_vm *migrate_vm;
 			struct xe_engine *eng;
 
@@ -3368,7 +3368,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
 			 * FIXME: We potentially need to invalidate multiple
 			 * GTs within the tile
 			 */
-			seqno[id] = xe_gt_tlb_invalidation_vma(&tile->primary_gt, NULL, vma);
+			seqno[id] = xe_gt_tlb_invalidation_vma(tile->primary_gt, NULL, vma);
 			if (seqno[id] < 0)
 				return seqno[id];
 		}
@@ -3376,7 +3376,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
 
 	for_each_tile(tile, xe, id) {
 		if (tile_needs_invalidate & BIT(id)) {
-			ret = xe_gt_tlb_invalidation_wait(&tile->primary_gt, seqno[id]);
+			ret = xe_gt_tlb_invalidation_wait(tile->primary_gt, seqno[id]);
 			if (ret < 0)
 				return ret;
 		}
-- 
2.40.0



* [Intel-xe] [PATCH 13/26] drm/xe: Add media GT to tile
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (11 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 12/26] drm/xe: Allocate GT dynamically Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-18 17:50   ` Rodrigo Vivi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 14/26] drm/xe: Move display IRQ postinstall out of GT function Matt Roper
                   ` (21 subsequent siblings)
  34 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

This media_gt pointer isn't actually allocated yet.  Future patches will
start hooking it up at appropriate places in the code, and then creation
of the media GT will be added once those infrastructure changes are in
place.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_device_types.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 1033f233f6ab..2cf67ea57aac 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -81,7 +81,12 @@ struct xe_tile {
 	 */
 	struct xe_gt *primary_gt;
 
-	/* TODO: Add media GT here */
+	/**
+	 * @media_gt: Media GT
+	 *
+	 * Only present on devices with media version >= 13.
+	 */
+	struct xe_gt *media_gt;
 
 	/**
 	 * @mmio: MMIO info for a tile.
-- 
2.40.0



* [Intel-xe] [PATCH 14/26] drm/xe: Move display IRQ postinstall out of GT function
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (12 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 13/26] drm/xe: Add media GT to tile Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-18 17:51   ` Rodrigo Vivi
  2023-05-18 18:20   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 15/26] drm/xe: Interrupts are delivered per-tile, not per-GT Matt Roper
                   ` (20 subsequent siblings)
  34 siblings, 2 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

Display interrupts are unrelated to the GT (and are also only relevant
to the root tile).  Move the postinstall call up a level in the
callstack.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_irq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
index 806121009102..494ec5567e50 100644
--- a/drivers/gpu/drm/xe/xe_irq.c
+++ b/drivers/gpu/drm/xe/xe_irq.c
@@ -490,8 +490,6 @@ void xe_gt_irq_postinstall(struct xe_gt *gt)
 		dg1_irq_postinstall(xe, gt);
 	else
 		xelp_irq_postinstall(xe, gt);
-
-	xe_display_irq_postinstall(xe, gt);
 }
 
 static void xe_irq_postinstall(struct xe_device *xe)
@@ -501,6 +499,8 @@ static void xe_irq_postinstall(struct xe_device *xe)
 
 	for_each_gt(gt, xe, id)
 		xe_gt_irq_postinstall(gt);
+
+	xe_display_irq_postinstall(xe, xe_primary_mmio_gt(xe));
 }
 
 static irq_handler_t xe_irq_handler(struct xe_device *xe)
-- 
2.40.0



* [Intel-xe] [PATCH 15/26] drm/xe: Interrupts are delivered per-tile, not per-GT
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (13 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 14/26] drm/xe: Move display IRQ postinstall out of GT function Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-11 12:14   ` Iddamsetty, Aravind
  2023-05-18 18:30   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 16/26] drm/xe/irq: Handle ASLE backlight interrupts at same time as display Matt Roper
                   ` (19 subsequent siblings)
  34 siblings, 2 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

IRQ delivery and handling need to be done on a per-tile basis.  Note
that this is true even for the "GT interrupts" relating to engines and
GuCs --- the interrupts relating to both GTs get raised through a single
set of registers in the tile's sgunit range.

The (mis)use of struct xe_gt as a target for MMIO operations in the
driver makes the code somewhat confusing since we wind up needing a GT
pointer to handle programming that's unrelated to the GT.  To mitigate
this confusion, all of the xe_gt structures used solely as an MMIO
target in interrupt code are renamed to 'mmio.'  Reworking the driver's
MMIO handling to not be dependent on xe_gt is planned as a future
update.

Note that GT initialization code currently calls xe_gt_irq_postinstall()
in an attempt to enable the HWE interrupts for the GT being initialized.
Unfortunately xe_gt_irq_postinstall() doesn't really match its name and
does a bunch of other stuff unrelated to the GT interrupts (such as
enabling the top-level device interrupts).  That will be addressed in
future patches.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_gt.c  |   2 +-
 drivers/gpu/drm/xe/xe_irq.c | 334 ++++++++++++++++++++----------------
 drivers/gpu/drm/xe/xe_irq.h |   4 +-
 3 files changed, 187 insertions(+), 153 deletions(-)
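
Because both GTs raise interrupts through the same per-tile registers,
the handler has to decide which GT an engine interrupt belongs to once
the identity has been read. The standalone mock below sketches the
routing that pick_engine_gt() performs in this patch. The mock_* types,
the enum values and MOCK_OTHER_MEDIA_GUC_INSTANCE are placeholders
assumed for illustration and do not reflect the driver's real
definitions.

```c
/* Engine classes named after those used in the patch; values are arbitrary. */
enum mock_engine_class {
	MOCK_CLASS_RENDER,
	MOCK_CLASS_VIDEO_DECODE,
	MOCK_CLASS_VIDEO_ENHANCE,
	MOCK_CLASS_COPY,
	MOCK_CLASS_OTHER,
	MOCK_CLASS_COMPUTE,
};

/* Placeholder instance number for the media GuC (assumption). */
#define MOCK_OTHER_MEDIA_GUC_INSTANCE 16

/* Hypothetical stand-ins for struct xe_gt and struct xe_tile. */
struct mock_gt {
	const char *name;
};

struct mock_tile {
	unsigned int media_ver;
	struct mock_gt *primary_gt;
	struct mock_gt *media_gt;
};

/* Route an engine interrupt to the GT that owns the engine. */
static struct mock_gt *mock_pick_engine_gt(const struct mock_tile *tile,
					   enum mock_engine_class class,
					   unsigned int instance)
{
	/* Before media version 13 there is no separate media GT. */
	if (tile->media_ver < 13)
		return tile->primary_gt;

	/* Video decode and enhance engines belong to the media GT. */
	if (class == MOCK_CLASS_VIDEO_DECODE ||
	    class == MOCK_CLASS_VIDEO_ENHANCE)
		return tile->media_gt;

	/* So does the media GuC, reported under the "other" class. */
	if (class == MOCK_CLASS_OTHER &&
	    instance == MOCK_OTHER_MEDIA_GUC_INSTANCE)
		return tile->media_gt;

	return tile->primary_gt;
}
```

Register access still goes through the tile's primary GT in every case;
only the dispatch of the decoded interrupt depends on the result.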

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 20663cd0ddaf..e00d260dff00 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -303,7 +303,7 @@ static int gt_fw_domain_init(struct xe_gt *gt)
 	gt->info.engine_mask = gt->info.__engine_mask;
 
 	/* Enables per hw engine IRQs */
-	xe_gt_irq_postinstall(gt);
+	xe_gt_irq_postinstall(gt_to_tile(gt));
 
 	/* Rerun MCR init as we now have hw engine list */
 	xe_gt_mcr_init(gt);
diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
index 494ec5567e50..fa7d04ba23c0 100644
--- a/drivers/gpu/drm/xe/xe_irq.c
+++ b/drivers/gpu/drm/xe/xe_irq.c
@@ -27,60 +27,66 @@
 #define IIR(offset)				XE_REG(offset + 0x8)
 #define IER(offset)				XE_REG(offset + 0xc)
 
-static void assert_iir_is_zero(struct xe_gt *gt, struct xe_reg reg)
+static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
 {
-	u32 val = xe_mmio_read32(gt, reg);
+	u32 val = xe_mmio_read32(mmio, reg);
 
 	if (val == 0)
 		return;
 
-	drm_WARN(&gt_to_xe(gt)->drm, 1,
+	drm_WARN(&gt_to_xe(mmio)->drm, 1,
 		 "Interrupt register 0x%x is not zero: 0x%08x\n",
 		 reg.addr, val);
-	xe_mmio_write32(gt, reg, 0xffffffff);
-	xe_mmio_read32(gt, reg);
-	xe_mmio_write32(gt, reg, 0xffffffff);
-	xe_mmio_read32(gt, reg);
+	xe_mmio_write32(mmio, reg, 0xffffffff);
+	xe_mmio_read32(mmio, reg);
+	xe_mmio_write32(mmio, reg, 0xffffffff);
+	xe_mmio_read32(mmio, reg);
 }
 
 /*
  * Unmask and enable the specified interrupts.  Does not check current state,
  * so any bits not specified here will become masked and disabled.
  */
-static void unmask_and_enable(struct xe_gt *gt, u32 irqregs, u32 bits)
+static void unmask_and_enable(struct xe_tile *tile, u32 irqregs, u32 bits)
 {
+	struct xe_gt *mmio = tile->primary_gt;
+
 	/*
 	 * If we're just enabling an interrupt now, it shouldn't already
 	 * be raised in the IIR.
 	 */
-	assert_iir_is_zero(gt, IIR(irqregs));
+	assert_iir_is_zero(mmio, IIR(irqregs));
 
-	xe_mmio_write32(gt, IER(irqregs), bits);
-	xe_mmio_write32(gt, IMR(irqregs), ~bits);
+	xe_mmio_write32(mmio, IER(irqregs), bits);
+	xe_mmio_write32(mmio, IMR(irqregs), ~bits);
 
 	/* Posting read */
-	xe_mmio_read32(gt, IMR(irqregs));
+	xe_mmio_read32(mmio, IMR(irqregs));
 }
 
 /* Mask and disable all interrupts. */
-static void mask_and_disable(struct xe_gt *gt, u32 irqregs)
+static void mask_and_disable(struct xe_tile *tile, u32 irqregs)
 {
-	xe_mmio_write32(gt, IMR(irqregs), ~0);
+	struct xe_gt *mmio = tile->primary_gt;
+
+	xe_mmio_write32(mmio, IMR(irqregs), ~0);
 	/* Posting read */
-	xe_mmio_read32(gt, IMR(irqregs));
+	xe_mmio_read32(mmio, IMR(irqregs));
 
-	xe_mmio_write32(gt, IER(irqregs), 0);
+	xe_mmio_write32(mmio, IER(irqregs), 0);
 
 	/* IIR can theoretically queue up two events. Be paranoid. */
-	xe_mmio_write32(gt, IIR(irqregs), ~0);
-	xe_mmio_read32(gt, IIR(irqregs));
-	xe_mmio_write32(gt, IIR(irqregs), ~0);
-	xe_mmio_read32(gt, IIR(irqregs));
+	xe_mmio_write32(mmio, IIR(irqregs), ~0);
+	xe_mmio_read32(mmio, IIR(irqregs));
+	xe_mmio_write32(mmio, IIR(irqregs), ~0);
+	xe_mmio_read32(mmio, IIR(irqregs));
 }
 
-static u32 xelp_intr_disable(struct xe_gt *gt)
+static u32 xelp_intr_disable(struct xe_device *xe)
 {
-	xe_mmio_write32(gt, GFX_MSTR_IRQ, 0);
+	struct xe_gt *mmio = xe_primary_mmio_gt(xe);
+
+	xe_mmio_write32(mmio, GFX_MSTR_IRQ, 0);
 
 	/*
 	 * Now with master disabled, get a sample of level indications
@@ -88,36 +94,41 @@ static u32 xelp_intr_disable(struct xe_gt *gt)
 	 * New indications can and will light up during processing,
 	 * and will generate new interrupt after enabling master.
 	 */
-	return xe_mmio_read32(gt, GFX_MSTR_IRQ);
+	return xe_mmio_read32(mmio, GFX_MSTR_IRQ);
 }
 
 static u32
-gu_misc_irq_ack(struct xe_gt *gt, const u32 master_ctl)
+gu_misc_irq_ack(struct xe_device *xe, const u32 master_ctl)
 {
+	struct xe_gt *mmio = xe_primary_mmio_gt(xe);
 	u32 iir;
 
 	if (!(master_ctl & GU_MISC_IRQ))
 		return 0;
 
-	iir = xe_mmio_read32(gt, IIR(GU_MISC_IRQ_OFFSET));
+	iir = xe_mmio_read32(mmio, IIR(GU_MISC_IRQ_OFFSET));
 	if (likely(iir))
-		xe_mmio_write32(gt, IIR(GU_MISC_IRQ_OFFSET), iir);
+		xe_mmio_write32(mmio, IIR(GU_MISC_IRQ_OFFSET), iir);
 
 	return iir;
 }
 
-static inline void xelp_intr_enable(struct xe_gt *gt, bool stall)
+static inline void xelp_intr_enable(struct xe_device *xe, bool stall)
 {
-	xe_mmio_write32(gt, GFX_MSTR_IRQ, MASTER_IRQ);
+	struct xe_gt *mmio = xe_primary_mmio_gt(xe);
+
+	xe_mmio_write32(mmio, GFX_MSTR_IRQ, MASTER_IRQ);
 	if (stall)
-		xe_mmio_read32(gt, GFX_MSTR_IRQ);
+		xe_mmio_read32(mmio, GFX_MSTR_IRQ);
 }
 
-static void gt_irq_postinstall(struct xe_device *xe, struct xe_gt *gt)
+static void gt_irq_postinstall(struct xe_tile *tile)
 {
+	struct xe_device *xe = tile_to_xe(tile);
+	struct xe_gt *mmio = tile->primary_gt;
 	u32 irqs, dmask, smask;
-	u32 ccs_mask = xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_COMPUTE);
-	u32 bcs_mask = xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_COPY);
+	u32 ccs_mask = xe_hw_engine_mask_per_class(tile->primary_gt, XE_ENGINE_CLASS_COMPUTE);
+	u32 bcs_mask = xe_hw_engine_mask_per_class(tile->primary_gt, XE_ENGINE_CLASS_COPY);
 
 	if (xe_device_guc_submission_enabled(xe)) {
 		irqs = GT_RENDER_USER_INTERRUPT |
@@ -133,57 +144,57 @@ static void gt_irq_postinstall(struct xe_device *xe, struct xe_gt *gt)
 	smask = irqs << 16;
 
 	/* Enable RCS, BCS, VCS and VECS class interrupts. */
-	xe_mmio_write32(gt, RENDER_COPY_INTR_ENABLE, dmask);
-	xe_mmio_write32(gt, VCS_VECS_INTR_ENABLE, dmask);
+	xe_mmio_write32(mmio, RENDER_COPY_INTR_ENABLE, dmask);
+	xe_mmio_write32(mmio, VCS_VECS_INTR_ENABLE, dmask);
 	if (ccs_mask)
-		xe_mmio_write32(gt, CCS_RSVD_INTR_ENABLE, smask);
+		xe_mmio_write32(mmio, CCS_RSVD_INTR_ENABLE, smask);
 
 	/* Unmask irqs on RCS, BCS, VCS and VECS engines. */
-	xe_mmio_write32(gt, RCS0_RSVD_INTR_MASK, ~smask);
-	xe_mmio_write32(gt, BCS_RSVD_INTR_MASK, ~smask);
+	xe_mmio_write32(mmio, RCS0_RSVD_INTR_MASK, ~smask);
+	xe_mmio_write32(mmio, BCS_RSVD_INTR_MASK, ~smask);
 	if (bcs_mask & (BIT(1)|BIT(2)))
-		xe_mmio_write32(gt, XEHPC_BCS1_BCS2_INTR_MASK, ~dmask);
+		xe_mmio_write32(mmio, XEHPC_BCS1_BCS2_INTR_MASK, ~dmask);
 	if (bcs_mask & (BIT(3)|BIT(4)))
-		xe_mmio_write32(gt, XEHPC_BCS3_BCS4_INTR_MASK, ~dmask);
+		xe_mmio_write32(mmio, XEHPC_BCS3_BCS4_INTR_MASK, ~dmask);
 	if (bcs_mask & (BIT(5)|BIT(6)))
-		xe_mmio_write32(gt, XEHPC_BCS5_BCS6_INTR_MASK, ~dmask);
+		xe_mmio_write32(mmio, XEHPC_BCS5_BCS6_INTR_MASK, ~dmask);
 	if (bcs_mask & (BIT(7)|BIT(8)))
-		xe_mmio_write32(gt, XEHPC_BCS7_BCS8_INTR_MASK, ~dmask);
-	xe_mmio_write32(gt, VCS0_VCS1_INTR_MASK, ~dmask);
-	xe_mmio_write32(gt, VCS2_VCS3_INTR_MASK, ~dmask);
-	xe_mmio_write32(gt, VECS0_VECS1_INTR_MASK, ~dmask);
+		xe_mmio_write32(mmio, XEHPC_BCS7_BCS8_INTR_MASK, ~dmask);
+	xe_mmio_write32(mmio, VCS0_VCS1_INTR_MASK, ~dmask);
+	xe_mmio_write32(mmio, VCS2_VCS3_INTR_MASK, ~dmask);
+	xe_mmio_write32(mmio, VECS0_VECS1_INTR_MASK, ~dmask);
 	if (ccs_mask & (BIT(0)|BIT(1)))
-		xe_mmio_write32(gt, CCS0_CCS1_INTR_MASK, ~dmask);
+		xe_mmio_write32(mmio, CCS0_CCS1_INTR_MASK, ~dmask);
 	if (ccs_mask & (BIT(2)|BIT(3)))
-		xe_mmio_write32(gt,  CCS2_CCS3_INTR_MASK, ~dmask);
+		xe_mmio_write32(mmio,  CCS2_CCS3_INTR_MASK, ~dmask);
 
 	/*
 	 * RPS interrupts will get enabled/disabled on demand when RPS itself
 	 * is enabled/disabled.
 	 */
 	/* TODO: gt->pm_ier, gt->pm_imr */
-	xe_mmio_write32(gt, GPM_WGBOXPERF_INTR_ENABLE, 0);
-	xe_mmio_write32(gt, GPM_WGBOXPERF_INTR_MASK,  ~0);
+	xe_mmio_write32(mmio, GPM_WGBOXPERF_INTR_ENABLE, 0);
+	xe_mmio_write32(mmio, GPM_WGBOXPERF_INTR_MASK,  ~0);
 
 	/* Same thing for GuC interrupts */
-	xe_mmio_write32(gt, GUC_SG_INTR_ENABLE, 0);
-	xe_mmio_write32(gt, GUC_SG_INTR_MASK,  ~0);
+	xe_mmio_write32(mmio, GUC_SG_INTR_ENABLE, 0);
+	xe_mmio_write32(mmio, GUC_SG_INTR_MASK,  ~0);
 }
 
-static void xelp_irq_postinstall(struct xe_device *xe, struct xe_gt *gt)
+static void xelp_irq_postinstall(struct xe_device *xe, struct xe_tile *tile)
 {
 	/* TODO: PCH */
 
-	gt_irq_postinstall(xe, gt);
+	gt_irq_postinstall(tile);
 
-	unmask_and_enable(gt, GU_MISC_IRQ_OFFSET, GU_MISC_GSE);
+	unmask_and_enable(tile, GU_MISC_IRQ_OFFSET, GU_MISC_GSE);
 
-	xelp_intr_enable(gt, true);
+	xelp_intr_enable(xe, true);
 }
 
 static u32
 gt_engine_identity(struct xe_device *xe,
-		   struct xe_gt *gt,
+		   struct xe_gt *mmio,
 		   const unsigned int bank,
 		   const unsigned int bit)
 {
@@ -192,7 +203,7 @@ gt_engine_identity(struct xe_device *xe,
 
 	lockdep_assert_held(&xe->irq.lock);
 
-	xe_mmio_write32(gt, IIR_REG_SELECTOR(bank), BIT(bit));
+	xe_mmio_write32(mmio, IIR_REG_SELECTOR(bank), BIT(bit));
 
 	/*
 	 * NB: Specs do not specify how long to spin wait,
@@ -200,7 +211,7 @@ gt_engine_identity(struct xe_device *xe,
 	 */
 	timeout_ts = (local_clock() >> 10) + 100;
 	do {
-		ident = xe_mmio_read32(gt, INTR_IDENTITY_REG(bank));
+		ident = xe_mmio_read32(mmio, INTR_IDENTITY_REG(bank));
 	} while (!(ident & INTR_DATA_VALID) &&
 		 !time_after32(local_clock() >> 10, timeout_ts));
 
@@ -210,7 +221,7 @@ gt_engine_identity(struct xe_device *xe,
 		return 0;
 	}
 
-	xe_mmio_write32(gt, INTR_IDENTITY_REG(bank), INTR_DATA_VALID);
+	xe_mmio_write32(mmio, INTR_IDENTITY_REG(bank), INTR_DATA_VALID);
 
 	return ident;
 }
@@ -232,10 +243,32 @@ gt_other_irq_handler(struct xe_gt *gt, const u8 instance, const u16 iir)
 	}
 }
 
-static void gt_irq_handler(struct xe_device *xe, struct xe_gt *gt,
+static struct xe_gt *pick_engine_gt(struct xe_tile *tile,
+				    enum xe_engine_class class,
+				    unsigned int instance)
+{
+	struct xe_device *xe = tile_to_xe(tile);
+
+	if (MEDIA_VER(xe) < 13)
+		return tile->primary_gt;
+
+	if (class == XE_ENGINE_CLASS_VIDEO_DECODE ||
+	    class == XE_ENGINE_CLASS_VIDEO_ENHANCE)
+		return tile->media_gt;
+
+	if (class == XE_ENGINE_CLASS_OTHER &&
+	    instance == OTHER_MEDIA_GUC_INSTANCE)
+		return tile->media_gt;
+
+	return tile->primary_gt;
+}
+
+static void gt_irq_handler(struct xe_tile *tile,
 			   u32 master_ctl, long unsigned int *intr_dw,
 			   u32 *identity)
 {
+	struct xe_device *xe = tile_to_xe(tile);
+	struct xe_gt *mmio = tile->primary_gt;
 	unsigned int bank, bit;
 	u16 instance, intr_vec;
 	enum xe_engine_class class;
@@ -247,27 +280,26 @@ static void gt_irq_handler(struct xe_device *xe, struct xe_gt *gt,
 		if (!(master_ctl & GT_DW_IRQ(bank)))
 			continue;
 
-		if (!xe_gt_is_media_type(gt)) {
-			intr_dw[bank] =
-				xe_mmio_read32(gt, GT_INTR_DW(bank));
-			for_each_set_bit(bit, intr_dw + bank, 32)
-				identity[bit] = gt_engine_identity(xe, gt,
-								   bank, bit);
-			xe_mmio_write32(gt, GT_INTR_DW(bank),
-					intr_dw[bank]);
-		}
+		intr_dw[bank] = xe_mmio_read32(mmio, GT_INTR_DW(bank));
+		for_each_set_bit(bit, intr_dw + bank, 32)
+			identity[bit] = gt_engine_identity(xe, mmio, bank, bit);
+		xe_mmio_write32(mmio, GT_INTR_DW(bank), intr_dw[bank]);
 
 		for_each_set_bit(bit, intr_dw + bank, 32) {
+			struct xe_gt *engine_gt;
+
 			class = INTR_ENGINE_CLASS(identity[bit]);
 			instance = INTR_ENGINE_INSTANCE(identity[bit]);
 			intr_vec = INTR_ENGINE_INTR(identity[bit]);
 
+			engine_gt = pick_engine_gt(tile, class, instance);
+
 			if (class == XE_ENGINE_CLASS_OTHER) {
-				gt_other_irq_handler(gt, instance, intr_vec);
+				gt_other_irq_handler(engine_gt, instance, intr_vec);
 				continue;
 			}
 
-			hwe = xe_gt_hw_engine(gt, class, instance, false);
+			hwe = xe_gt_hw_engine(engine_gt, class, instance, false);
 			if (!hwe)
 				continue;
 
@@ -285,24 +317,24 @@ static void gt_irq_handler(struct xe_device *xe, struct xe_gt *gt,
 static irqreturn_t xelp_irq_handler(int irq, void *arg)
 {
 	struct xe_device *xe = arg;
-	struct xe_gt *gt = xe_primary_mmio_gt(xe);
+	struct xe_tile *tile = xe_device_get_root_tile(xe);
 	u32 master_ctl, gu_misc_iir;
 	long unsigned int intr_dw[2];
 	u32 identity[32];
 
-	master_ctl = xelp_intr_disable(gt);
+	master_ctl = xelp_intr_disable(xe);
 	if (!master_ctl) {
-		xelp_intr_enable(gt, false);
+		xelp_intr_enable(xe, false);
 		return IRQ_NONE;
 	}
 
-	gt_irq_handler(xe, gt, master_ctl, intr_dw, identity);
+	gt_irq_handler(tile, master_ctl, intr_dw, identity);
 
 	xe_display_irq_handler(xe, master_ctl);
 
-	gu_misc_iir = gu_misc_irq_ack(gt, master_ctl);
+	gu_misc_iir = gu_misc_irq_ack(xe, master_ctl);
 
-	xelp_intr_enable(gt, false);
+	xelp_intr_enable(xe, false);
 
 	xe_display_irq_enable(xe, gu_misc_iir);
 
@@ -311,38 +343,38 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
 
 static u32 dg1_intr_disable(struct xe_device *xe)
 {
-	struct xe_gt *gt = xe_primary_mmio_gt(xe);
+	struct xe_gt *mmio = xe_primary_mmio_gt(xe);
 	u32 val;
 
 	/* First disable interrupts */
-	xe_mmio_write32(gt, DG1_MSTR_TILE_INTR, 0);
+	xe_mmio_write32(mmio, DG1_MSTR_TILE_INTR, 0);
 
 	/* Get the indication levels and ack the master unit */
-	val = xe_mmio_read32(gt, DG1_MSTR_TILE_INTR);
+	val = xe_mmio_read32(mmio, DG1_MSTR_TILE_INTR);
 	if (unlikely(!val))
 		return 0;
 
-	xe_mmio_write32(gt, DG1_MSTR_TILE_INTR, val);
+	xe_mmio_write32(mmio, DG1_MSTR_TILE_INTR, val);
 
 	return val;
 }
 
 static void dg1_intr_enable(struct xe_device *xe, bool stall)
 {
-	struct xe_gt *gt = xe_primary_mmio_gt(xe);
+	struct xe_gt *mmio = xe_primary_mmio_gt(xe);
 
-	xe_mmio_write32(gt, DG1_MSTR_TILE_INTR, DG1_MSTR_IRQ);
+	xe_mmio_write32(mmio, DG1_MSTR_TILE_INTR, DG1_MSTR_IRQ);
 	if (stall)
-		xe_mmio_read32(gt, DG1_MSTR_TILE_INTR);
+		xe_mmio_read32(mmio, DG1_MSTR_TILE_INTR);
 }
 
-static void dg1_irq_postinstall(struct xe_device *xe, struct xe_gt *gt)
+static void dg1_irq_postinstall(struct xe_device *xe, struct xe_tile *tile)
 {
-	gt_irq_postinstall(xe, gt);
+	gt_irq_postinstall(tile);
 
-	unmask_and_enable(gt, GU_MISC_IRQ_OFFSET, GU_MISC_GSE);
+	unmask_and_enable(tile, GU_MISC_IRQ_OFFSET, GU_MISC_GSE);
 
-	if (gt->info.id == XE_GT0)
+	if (tile->id == 0)
 		dg1_intr_enable(xe, true);
 }
 
@@ -354,8 +386,8 @@ static void dg1_irq_postinstall(struct xe_device *xe, struct xe_gt *gt)
 static irqreturn_t dg1_irq_handler(int irq, void *arg)
 {
 	struct xe_device *xe = arg;
-	struct xe_gt *gt;
-	u32 master_tile_ctl, master_ctl = 0, tile0_master_ctl = 0, gu_misc_iir;
+	struct xe_tile *tile;
+	u32 master_tile_ctl, master_ctl = 0, gu_misc_iir = 0;
 	long unsigned int intr_dw[2];
 	u32 identity[32];
 	u8 id;
@@ -368,12 +400,13 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
 		return IRQ_NONE;
 	}
 
-	for_each_gt(gt, xe, id) {
-		if ((master_tile_ctl & DG1_MSTR_TILE(gt_to_tile(gt)->id)) == 0)
+	for_each_tile(tile, xe, id) {
+		struct xe_gt *mmio = tile->primary_gt;
+
+		if ((master_tile_ctl & DG1_MSTR_TILE(tile->id)) == 0)
 			continue;
 
-		if (!xe_gt_is_media_type(gt))
-			master_ctl = xe_mmio_read32(gt, GFX_MSTR_IRQ);
+		master_ctl = xe_mmio_read32(mmio, GFX_MSTR_IRQ);
 
 		/*
 		 * We might be in irq handler just when PCIe DPC is initiated
@@ -381,124 +414,125 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
 		 * irq as device is inaccessible.
 		 */
 		if (master_ctl == REG_GENMASK(31, 0)) {
-			dev_dbg(gt_to_xe(gt)->drm.dev,
+			dev_dbg(tile_to_xe(tile)->drm.dev,
 				"Ignore this IRQ as device might be in DPC containment.\n");
 			return IRQ_HANDLED;
 		}
 
-		if (!xe_gt_is_media_type(gt))
-			xe_mmio_write32(gt, GFX_MSTR_IRQ, master_ctl);
-		gt_irq_handler(xe, gt, master_ctl, intr_dw, identity);
+		xe_mmio_write32(mmio, GFX_MSTR_IRQ, master_ctl);
+
+		gt_irq_handler(tile, master_ctl, intr_dw, identity);
 
 		/*
-		 * Save primary tile's master interrupt register for display
-		 * processing below.
+		 * Display interrupts (including display backlight operations
+		 * that get reported as Gunit GSE) would only be hooked up to
+		 * the primary tile.
 		 */
-		if (id == 0)
-			tile0_master_ctl = master_ctl;
+		if (id == 0) {
+			xe_display_irq_handler(xe, master_ctl);
+			gu_misc_iir = gu_misc_irq_ack(xe, master_ctl);
+		}
 	}
 
-	xe_display_irq_handler(xe, tile0_master_ctl);
-
-	/* Gunit GSE interrupts can trigger display backlight operations */
-	gu_misc_iir = gu_misc_irq_ack(gt, tile0_master_ctl);
-
 	dg1_intr_enable(xe, false);
-
 	xe_display_irq_enable(xe, gu_misc_iir);
 
 	return IRQ_HANDLED;
 }
 
-static void gt_irq_reset(struct xe_gt *gt)
+static void gt_irq_reset(struct xe_tile *tile)
 {
-	u32 ccs_mask = xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_COMPUTE);
-	u32 bcs_mask = xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_COPY);
+	struct xe_gt *mmio = tile->primary_gt;
+
+	u32 ccs_mask = xe_hw_engine_mask_per_class(tile->primary_gt,
+						   XE_ENGINE_CLASS_COMPUTE);
+	u32 bcs_mask = xe_hw_engine_mask_per_class(tile->primary_gt,
+						   XE_ENGINE_CLASS_COPY);
 
 	/* Disable RCS, BCS, VCS and VECS class engines. */
-	xe_mmio_write32(gt, RENDER_COPY_INTR_ENABLE,	 0);
-	xe_mmio_write32(gt, VCS_VECS_INTR_ENABLE,	 0);
+	xe_mmio_write32(mmio, RENDER_COPY_INTR_ENABLE, 0);
+	xe_mmio_write32(mmio, VCS_VECS_INTR_ENABLE, 0);
 	if (ccs_mask)
-		xe_mmio_write32(gt, CCS_RSVD_INTR_ENABLE, 0);
+		xe_mmio_write32(mmio, CCS_RSVD_INTR_ENABLE, 0);
 
 	/* Restore masks irqs on RCS, BCS, VCS and VECS engines. */
-	xe_mmio_write32(gt, RCS0_RSVD_INTR_MASK,	~0);
-	xe_mmio_write32(gt, BCS_RSVD_INTR_MASK,	~0);
+	xe_mmio_write32(mmio, RCS0_RSVD_INTR_MASK,	~0);
+	xe_mmio_write32(mmio, BCS_RSVD_INTR_MASK,	~0);
 	if (bcs_mask & (BIT(1)|BIT(2)))
-		xe_mmio_write32(gt, XEHPC_BCS1_BCS2_INTR_MASK, ~0);
+		xe_mmio_write32(mmio, XEHPC_BCS1_BCS2_INTR_MASK, ~0);
 	if (bcs_mask & (BIT(3)|BIT(4)))
-		xe_mmio_write32(gt, XEHPC_BCS3_BCS4_INTR_MASK, ~0);
+		xe_mmio_write32(mmio, XEHPC_BCS3_BCS4_INTR_MASK, ~0);
 	if (bcs_mask & (BIT(5)|BIT(6)))
-		xe_mmio_write32(gt, XEHPC_BCS5_BCS6_INTR_MASK, ~0);
+		xe_mmio_write32(mmio, XEHPC_BCS5_BCS6_INTR_MASK, ~0);
 	if (bcs_mask & (BIT(7)|BIT(8)))
-		xe_mmio_write32(gt, XEHPC_BCS7_BCS8_INTR_MASK, ~0);
-	xe_mmio_write32(gt, VCS0_VCS1_INTR_MASK,	~0);
-	xe_mmio_write32(gt, VCS2_VCS3_INTR_MASK,	~0);
-	xe_mmio_write32(gt, VECS0_VECS1_INTR_MASK,	~0);
+		xe_mmio_write32(mmio, XEHPC_BCS7_BCS8_INTR_MASK, ~0);
+	xe_mmio_write32(mmio, VCS0_VCS1_INTR_MASK,	~0);
+	xe_mmio_write32(mmio, VCS2_VCS3_INTR_MASK,	~0);
+	xe_mmio_write32(mmio, VECS0_VECS1_INTR_MASK,	~0);
 	if (ccs_mask & (BIT(0)|BIT(1)))
-		xe_mmio_write32(gt, CCS0_CCS1_INTR_MASK, ~0);
+		xe_mmio_write32(mmio, CCS0_CCS1_INTR_MASK, ~0);
 	if (ccs_mask & (BIT(2)|BIT(3)))
-		xe_mmio_write32(gt,  CCS2_CCS3_INTR_MASK, ~0);
+		xe_mmio_write32(mmio,  CCS2_CCS3_INTR_MASK, ~0);
 
-	xe_mmio_write32(gt, GPM_WGBOXPERF_INTR_ENABLE, 0);
-	xe_mmio_write32(gt, GPM_WGBOXPERF_INTR_MASK,  ~0);
-	xe_mmio_write32(gt, GUC_SG_INTR_ENABLE,	 0);
-	xe_mmio_write32(gt, GUC_SG_INTR_MASK,		~0);
+	xe_mmio_write32(mmio, GPM_WGBOXPERF_INTR_ENABLE, 0);
+	xe_mmio_write32(mmio, GPM_WGBOXPERF_INTR_MASK,  ~0);
+	xe_mmio_write32(mmio, GUC_SG_INTR_ENABLE,	 0);
+	xe_mmio_write32(mmio, GUC_SG_INTR_MASK,		~0);
 }
 
-static void xelp_irq_reset(struct xe_gt *gt)
+static void xelp_irq_reset(struct xe_tile *tile)
 {
-	xelp_intr_disable(gt);
+	xelp_intr_disable(tile_to_xe(tile));
 
-	gt_irq_reset(gt);
+	gt_irq_reset(tile);
 
-	mask_and_disable(gt, GU_MISC_IRQ_OFFSET);
-	mask_and_disable(gt, PCU_IRQ_OFFSET);
+	mask_and_disable(tile, GU_MISC_IRQ_OFFSET);
+	mask_and_disable(tile, PCU_IRQ_OFFSET);
 }
 
-static void dg1_irq_reset(struct xe_gt *gt)
+static void dg1_irq_reset(struct xe_tile *tile)
 {
-	if (gt->info.id == 0)
-		dg1_intr_disable(gt_to_xe(gt));
+	if (tile->id == 0)
+		dg1_intr_disable(tile_to_xe(tile));
 
-	gt_irq_reset(gt);
+	gt_irq_reset(tile);
 
-	mask_and_disable(gt, GU_MISC_IRQ_OFFSET);
-	mask_and_disable(gt, PCU_IRQ_OFFSET);
+	mask_and_disable(tile, GU_MISC_IRQ_OFFSET);
+	mask_and_disable(tile, PCU_IRQ_OFFSET);
 }
 
 static void xe_irq_reset(struct xe_device *xe)
 {
-	struct xe_gt *gt;
+	struct xe_tile *tile;
 	u8 id;
 
-	for_each_gt(gt, xe, id) {
+	for_each_tile(tile, xe, id) {
 		if (GRAPHICS_VERx100(xe) >= 1210)
-			dg1_irq_reset(gt);
+			dg1_irq_reset(tile);
 		else
-			xelp_irq_reset(gt);
+			xelp_irq_reset(tile);
 	}
 
 	xe_display_irq_reset(xe);
 }
 
-void xe_gt_irq_postinstall(struct xe_gt *gt)
+void xe_gt_irq_postinstall(struct xe_tile *tile)
 {
-	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_device *xe = tile_to_xe(tile);
 
 	if (GRAPHICS_VERx100(xe) >= 1210)
-		dg1_irq_postinstall(xe, gt);
+		dg1_irq_postinstall(xe, tile);
 	else
-		xelp_irq_postinstall(xe, gt);
+		xelp_irq_postinstall(xe, tile);
 }
 
 static void xe_irq_postinstall(struct xe_device *xe)
 {
-	struct xe_gt *gt;
+	struct xe_tile *tile;
 	u8 id;
 
-	for_each_gt(gt, xe, id)
-		xe_gt_irq_postinstall(gt);
+	for_each_tile(tile, xe, id)
+		xe_gt_irq_postinstall(tile);
 
 	xe_display_irq_postinstall(xe, xe_primary_mmio_gt(xe));
 }
diff --git a/drivers/gpu/drm/xe/xe_irq.h b/drivers/gpu/drm/xe/xe_irq.h
index 34ecf22b32d3..69113c21e1cd 100644
--- a/drivers/gpu/drm/xe/xe_irq.h
+++ b/drivers/gpu/drm/xe/xe_irq.h
@@ -7,10 +7,10 @@
 #define _XE_IRQ_H_
 
 struct xe_device;
-struct xe_gt;
+struct xe_tile;
 
 int xe_irq_install(struct xe_device *xe);
-void xe_gt_irq_postinstall(struct xe_gt *gt);
+void xe_gt_irq_postinstall(struct xe_tile *tile);
 void xe_irq_shutdown(struct xe_device *xe);
 void xe_irq_suspend(struct xe_device *xe);
 void xe_irq_resume(struct xe_device *xe);
-- 
2.40.0



* [Intel-xe] [PATCH 16/26] drm/xe/irq: Handle ASLE backlight interrupts at same time as display
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (14 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 15/26] drm/xe: Interrupts are delivered per-tile, not per-GT Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-18 18:33   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 17/26] drm/xe/irq: Actually call xe_irq_postinstall() Matt Roper
                   ` (18 subsequent siblings)
  34 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

Our only use of GUnit interrupts is to handle ASLE backlight operations
that are reported as GUnit GSE interrupts.  Move the enable/disable of
these interrupts adjacent to display interrupts.

In the future we may even want to move these inside the
xe_display_irq_*() functions.  But since these rely on static xe_irq
functions like mask_and_disable(), it's easier to keep them separate
for now.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_irq.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
index fa7d04ba23c0..02f44b58ce3e 100644
--- a/drivers/gpu/drm/xe/xe_irq.c
+++ b/drivers/gpu/drm/xe/xe_irq.c
@@ -187,8 +187,6 @@ static void xelp_irq_postinstall(struct xe_device *xe, struct xe_tile *tile)
 
 	gt_irq_postinstall(tile);
 
-	unmask_and_enable(tile, GU_MISC_IRQ_OFFSET, GU_MISC_GSE);
-
 	xelp_intr_enable(xe, true);
 }
 
@@ -372,8 +370,6 @@ static void dg1_irq_postinstall(struct xe_device *xe, struct xe_tile *tile)
 {
 	gt_irq_postinstall(tile);
 
-	unmask_and_enable(tile, GU_MISC_IRQ_OFFSET, GU_MISC_GSE);
-
 	if (tile->id == 0)
 		dg1_intr_enable(xe, true);
 }
@@ -486,7 +482,6 @@ static void xelp_irq_reset(struct xe_tile *tile)
 
 	gt_irq_reset(tile);
 
-	mask_and_disable(tile, GU_MISC_IRQ_OFFSET);
 	mask_and_disable(tile, PCU_IRQ_OFFSET);
 }
 
@@ -497,7 +492,6 @@ static void dg1_irq_reset(struct xe_tile *tile)
 
 	gt_irq_reset(tile);
 
-	mask_and_disable(tile, GU_MISC_IRQ_OFFSET);
 	mask_and_disable(tile, PCU_IRQ_OFFSET);
 }
 
@@ -513,6 +507,8 @@ static void xe_irq_reset(struct xe_device *xe)
 			xelp_irq_reset(tile);
 	}
 
+	tile = xe_device_get_root_tile(xe);
+	mask_and_disable(tile, GU_MISC_IRQ_OFFSET);
 	xe_display_irq_reset(xe);
 }
 
@@ -535,6 +531,13 @@ static void xe_irq_postinstall(struct xe_device *xe)
 		xe_gt_irq_postinstall(tile);
 
 	xe_display_irq_postinstall(xe, xe_primary_mmio_gt(xe));
+
+	/*
+	 * ASLE backlight operations are reported via GUnit GSE interrupts
+	 * on the root tile.
+	 */
+	unmask_and_enable(xe_device_get_root_tile(xe),
+			  GU_MISC_IRQ_OFFSET, GU_MISC_GSE);
 }
 
 static irq_handler_t xe_irq_handler(struct xe_device *xe)
-- 
2.40.0



* [Intel-xe] [PATCH 17/26] drm/xe/irq: Actually call xe_irq_postinstall()
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (15 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 16/26] drm/xe/irq: Handle ASLE backlight interrupts at same time as display Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-18 18:40   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 18/26] drm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt mask Matt Roper
                   ` (17 subsequent siblings)
  34 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

xe_irq_postinstall() never actually gets called after installing the
interrupt handler.  This oversight seems to get papered over by the
fact that the (misnamed) xe_gt_irq_postinstall() does more than it
really should and gets called in the middle of GT initialization.
Call xe_irq_postinstall() from xe_irq_install() once the handler has
been installed.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_irq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
index 02f44b58ce3e..2549fd9fb5cd 100644
--- a/drivers/gpu/drm/xe/xe_irq.c
+++ b/drivers/gpu/drm/xe/xe_irq.c
@@ -588,6 +588,8 @@ int xe_irq_install(struct xe_device *xe)
 		return err;
 	}
 
+	xe_irq_postinstall(xe);
+
 	err = drmm_add_action_or_reset(&xe->drm, irq_uninstall, xe);
 	if (err)
 		return err;
-- 
2.40.0



* [Intel-xe] [PATCH 18/26] drm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt mask
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (16 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 17/26] drm/xe/irq: Actually call xe_irq_postinstall() Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-11  3:47 ` [Intel-xe] [PATCH 19/26] drm/xe/irq: Untangle postinstall functions Matt Roper
                   ` (16 subsequent siblings)
  34 siblings, 0 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi, matthew.d.roper

Although primary and media GuC share a single interrupt enable bit, they
each have distinct bits in the mask register.  Today we always enable
interrupts for the primary GuC before the media GuC (and never disable
either of them), but this might not always be the case in the future,
so use an RMW when updating the mask register to ensure the other GuC's
mask doesn't get clobbered.
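
The difference between a plain write and a read-modify-write on a shared
mask register can be sketched in standalone C as below.  This is only an
illustration: the register is a plain variable, the two event bits are
hypothetical positions, and reg_rmw32() is a stand-in with clear-then-set
semantics rather than the driver's own xe_mmio_rmw32().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions; a set bit means "interrupt masked". */
#define PRIMARY_GUC_EVENT (1u << 16)
#define MEDIA_GUC_EVENT   (1u << 0)

/* Stand-in for the shared mask register, starting fully masked. */
static uint32_t fake_mask_reg = 0xffffffffu;

static void reg_write32(uint32_t val)
{
	fake_mask_reg = val;
}

/* Read-modify-write: clear the bits in 'clr', then set the bits in 'set'. */
static uint32_t reg_rmw32(uint32_t clr, uint32_t set)
{
	uint32_t old = fake_mask_reg;

	assert((clr & set) == 0);	/* clearing and setting the same bit is a caller bug */
	fake_mask_reg = (old & ~clr) | set;
	return old;
}

/* Unmask 'events' with a plain write: every other bit ends up masked. */
static void unmask_with_plain_write(uint32_t events)
{
	reg_write32(~events);
}

/* Unmask 'events' with an RMW: every other bit keeps its previous state. */
static void unmask_with_rmw(uint32_t events)
{
	reg_rmw32(events, 0);
}
```

With this model, unmasking MEDIA_GUC_EVENT and then calling
unmask_with_plain_write(PRIMARY_GUC_EVENT) re-masks the media bit, while
using unmask_with_rmw() for both leaves both bits unmasked regardless of
the order of the calls.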

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_guc.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
index eb4af4c71124..2f01314c7f11 100644
--- a/drivers/gpu/drm/xe/xe_guc.c
+++ b/drivers/gpu/drm/xe/xe_guc.c
@@ -553,12 +553,15 @@ static void guc_enable_irq(struct xe_guc *guc)
 		REG_FIELD_PREP(ENGINE0_MASK, GUC_INTR_GUC2HOST)  :
 		REG_FIELD_PREP(ENGINE1_MASK, GUC_INTR_GUC2HOST);
 
+	/* Primary GuC and media GuC share a single enable bit */
 	xe_mmio_write32(gt, GUC_SG_INTR_ENABLE,
 			REG_FIELD_PREP(ENGINE1_MASK, GUC_INTR_GUC2HOST));
-	if (xe_gt_is_media_type(gt))
-		xe_mmio_rmw32(gt, GUC_SG_INTR_MASK, events, 0);
-	else
-		xe_mmio_write32(gt, GUC_SG_INTR_MASK, ~events);
+
+	/*
+	 * There are separate mask bits for primary and media GuCs, so use
+	 * a RMW operation to avoid clobbering the other GuC's setting.
+	 */
+	xe_mmio_rmw32(gt, GUC_SG_INTR_MASK, events, 0);
 }
 
 int xe_guc_enable_communication(struct xe_guc *guc)
-- 
2.40.0



* [Intel-xe] [PATCH 19/26] drm/xe/irq: Untangle postinstall functions
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (17 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 18/26] drm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt mask Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-18 18:45   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 20/26] drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe Matt Roper
                   ` (15 subsequent siblings)
  34 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

The callstack for postinstall is a bit muddled, with top-level device
interrupt enablement happening within platform-specific functions called
from the per-tile xe_gt_irq_postinstall() function.  If we pull
top-level irq enablement up to xe_irq_postinstall(), where we'd expect
it to be, we can eliminate some confusing layers of indirection.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_irq.c | 35 +++++++----------------------------
 1 file changed, 7 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
index 2549fd9fb5cd..58745a5add87 100644
--- a/drivers/gpu/drm/xe/xe_irq.c
+++ b/drivers/gpu/drm/xe/xe_irq.c
@@ -122,7 +122,7 @@ static inline void xelp_intr_enable(struct xe_device *xe, bool stall)
 		xe_mmio_read32(mmio, GFX_MSTR_IRQ);
 }
 
-static void gt_irq_postinstall(struct xe_tile *tile)
+void xe_gt_irq_postinstall(struct xe_tile *tile)
 {
 	struct xe_device *xe = tile_to_xe(tile);
 	struct xe_gt *mmio = tile->primary_gt;
@@ -181,15 +181,6 @@ static void gt_irq_postinstall(struct xe_tile *tile)
 	xe_mmio_write32(mmio, GUC_SG_INTR_MASK,  ~0);
 }
 
-static void xelp_irq_postinstall(struct xe_device *xe, struct xe_tile *tile)
-{
-	/* TODO: PCH */
-
-	gt_irq_postinstall(tile);
-
-	xelp_intr_enable(xe, true);
-}
-
 static u32
 gt_engine_identity(struct xe_device *xe,
 		   struct xe_gt *mmio,
@@ -366,14 +357,6 @@ static void dg1_intr_enable(struct xe_device *xe, bool stall)
 		xe_mmio_read32(mmio, DG1_MSTR_TILE_INTR);
 }
 
-static void dg1_irq_postinstall(struct xe_device *xe, struct xe_tile *tile)
-{
-	gt_irq_postinstall(tile);
-
-	if (tile->id == 0)
-		dg1_intr_enable(xe, true);
-}
-
 /*
  * Top-level interrupt handler for Xe_LP+ and beyond.  These platforms have
  * a "master tile" interrupt register which must be consulted before the
@@ -512,16 +495,6 @@ static void xe_irq_reset(struct xe_device *xe)
 	xe_display_irq_reset(xe);
 }
 
-void xe_gt_irq_postinstall(struct xe_tile *tile)
-{
-	struct xe_device *xe = tile_to_xe(tile);
-
-	if (GRAPHICS_VERx100(xe) >= 1210)
-		dg1_irq_postinstall(xe, tile);
-	else
-		xelp_irq_postinstall(xe, tile);
-}
-
 static void xe_irq_postinstall(struct xe_device *xe)
 {
 	struct xe_tile *tile;
@@ -538,6 +511,12 @@ static void xe_irq_postinstall(struct xe_device *xe)
 	 */
 	unmask_and_enable(xe_device_get_root_tile(xe),
 			  GU_MISC_IRQ_OFFSET, GU_MISC_GSE);
+
+	/* Enable top-level interrupts */
+	if (GRAPHICS_VERx100(xe) >= 1210)
+		dg1_intr_enable(xe, true);
+	else
+		xelp_intr_enable(xe, true);
 }
 
 static irq_handler_t xe_irq_handler(struct xe_device *xe)
-- 
2.40.0



* [Intel-xe] [PATCH 20/26] drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (18 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 19/26] drm/xe/irq: Untangle postinstall functions Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-18 19:54   ` Lucas De Marchi
  2023-05-11  3:47 ` [Intel-xe] [PATCH 21/26] drm/xe: Invalidate TLB on all affected GTs during GGTT updates Matt Roper
                   ` (14 subsequent siblings)
  34 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

The majority of xe_gt_irq_postinstall() is really focused on the
hardware engine interrupts; other GT-related interrupts such as the GuC
are enabled/disabled independently.  Renaming the function and making it
truly GT-specific will make the intended focus clearer.

Disabling/masking of other interrupts (such as GuC interrupts) is
unnecessary since that has already happened during the irq_reset stage,
and doing so will become harmful once the media GT is re-enabled since
calls to xe_gt_irq_postinstall during media GT initialization would
incorrectly disable the primary GT's GuC interrupts.

Also, since this function is called from gt_fw_domain_init(), it's not
necessary to also call it earlier during xe_irq_postinstall(); calling
it from xe_irq_resume() to handle runtime resume should be sufficient.
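
The per-GT split that the new function applies (the two conditions in
the diff below) can be summarised with a small standalone sketch.  The
enum values and the hwe_classes_to_enable() helper are hypothetical names
used only for illustration; the real function programs registers instead
of returning a bitmask.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the groups of engine-class interrupts. */
enum hwe_irq_group {
	HWE_IRQ_RENDER_COPY = 1 << 0,
	HWE_IRQ_COMPUTE     = 1 << 1,
	HWE_IRQ_VIDEO       = 1 << 2,	/* VCS and VECS */
};

/*
 * Return which groups a GT would enable:
 *  - a non-media GT handles render/copy, plus compute if it has CCS engines;
 *  - video engines are handled by the media GT, or by the primary GT on
 *    platforms whose media version is below 13 (no standalone media).
 */
static unsigned int hwe_classes_to_enable(bool is_media_gt, int media_ver,
					  bool has_compute)
{
	unsigned int groups = 0;

	assert(media_ver >= 0);

	if (!is_media_gt) {
		groups |= HWE_IRQ_RENDER_COPY;
		if (has_compute)
			groups |= HWE_IRQ_COMPUTE;
	}

	if (is_media_gt || media_ver < 13)
		groups |= HWE_IRQ_VIDEO;

	return groups;
}
```

Under these assumptions a primary GT on a media version 12 platform
enables all three groups, while on a media version 13 platform the
primary GT skips the video group and the media GT enables only that
group.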

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_gt.c        |  2 +-
 drivers/gpu/drm/xe/xe_hw_engine.c |  1 +
 drivers/gpu/drm/xe/xe_irq.c       | 91 ++++++++++++++++---------------
 drivers/gpu/drm/xe/xe_irq.h       |  3 +-
 4 files changed, 50 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index e00d260dff00..2a3457fb97fa 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -303,7 +303,7 @@ static int gt_fw_domain_init(struct xe_gt *gt)
 	gt->info.engine_mask = gt->info.__engine_mask;
 
 	/* Enables per hw engine IRQs */
-	xe_gt_irq_postinstall(gt_to_tile(gt));
+	xe_irq_enable_hwe(gt);
 
 	/* Rerun MCR init as we now have hw engine list */
 	xe_gt_mcr_init(gt);
diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
index fe8af54ea8bd..5188ee268b30 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -17,6 +17,7 @@
 #include "xe_gt.h"
 #include "xe_gt_topology.h"
 #include "xe_hw_fence.h"
+#include "xe_irq.h"
 #include "xe_lrc.h"
 #include "xe_macros.h"
 #include "xe_mmio.h"
diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
index 58745a5add87..12919ef68cff 100644
--- a/drivers/gpu/drm/xe/xe_irq.c
+++ b/drivers/gpu/drm/xe/xe_irq.c
@@ -122,13 +122,14 @@ static inline void xelp_intr_enable(struct xe_device *xe, bool stall)
 		xe_mmio_read32(mmio, GFX_MSTR_IRQ);
 }
 
-void xe_gt_irq_postinstall(struct xe_tile *tile)
+/* Enable/unmask the HWE interrupts for a specific GT's engines. */
+void xe_irq_enable_hwe(struct xe_gt *gt)
 {
-	struct xe_device *xe = tile_to_xe(tile);
-	struct xe_gt *mmio = tile->primary_gt;
+	struct xe_device *xe = gt_to_xe(gt);
+	u32 ccs_mask, bcs_mask;
 	u32 irqs, dmask, smask;
-	u32 ccs_mask = xe_hw_engine_mask_per_class(tile->primary_gt, XE_ENGINE_CLASS_COMPUTE);
-	u32 bcs_mask = xe_hw_engine_mask_per_class(tile->primary_gt, XE_ENGINE_CLASS_COPY);
+	if (!gt)
+		return;
 
 	if (xe_device_guc_submission_enabled(xe)) {
 		irqs = GT_RENDER_USER_INTERRUPT |
@@ -140,45 +141,44 @@ void xe_gt_irq_postinstall(struct xe_tile *tile)
 		       GT_WAIT_SEMAPHORE_INTERRUPT;
 	}
 
+	ccs_mask = xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_COMPUTE);
+	bcs_mask = xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_COPY);
+
 	dmask = irqs << 16 | irqs;
 	smask = irqs << 16;
 
-	/* Enable RCS, BCS, VCS and VECS class interrupts. */
-	xe_mmio_write32(mmio, RENDER_COPY_INTR_ENABLE, dmask);
-	xe_mmio_write32(mmio, VCS_VECS_INTR_ENABLE, dmask);
-	if (ccs_mask)
-		xe_mmio_write32(mmio, CCS_RSVD_INTR_ENABLE, smask);
+	if (!xe_gt_is_media_type(gt)) {
+		/* Enable classes */
+		xe_mmio_write32(gt, RENDER_COPY_INTR_ENABLE, dmask);
+		if (ccs_mask)
+			xe_mmio_write32(gt, CCS_RSVD_INTR_ENABLE, smask);
+
+		/* Unmask instances */
+		xe_mmio_write32(gt, RCS0_RSVD_INTR_MASK, ~smask);
+		xe_mmio_write32(gt, BCS_RSVD_INTR_MASK, ~smask);
+		if (bcs_mask & (BIT(1)|BIT(2)))
+			xe_mmio_write32(gt, XEHPC_BCS1_BCS2_INTR_MASK, ~dmask);
+		if (bcs_mask & (BIT(3)|BIT(4)))
+			xe_mmio_write32(gt, XEHPC_BCS3_BCS4_INTR_MASK, ~dmask);
+		if (bcs_mask & (BIT(5)|BIT(6)))
+			xe_mmio_write32(gt, XEHPC_BCS5_BCS6_INTR_MASK, ~dmask);
+		if (bcs_mask & (BIT(7)|BIT(8)))
+			xe_mmio_write32(gt, XEHPC_BCS7_BCS8_INTR_MASK, ~dmask);
+		if (ccs_mask & (BIT(0)|BIT(1)))
+			xe_mmio_write32(gt, CCS0_CCS1_INTR_MASK, ~dmask);
+		if (ccs_mask & (BIT(2)|BIT(3)))
+			xe_mmio_write32(gt,  CCS2_CCS3_INTR_MASK, ~dmask);
+	}
 
-	/* Unmask irqs on RCS, BCS, VCS and VECS engines. */
-	xe_mmio_write32(mmio, RCS0_RSVD_INTR_MASK, ~smask);
-	xe_mmio_write32(mmio, BCS_RSVD_INTR_MASK, ~smask);
-	if (bcs_mask & (BIT(1)|BIT(2)))
-		xe_mmio_write32(mmio, XEHPC_BCS1_BCS2_INTR_MASK, ~dmask);
-	if (bcs_mask & (BIT(3)|BIT(4)))
-		xe_mmio_write32(mmio, XEHPC_BCS3_BCS4_INTR_MASK, ~dmask);
-	if (bcs_mask & (BIT(5)|BIT(6)))
-		xe_mmio_write32(mmio, XEHPC_BCS5_BCS6_INTR_MASK, ~dmask);
-	if (bcs_mask & (BIT(7)|BIT(8)))
-		xe_mmio_write32(mmio, XEHPC_BCS7_BCS8_INTR_MASK, ~dmask);
-	xe_mmio_write32(mmio, VCS0_VCS1_INTR_MASK, ~dmask);
-	xe_mmio_write32(mmio, VCS2_VCS3_INTR_MASK, ~dmask);
-	xe_mmio_write32(mmio, VECS0_VECS1_INTR_MASK, ~dmask);
-	if (ccs_mask & (BIT(0)|BIT(1)))
-		xe_mmio_write32(mmio, CCS0_CCS1_INTR_MASK, ~dmask);
-	if (ccs_mask & (BIT(2)|BIT(3)))
-		xe_mmio_write32(mmio,  CCS2_CCS3_INTR_MASK, ~dmask);
+	if (xe_gt_is_media_type(gt) || MEDIA_VER(xe) < 13) {
+		/* Enable classes */
+		xe_mmio_write32(gt, VCS_VECS_INTR_ENABLE, dmask);
 
-	/*
-	 * RPS interrupts will get enabled/disabled on demand when RPS itself
-	 * is enabled/disabled.
-	 */
-	/* TODO: gt->pm_ier, gt->pm_imr */
-	xe_mmio_write32(mmio, GPM_WGBOXPERF_INTR_ENABLE, 0);
-	xe_mmio_write32(mmio, GPM_WGBOXPERF_INTR_MASK,  ~0);
-
-	/* Same thing for GuC interrupts */
-	xe_mmio_write32(mmio, GUC_SG_INTR_ENABLE, 0);
-	xe_mmio_write32(mmio, GUC_SG_INTR_MASK,  ~0);
+		/* Unmask instances */
+		xe_mmio_write32(gt, VCS0_VCS1_INTR_MASK, ~dmask);
+		xe_mmio_write32(gt, VCS2_VCS3_INTR_MASK, ~dmask);
+		xe_mmio_write32(gt, VECS0_VECS1_INTR_MASK, ~dmask);
+	}
 }
 
 static u32
@@ -497,12 +497,6 @@ static void xe_irq_reset(struct xe_device *xe)
 
 static void xe_irq_postinstall(struct xe_device *xe)
 {
-	struct xe_tile *tile;
-	u8 id;
-
-	for_each_tile(tile, xe, id)
-		xe_gt_irq_postinstall(tile);
-
 	xe_display_irq_postinstall(xe, xe_primary_mmio_gt(xe));
 
 	/*
@@ -591,9 +585,16 @@ void xe_irq_suspend(struct xe_device *xe)
 
 void xe_irq_resume(struct xe_device *xe)
 {
+	struct xe_gt *gt;
+	int id;
+
 	spin_lock_irq(&xe->irq.lock);
 	xe->irq.enabled = true;
 	xe_irq_reset(xe);
 	xe_irq_postinstall(xe);
+
+	for_each_gt(gt, xe, id)
+		xe_irq_enable_hwe(gt);
+
 	spin_unlock_irq(&xe->irq.lock);
 }
diff --git a/drivers/gpu/drm/xe/xe_irq.h b/drivers/gpu/drm/xe/xe_irq.h
index 69113c21e1cd..bc42bc90d967 100644
--- a/drivers/gpu/drm/xe/xe_irq.h
+++ b/drivers/gpu/drm/xe/xe_irq.h
@@ -8,11 +8,12 @@
 
 struct xe_device;
 struct xe_tile;
+struct xe_gt;
 
 int xe_irq_install(struct xe_device *xe);
-void xe_gt_irq_postinstall(struct xe_tile *tile);
 void xe_irq_shutdown(struct xe_device *xe);
 void xe_irq_suspend(struct xe_device *xe);
 void xe_irq_resume(struct xe_device *xe);
+void xe_irq_enable_hwe(struct xe_gt *gt);
 
 #endif
-- 
2.40.0



* [Intel-xe] [PATCH 21/26] drm/xe: Invalidate TLB on all affected GTs during GGTT updates
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (19 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 20/26] drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-11  3:47 ` [Intel-xe] [PATCH 22/26] drm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations Matt Roper
                   ` (13 subsequent siblings)
  34 siblings, 0 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

The GGTT is part of the tile and is shared by the primary and media GTs
on platforms with a standalone media architecture.  However, each of
these GTs has its own TLBs caching the page table lookups, and each
needs to be invalidated separately.
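
The shape of the change can be sketched in standalone C as follows.  The
struct definitions are trimmed-down, hypothetical stand-ins for the
driver's structures, and a counter takes the place of the real
invalidation; the point is only that one GGTT update fans out to every
GT in the tile and that a missing media GT is skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a GT that counts invalidations it received. */
struct gt {
	unsigned int tlb_invalidations;
};

/* A tile always has a primary GT; media_gt is NULL without standalone media. */
struct tile {
	struct gt *primary_gt;
	struct gt *media_gt;
};

/* Invalidate one GT's TLB; a NULL GT is silently skipped. */
static void invalidate_gt_tlb(struct gt *gt)
{
	if (!gt)
		return;

	gt->tlb_invalidations++;
}

/* One GGTT update must reach the TLB of every GT in the tile. */
static void ggtt_invalidate(struct tile *tile)
{
	assert(tile != NULL);

	invalidate_gt_tlb(tile->primary_gt);
	invalidate_gt_tlb(tile->media_gt);
}
```

Calling ggtt_invalidate() on a tile with both GTs bumps both counters; on
a tile whose media_gt is NULL only the primary GT's counter changes.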

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_ggtt.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index 7c87623ef5c5..31f958613c2f 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -188,13 +188,10 @@ int xe_ggtt_init(struct xe_ggtt *ggtt)
 #define PVC_GUC_TLB_INV_DESC1			XE_REG(0xcf80)
 #define   PVC_GUC_TLB_INV_DESC1_INVALIDATE	REG_BIT(6)
 
-void xe_ggtt_invalidate(struct xe_ggtt *ggtt)
+static void ggtt_invalidate_gt_tlb(struct xe_gt *gt)
 {
-	/*
-	 * TODO: Loop over each GT in tile once media GT support is
-	 * re-added
-	 */
-	struct xe_gt *gt = ggtt->tile->primary_gt;
+	if (!gt)
+		return;
 
 	/* TODO: vfunc for GuC vs. non-GuC */
 
@@ -219,6 +216,13 @@ void xe_ggtt_invalidate(struct xe_ggtt *ggtt)
 	}
 }
 
+void xe_ggtt_invalidate(struct xe_ggtt *ggtt)
+{
+	/* Each GT in a tile has its own TLB to cache GGTT lookups */
+	ggtt_invalidate_gt_tlb(ggtt->tile->primary_gt);
+	ggtt_invalidate_gt_tlb(ggtt->tile->media_gt);
+}
+
 void xe_ggtt_printk(struct xe_ggtt *ggtt, const char *prefix)
 {
 	u64 addr, scratch_pte;
-- 
2.40.0
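
Illustration only, not part of the patch: the commit message above says the
GGTT belongs to the tile while each GT caches lookups in its own TLB. A
minimal C model of that fan-out is sketched below. The toy_* names are
hypothetical stand-ins, and a counter replaces the real GuC/MMIO invalidation,
which this sketch does not attempt to reproduce.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver's GT and tile structures. */
struct toy_gt {
	unsigned int tlb_invalidations;	/* invalidations seen by this GT */
};

struct toy_tile {
	struct toy_gt *primary_gt;	/* always present */
	struct toy_gt *media_gt;	/* NULL without a standalone media GT */
};

/* Invalidate one GT's TLB; a missing GT is silently skipped. */
static void toy_invalidate_gt_tlb(struct toy_gt *gt)
{
	if (!gt)
		return;

	gt->tlb_invalidations++;
}

/* The GGTT is per-tile, so every GT in the tile has to be invalidated. */
static void toy_ggtt_invalidate(struct toy_tile *tile)
{
	toy_invalidate_gt_tlb(tile->primary_gt);
	toy_invalidate_gt_tlb(tile->media_gt);
}
```

Tolerating a NULL GT inside the helper keeps the caller free of per-platform
checks: a tile without a media GT simply performs one invalidation.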



* [Intel-xe] [PATCH 22/26] drm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (20 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 21/26] drm/xe: Invalidate TLB on all affected GTs during GGTT updates Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-11  3:47 ` [Intel-xe] [PATCH 23/26] drm/xe: Allow GT looping and lookup on standalone media Matt Roper
                   ` (12 subsequent siblings)
  34 siblings, 0 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi, matthew.d.roper

Updates to the GGTT can happen when there are no in-flight jobs keeping
the hardware awake.  If the GT is powered down when invalidation is
requested, we will not be able to communicate with the GuC (or MMIO) and
the invalidation request will go missing.  Explicitly grab GT forcewake
to ensure the GT and GuC are powered up during the TLB invalidation.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_ggtt.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index 31f958613c2f..8f8d0f6a82cd 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -193,6 +193,13 @@ static void ggtt_invalidate_gt_tlb(struct xe_gt *gt)
 	if (!gt)
 		return;
 
+	/*
+	 * Invalidation can happen when there's no in-flight work keeping the
+	 * GT awake.  We need to explicitly grab forcewake to ensure the GT
+	 * and GuC are accessible.
+	 */
+	xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
+
 	/* TODO: vfunc for GuC vs. non-GuC */
 
 	if (gt->uc.guc.submission_state.enabled) {
@@ -214,6 +221,8 @@ static void ggtt_invalidate_gt_tlb(struct xe_gt *gt)
 			xe_mmio_write32(gt, GUC_TLB_INV_CR,
 					GUC_TLB_INV_CR_INVALIDATE);
 	}
+
+	xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
 }
 
 void xe_ggtt_invalidate(struct xe_ggtt *ggtt)
-- 
2.40.0
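
Illustration only, not part of the patch: the failure mode described in the
commit message (a request sent to a powered-down GT goes missing) can be
modelled with a reference-counted wake lock. Everything named toy_* is
hypothetical; the real xe_force_wake_get()/xe_force_wake_put() calls take a
domain mask and talk to hardware, which this sketch leaves out.

```c
#include <stdbool.h>

struct toy_fw {
	unsigned int refcount;	/* GT is held awake while non-zero */
};

struct toy_gt {
	struct toy_fw fw;
	unsigned int tlb_invalidations;	/* requests that actually landed */
};

static void toy_fw_get(struct toy_fw *fw)
{
	fw->refcount++;
}

static void toy_fw_put(struct toy_fw *fw)
{
	if (fw->refcount)
		fw->refcount--;
}

/* Models the raw request: it only lands while the GT is awake. */
static bool toy_write_invalidate(struct toy_gt *gt)
{
	if (gt->fw.refcount == 0)
		return false;	/* GT powered down, request is lost */

	gt->tlb_invalidations++;
	return true;
}

/* Hold the wake reference across the request, then release it. */
static bool toy_invalidate_gt_tlb(struct toy_gt *gt)
{
	bool landed;

	toy_fw_get(&gt->fw);
	landed = toy_write_invalidate(gt);
	toy_fw_put(&gt->fw);

	return landed;
}
```

Calling toy_write_invalidate() on an idle GT returns false, while the wrapped
toy_invalidate_gt_tlb() succeeds and leaves the reference count balanced.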



* [Intel-xe] [PATCH 23/26] drm/xe: Allow GT looping and lookup on standalone media
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (21 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 22/26] drm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-11  3:47 ` [Intel-xe] [PATCH 24/26] drm/xe: Update query uapi to support " Matt Roper
                   ` (11 subsequent siblings)
  34 siblings, 0 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

Allow xe_device_get_gt() and for_each_gt() to operate as expected on
platforms with standalone media.

FIXME: We need to figure out a consistent ID scheme for GTs.  At the
moment our platforms either have multi-tile (i.e., PVC) or standalone
media (MTL) but not both.  If a future platform supports both of these
capabilities at the same time, how will we number the GTs of the
platform?   primary-primary-media-media?  primary-media-primary-media?
For that matter should we even still be exposing the concept of 'GT' to
userspace or should that switch to tile instead (and keep the hardware's
separation of render and media an internal implementation detail like it
is on i915)?  If we only expose tiles to userspace and not GTs, then we
may not even need per-GT ID numbers anymore.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_device.h | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 370b9ccb875b..5c39c03f4a5d 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -55,16 +55,27 @@ static inline struct xe_tile *xe_device_get_root_tile(struct xe_device *xe)
 
 static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
 {
+	struct xe_tile *root_tile = xe_device_get_root_tile(xe);
 	struct xe_gt *gt;
 
-	XE_BUG_ON(gt_id > XE_MAX_TILES_PER_DEVICE);
+	if (drm_WARN_ON(&xe->drm, gt_id > XE_MAX_TILES_PER_DEVICE))
+		return root_tile->primary_gt;
 
-	gt = xe->tiles[gt_id].primary_gt;
-	if (drm_WARN_ON(&xe->drm, !gt))
+	/*
+	 * FIXME: This only works for now because multi-tile and standalone
+	 * media are mutually exclusive on the platforms we have today.
+	 */
+	if (MEDIA_VER(xe) >= 13) {
+		gt = gt_id ? root_tile->media_gt : root_tile->primary_gt;
+	} else {
+		gt = xe->tiles[gt_id].primary_gt;
+	}
+
+	if (!gt)
 		return NULL;
 
-	XE_BUG_ON(gt->info.id != gt_id);
-	XE_BUG_ON(gt->info.type == XE_GT_TYPE_UNINITIALIZED);
+	drm_WARN_ON(&xe->drm, gt->info.id != gt_id);
+	drm_WARN_ON(&xe->drm, gt->info.type == XE_GT_TYPE_UNINITIALIZED);
 
 	return gt;
 }
@@ -96,8 +107,12 @@ static inline void xe_device_guc_submission_disable(struct xe_device *xe)
 	for ((id__) = 0; (id__) < (xe__)->info.tile_count; (id__++)) \
 		for_each_if ((tile__) = &(xe__)->tiles[(id__)])
 
+/*
+ * FIXME: This only works for now since multi-tile and standalone media
+ * happen to be mutually exclusive.  Future platforms may change this...
+ */
 #define for_each_gt(gt__, xe__, id__) \
-	for ((id__) = 0; (id__) < (xe__)->info.tile_count; (id__++)) \
+	for ((id__) = 0; (id__) < (xe__)->info.tile_count + (MEDIA_VER(xe__) >= 13 ? 1 : 0); (id__++)) \
 		for_each_if ((gt__) = xe_device_get_gt((xe__), (id__)))
 
 static inline struct xe_force_wake * gt_to_fw(struct xe_gt *gt)
-- 
2.40.0
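
Illustration only, not part of the patch: the interim ID scheme from the
FIXME above can be modelled in a few lines of C. All toy_* names are
hypothetical, and the standalone_media flag stands in for the
MEDIA_VER(xe) >= 13 check. Unlike the patch, which warns and falls back to
the root tile's primary GT, this sketch returns NULL for an out-of-range ID
and uses >= in the bounds check; both are choices made for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_TILES 2

struct toy_gt {
	int id;
};

struct toy_tile {
	struct toy_gt *primary_gt;
	struct toy_gt *media_gt;	/* NULL without standalone media */
};

struct toy_device {
	struct toy_tile tiles[TOY_MAX_TILES];
	int tile_count;
	bool standalone_media;	/* stands in for MEDIA_VER(xe) >= 13 */
};

/* Map a GT ID to a GT; multi-tile and standalone media never coexist here. */
static struct toy_gt *toy_get_gt(struct toy_device *xe, int gt_id)
{
	if (gt_id < 0 || gt_id >= TOY_MAX_TILES)
		return NULL;

	if (xe->standalone_media)
		return gt_id ? xe->tiles[0].media_gt : xe->tiles[0].primary_gt;

	return xe->tiles[gt_id].primary_gt;
}

/* Upper bound for GT IDs: one per tile, plus one for a media GT. */
static int toy_gt_id_limit(const struct toy_device *xe)
{
	return xe->tile_count + (xe->standalone_media ? 1 : 0);
}

/* Equivalent of walking for_each_gt() and counting the GTs found. */
static int toy_count_gts(struct toy_device *xe)
{
	int id, found = 0;

	for (id = 0; id < toy_gt_id_limit(xe); id++)
		if (toy_get_gt(xe, id))
			found++;

	return found;
}
```

A two-tile device and a single-tile device with a media GT both yield two
GTs, with ID 1 meaning "second tile's primary GT" in one case and "root
tile's media GT" in the other. That ambiguity is what the FIXME is about.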



* [Intel-xe] [PATCH 24/26] drm/xe: Update query uapi to support standalone media
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (22 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 23/26] drm/xe: Allow GT looping and lookup on standalone media Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-11  3:47 ` [Intel-xe] [PATCH 25/26] drm/xe: Reinstate media GT support Matt Roper
                   ` (10 subsequent siblings)
  34 siblings, 0 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

Now that a higher GT count can result from either multiple tiles (with
one GT each) or an extra media GT within the root tile, we need to
update the query code slightly to stop looking at tile_count.

FIXME: As noted previously, we need to decide on a formal direction for
exposing tiles and/or GTs to userspace.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 4d8473328962..d3d39eb44b39 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -26,6 +26,18 @@ static const enum xe_engine_class xe_to_user_engine_class[] = {
 	[XE_ENGINE_CLASS_COMPUTE] = DRM_XE_ENGINE_CLASS_COMPUTE,
 };
 
+static int num_gt(struct xe_device *xe)
+{
+	int num = xe->info.tile_count;
+
+	if (xe_device_get_root_tile(xe)->media_gt) {
+		drm_WARN_ON(&xe->drm, num > 1);
+		num++;
+	}
+
+	return num;
+}
+
 static size_t calc_hw_engine_info_size(struct xe_device *xe)
 {
 	struct xe_hw_engine *hwe;
@@ -192,7 +204,7 @@ static int query_config(struct xe_device *xe, struct drm_xe_device_query *query)
 		xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K ? SZ_64K : SZ_4K;
 	config->info[XE_QUERY_CONFIG_VA_BITS] = 12 +
 		(9 * (xe->info.vm_max_level + 1));
-	config->info[XE_QUERY_CONFIG_GT_COUNT] = xe->info.tile_count;
+	config->info[XE_QUERY_CONFIG_GT_COUNT] = num_gt(xe);
 	config->info[XE_QUERY_CONFIG_MEM_REGION_COUNT] =
 		hweight_long(xe->info.mem_region_mask);
 	config->info[XE_QUERY_CONFIG_MAX_ENGINE_PRIORITY] =
@@ -211,7 +223,7 @@ static int query_gts(struct xe_device *xe, struct drm_xe_device_query *query)
 {
 	struct xe_gt *gt;
 	size_t size = sizeof(struct drm_xe_query_gts) +
-		xe->info.tile_count * sizeof(struct drm_xe_query_gt);
+		num_gt(xe) * sizeof(struct drm_xe_query_gt);
 	struct drm_xe_query_gts __user *query_ptr =
 		u64_to_user_ptr(query->data);
 	struct drm_xe_query_gts *gts;
@@ -228,14 +240,12 @@ static int query_gts(struct xe_device *xe, struct drm_xe_device_query *query)
 	if (XE_IOCTL_ERR(xe, !gts))
 		return -ENOMEM;
 
-	gts->num_gt = xe->info.tile_count;
+	gts->num_gt = num_gt(xe);
 	for_each_gt(gt, xe, id) {
-		if (id == 0)
-			gts->gts[id].type = XE_QUERY_GT_TYPE_MAIN;
-		else if (xe_gt_is_media_type(gt))
+		if (xe_gt_is_media_type(gt))
 			gts->gts[id].type = XE_QUERY_GT_TYPE_MEDIA;
 		else
-			gts->gts[id].type = XE_QUERY_GT_TYPE_REMOTE;
+			gts->gts[id].type = XE_QUERY_GT_TYPE_MAIN;
 		gts->gts[id].instance = id;
 		gts->gts[id].clock_freq = gt->info.clock_freq;
 		if (!IS_DGFX(xe))
@@ -290,7 +300,7 @@ static int query_hwconfig(struct xe_device *xe,
 
 static size_t calc_topo_query_size(struct xe_device *xe)
 {
-	return xe->info.tile_count *
+	return num_gt(xe) *
 		(3 * sizeof(struct drm_xe_query_topology_mask) +
 		 sizeof_field(struct xe_gt, fuse_topo.g_dss_mask) +
 		 sizeof_field(struct xe_gt, fuse_topo.c_dss_mask) +
-- 
2.40.0
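
Illustration only, not part of the patch: the sizing argument in the commit
message is that a buffer dimensioned by tile count becomes too small once a
media GT exists. A sketch with a hypothetical query layout (toy_query_gt and
toy_query_gts_hdr are invented for this example and do not match the real
uapi structures) is below.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a small header followed by one entry per GT. */
struct toy_query_gt {
	uint16_t type;
	uint16_t instance;
	uint32_t clock_freq;
};

struct toy_query_gts_hdr {
	uint32_t num_gt;
	uint32_t pad;
};

enum toy_gt_type { TOY_GT_TYPE_MAIN, TOY_GT_TYPE_MEDIA };

/* One GT per tile, plus one if the root tile has a standalone media GT. */
static int toy_num_gt(int tile_count, int has_media_gt)
{
	return tile_count + (has_media_gt ? 1 : 0);
}

/* The buffer size has to follow the GT count, not the tile count. */
static size_t toy_query_gts_size(int tile_count, int has_media_gt)
{
	return sizeof(struct toy_query_gts_hdr) +
	       (size_t)toy_num_gt(tile_count, has_media_gt) *
	       sizeof(struct toy_query_gt);
}

/* As in the patch, every non-media GT is reported as a main GT. */
static enum toy_gt_type toy_report_type(int is_media)
{
	return is_media ? TOY_GT_TYPE_MEDIA : TOY_GT_TYPE_MAIN;
}
```

With one tile and a media GT, toy_query_gts_size() reserves two entries,
whereas a tile_count-based calculation would reserve only one.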



* [Intel-xe] [PATCH 25/26] drm/xe: Reinstate media GT support
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (23 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 24/26] drm/xe: Update query uapi to support " Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-11  3:47 ` [Intel-xe] [PATCH 26/26] drm/xe: Clarify source of GT log messages Matt Roper
                   ` (9 subsequent siblings)
  34 siblings, 0 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

Now that tiles and GTs are handled separately and other prerequisite
changes are in place, we're ready to re-enable the media GT.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/regs/xe_gt_regs.h |  8 ++++++++
 drivers/gpu/drm/xe/xe_pci.c          | 26 +++++++++++++++++++++++++-
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
index 4d87f1fe010d..26247725e0d8 100644
--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
@@ -8,6 +8,14 @@
 
 #include "regs/xe_reg_defs.h"
 
+/*
+ * The GSI register range [0x0 - 0x40000) is replicated at a higher offset
+ * for the media GT.  xe_mmio and xe_gt_mcr functions will automatically
+ * translate offsets by MEDIA_GT_GSI_OFFSET when operating on the media GT.
+ */
+#define MEDIA_GT_GSI_OFFSET				0x380000
+#define MEDIA_GT_GSI_LENGTH				0x40000
+
 /* RPM unit config (Gen8+) */
 #define RPM_CONFIG0					XE_REG(0xd00)
 #define   RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK		REG_GENMASK(5, 3)
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index 7d5e65d34f39..e24f51560ed8 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -530,7 +530,31 @@ static int xe_info_init(struct xe_device *xe,
 		if (MEDIA_VER(xe) < 13 && media_desc)
 			gt->info.__engine_mask |= media_desc->hw_engine_mask;
 
-		/* TODO: Init media GT, if present */
+		if (MEDIA_VER(xe) < 13 || !media_desc)
+			continue;
+
+		/*
+		 * Allocate and setup media GT for platforms with standalone
+		 * media.
+		 */
+		tile->media_gt = xe_gt_alloc(tile);
+		if (IS_ERR(tile->media_gt))
+			return PTR_ERR(tile->media_gt);
+
+		gt = tile->media_gt;
+		gt->info.type = XE_GT_TYPE_MEDIA;
+		gt->info.__engine_mask = media_desc->hw_engine_mask;
+		gt->mmio.adj_offset = MEDIA_GT_GSI_OFFSET;
+		gt->mmio.adj_limit = MEDIA_GT_GSI_LENGTH;
+
+		/*
+		 * FIXME: At the moment multi-tile and standalone media are
+		 * mutually exclusive on current platforms.  We'll need to
+		 * come up with a better way to number GTs if we ever wind
+		 * up with platforms that support both together.
+		 */
+		drm_WARN_ON(&xe->drm, id != 0);
+		gt->info.id = 1;
 	}
 
 	return 0;
-- 
2.40.0
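
Illustration only, not part of the patch: the comment added to xe_gt_regs.h
says offsets in the GSI range [0x0 - 0x40000) are shifted by
MEDIA_GT_GSI_OFFSET for the media GT. The sketch below shows that translation
under the assumption that it is a simple "below limit, add offset" rule. The
toy_* names are hypothetical; only the two constants and the 0xd00 register
offset are taken from the patch.

```c
#include <stdint.h>

#define TOY_MEDIA_GT_GSI_OFFSET	0x380000u
#define TOY_MEDIA_GT_GSI_LENGTH	0x40000u

/* Per-GT adjustment; both fields are zero for a primary GT. */
struct toy_gt_mmio {
	uint32_t adj_limit;	/* offsets below this are translated */
	uint32_t adj_offset;	/* amount added to translated offsets */
};

/* Translate a register offset for the GT that owns this mmio state. */
static uint32_t toy_translate(const struct toy_gt_mmio *mmio, uint32_t addr)
{
	if (addr < mmio->adj_limit)
		addr += mmio->adj_offset;

	return addr;
}
```

For a media GT configured with the two constants, offset 0xd00 becomes
0x380d00, while an offset at or above 0x40000 passes through unchanged. A
primary GT with both fields zero never translates anything.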



* [Intel-xe] [PATCH 26/26] drm/xe: Clarify source of GT log messages
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (24 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 25/26] drm/xe: Reinstate media GT support Matt Roper
@ 2023-05-11  3:47 ` Matt Roper
  2023-05-17  9:33   ` Michal Wajdeczko
  2023-05-11  3:50 ` [Intel-xe] ✓ CI.Patch_applied: success for Separate GT and tile Patchwork
                   ` (8 subsequent siblings)
  34 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-11  3:47 UTC (permalink / raw)
  To: intel-xe; +Cc: matthew.d.roper

The various functions in xe_gt.h can print a lot of important error and
information messages; ensure that we always include the GT ID in those
prints for clarity.

In the future we may want to place the new macros in a dedicated header
like we've done in i915.  For now we're just using them within this one
file, so including them at the top of the .c is fine.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/xe/xe_gt.c | 52 ++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 2a3457fb97fa..edcb8ccde346 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -42,6 +42,11 @@
 #include "xe_wa.h"
 #include "xe_wopcm.h"
 
+#define gt_info(_gt, _fmt, ...) \
+	drm_info(&gt_to_xe(_gt)->drm, "GT%u (Tile%u): " _fmt, (_gt)->info.id, gt_to_tile(_gt)->id, ##__VA_ARGS__)
+#define gt_err(_gt, _fmt, ...) \
+	drm_err(&gt_to_xe(_gt)->drm, "GT%u (Tile%u): " _fmt, (_gt)->info.id, gt_to_tile(_gt)->id, ##__VA_ARGS__)
+
 struct xe_gt *xe_gt_alloc(struct xe_tile *tile)
 {
 	struct xe_gt *gt;
@@ -193,16 +198,16 @@ int xe_gt_record_default_lrcs(struct xe_gt *gt)
 				     hwe, ENGINE_FLAG_WA);
 		if (IS_ERR(e)) {
 			err = PTR_ERR(e);
-			drm_err(&xe->drm, "gt%d, hwe %s, xe_engine_create,e failed=%d",
-				gt->info.id, hwe->name, err);
+			gt_err(gt, "hwe %s, xe_engine_create,e failed=%d",
+			       hwe->name, err);
 			goto put_vm;
 		}
 
 		/* Prime golden LRC with known good state */
 		err = emit_wa_job(gt, e);
 		if (err) {
-			drm_err(&xe->drm, "gt%d, hwe %s, guc_id=%d, emit_wa_job,e failed=%d",
-				gt->info.id, hwe->name, e->guc->id, err);
+			gt_err(gt, "hwe %s, guc_id=%d, emit_wa_job,e failed=%d",
+				hwe->name, e->guc->id, err);
 			goto put_engine;
 		}
 
@@ -210,24 +215,24 @@ int xe_gt_record_default_lrcs(struct xe_gt *gt)
 					 1, hwe, ENGINE_FLAG_WA);
 		if (IS_ERR(nop_e)) {
 			err = PTR_ERR(nop_e);
-			drm_err(&xe->drm, "gt%d, hwe %s, xe_engine_create,nop_e failed=%d",
-				gt->info.id, hwe->name, err);
+			gt_err(gt, "hwe %s, xe_engine_create,nop_e failed=%d",
+				hwe->name, err);
 			goto put_engine;
 		}
 
 		/* Switch to different LRC */
 		err = emit_nop_job(gt, nop_e);
 		if (err) {
-			drm_err(&xe->drm, "gt%d, hwe %s, guc_id=%d, emit_nop_job,nop_e failed=%d",
-				gt->info.id, hwe->name, nop_e->guc->id, err);
+			gt_err(gt, "hwe %s, guc_id=%d, emit_nop_job,nop_e failed=%d",
+				hwe->name, nop_e->guc->id, err);
 			goto put_nop_e;
 		}
 
 		/* Reload golden LRC to record the effect of any indirect W/A */
 		err = emit_nop_job(gt, e);
 		if (err) {
-			drm_err(&xe->drm, "gt%d, hwe %s, guc_id=%d, emit_nop_job,e failed=%d",
-				gt->info.id, hwe->name, e->guc->id, err);
+			gt_err(gt, "hwe %s, guc_id=%d, emit_nop_job,e failed=%d",
+				hwe->name, e->guc->id, err);
 			goto put_nop_e;
 		}
 
@@ -443,15 +448,13 @@ int xe_gt_init(struct xe_gt *gt)
 
 static int do_gt_reset(struct xe_gt *gt)
 {
-	struct xe_device *xe = gt_to_xe(gt);
 	int err;
 
 	xe_mmio_write32(gt, GDRST, GRDOM_FULL);
 	err = xe_mmio_wait32(gt, GDRST, 0, GRDOM_FULL, 5000,
 			     NULL, false);
 	if (err)
-		drm_err(&xe->drm,
-			"GT reset failed to clear GEN11_GRDOM_FULL\n");
+		gt_err(gt, "reset failed to clear GRDOM_FULL\n");
 
 	return err;
 }
@@ -494,14 +497,13 @@ static int do_gt_restart(struct xe_gt *gt)
 
 static int gt_reset(struct xe_gt *gt)
 {
-	struct xe_device *xe = gt_to_xe(gt);
 	int err;
 
 	/* We only support GT resets with GuC submission */
 	if (!xe_device_guc_submission_enabled(gt_to_xe(gt)))
 		return -ENODEV;
 
-	drm_info(&xe->drm, "GT reset started\n");
+	gt_info(gt, "reset started\n");
 
 	xe_gt_sanitize(gt);
 
@@ -530,7 +532,7 @@ static int gt_reset(struct xe_gt *gt)
 	err = xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 	XE_WARN_ON(err);
 
-	drm_info(&xe->drm, "GT reset done\n");
+	gt_info(gt, "reset done\n");
 
 	return 0;
 
@@ -539,7 +541,7 @@ static int gt_reset(struct xe_gt *gt)
 err_msg:
 	XE_WARN_ON(xe_uc_start(&gt->uc));
 	xe_device_mem_access_put(gt_to_xe(gt));
-	drm_err(&xe->drm, "GT reset failed, err=%d\n", err);
+	gt_err(gt, "reset failed, err=%d\n", err);
 
 	return err;
 }
@@ -553,15 +555,13 @@ static void gt_reset_worker(struct work_struct *w)
 
 void xe_gt_reset_async(struct xe_gt *gt)
 {
-	struct xe_device *xe = gt_to_xe(gt);
-
-	drm_info(&xe->drm, "Try GT reset\n");
+	gt_info(gt, "Try GT reset\n");
 
 	/* Don't do a reset while one is already in flight */
 	if (xe_uc_reset_prepare(&gt->uc))
 		return;
 
-	drm_info(&xe->drm, "Doing GT reset\n");
+	gt_info(gt, "Doing GT reset\n");
 	queue_work(gt->ordered_wq, &gt->reset.worker);
 }
 
@@ -578,7 +578,6 @@ void xe_gt_suspend_prepare(struct xe_gt *gt)
 
 int xe_gt_suspend(struct xe_gt *gt)
 {
-	struct xe_device *xe = gt_to_xe(gt);
 	int err;
 
 	/* For now suspend/resume is only allowed with GuC */
@@ -598,7 +597,7 @@ int xe_gt_suspend(struct xe_gt *gt)
 
 	xe_device_mem_access_put(gt_to_xe(gt));
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
-	drm_info(&xe->drm, "GT suspended\n");
+	gt_info(gt, "suspended\n");
 
 	return 0;
 
@@ -606,14 +605,13 @@ int xe_gt_suspend(struct xe_gt *gt)
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
 err_msg:
 	xe_device_mem_access_put(gt_to_xe(gt));
-	drm_err(&xe->drm, "GT suspend failed: %d\n", err);
+	gt_err(gt, "suspend failed: %d\n", err);
 
 	return err;
 }
 
 int xe_gt_resume(struct xe_gt *gt)
 {
-	struct xe_device *xe = gt_to_xe(gt);
 	int err;
 
 	xe_device_mem_access_get(gt_to_xe(gt));
@@ -627,7 +625,7 @@ int xe_gt_resume(struct xe_gt *gt)
 
 	xe_device_mem_access_put(gt_to_xe(gt));
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
-	drm_info(&xe->drm, "GT resumed\n");
+	gt_info(gt, "resumed\n");
 
 	return 0;
 
@@ -635,7 +633,7 @@ int xe_gt_resume(struct xe_gt *gt)
 	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
 err_msg:
 	xe_device_mem_access_put(gt_to_xe(gt));
-	drm_err(&xe->drm, "GT resume failed: %d\n", err);
+	gt_err(gt, "resume failed: %d\n", err);
 
 	return err;
 }
-- 
2.40.0
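
Illustration only, not part of the patch: the prefixing done by gt_info() and
gt_err() can be tried in userspace with snprintf(). The toy_* names are
hypothetical, and the macro formats into a caller-supplied buffer instead of
calling drm_info()/drm_err(). Like the macros in the patch, it relies on the
GNU "##__VA_ARGS__" extension, so it assumes GCC or Clang.

```c
#include <stdio.h>

struct toy_tile {
	unsigned int id;
};

struct toy_gt {
	unsigned int id;
	struct toy_tile *tile;
};

/* Prepend "GT%u (Tile%u): " to the caller's format string. */
#define toy_gt_fmt(buf, len, gt, fmt, ...) \
	snprintf((buf), (len), "GT%u (Tile%u): " fmt, \
		 (gt)->id, (gt)->tile->id, ##__VA_ARGS__)
```

Because the prefix is joined to the format by string-literal concatenation,
the format argument has to be a literal, which is also true of the macros in
the patch.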



* [Intel-xe] ✓ CI.Patch_applied: success for Separate GT and tile
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (25 preceding siblings ...)
  2023-05-11  3:47 ` [Intel-xe] [PATCH 26/26] drm/xe: Clarify source of GT log messages Matt Roper
@ 2023-05-11  3:50 ` Patchwork
  2023-05-11  3:51 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork
                   ` (7 subsequent siblings)
  34 siblings, 0 replies; 75+ messages in thread
From: Patchwork @ 2023-05-11  3:50 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

== Series Details ==

Series: Separate GT and tile
URL   : https://patchwork.freedesktop.org/series/117614/
State : success

== Summary ==

=== Applying kernel patches on branch 'drm-xe-next' with base: ===
Base commit: f4ccda3ae drm/xe: fix tlb_invalidation_seqno_past()
=== git am output follows ===
Applying: drm/xe/mtl: Disable media GT
Applying: drm/xe: Introduce xe_tile
Applying: drm/xe: Add backpointer from gt to tile
Applying: drm/xe: Add for_each_tile iterator
Applying: drm/xe: Move register MMIO into xe_tile
Applying: drm/xe: Move VRAM from GT to tile
Applying: drm/xe: Memory allocations are tile-based, not GT-based
Applying: drm/xe: Move migration from GT to tile
Applying: drm/xe: Clarify 'gt' retrieval for primary tile
Applying: drm/xe: Drop vram_id
Applying: drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE
Applying: drm/xe: Allocate GT dynamically
Applying: drm/xe: Add media GT to tile
Applying: drm/xe: Move display IRQ postinstall out of GT function
Applying: drm/xe: Interrupts are delivered per-tile, not per-GT
Applying: drm/xe/irq: Handle ASLE backlight interrupts at same time as display
Applying: drm/xe/irq: Actually call xe_irq_postinstall()
Applying: drm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt mask
Applying: drm/xe/irq: Untangle postinstall functions
Applying: drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe
Applying: drm/xe: Invalidate TLB on all affected GTs during GGTT updates
Applying: drm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations
Applying: drm/xe: Allow GT looping and lookup on standalone media
Applying: drm/xe: Update query uapi to support standalone media
Applying: drm/xe: Reinstate media GT support
Applying: drm/xe: Clarify source of GT log messages




* [Intel-xe] ✗ CI.KUnit: failure for Separate GT and tile
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (26 preceding siblings ...)
  2023-05-11  3:50 ` [Intel-xe] ✓ CI.Patch_applied: success for Separate GT and tile Patchwork
@ 2023-05-11  3:51 ` Patchwork
  2023-05-11  7:08 ` [Intel-xe] ✓ CI.Patch_applied: success for Separate GT and tile (rev2) Patchwork
                   ` (6 subsequent siblings)
  34 siblings, 0 replies; 75+ messages in thread
From: Patchwork @ 2023-05-11  3:51 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

== Series Details ==

Series: Separate GT and tile
URL   : https://patchwork.freedesktop.org/series/117614/
State : failure

== Summary ==

+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
ERROR:root:../drivers/gpu/drm/xe/tests/xe_rtp_test.c: In function ‘xe_rtp_process_tests’:
../drivers/gpu/drm/xe/tests/xe_rtp_test.c:239:32: error: ‘struct xe_device’ has no member named ‘gt’
  239 |  struct xe_reg_sr *reg_sr = &xe->gt[0].reg_sr;
      |                                ^~
../drivers/gpu/drm/xe/tests/xe_rtp_test.c:244:44: error: ‘struct xe_device’ has no member named ‘gt’
  244 |  xe_rtp_process(param->entries, reg_sr, &xe->gt[0], NULL);
      |                                            ^~
make[7]: *** [../scripts/Makefile.build:252: drivers/gpu/drm/xe/tests/xe_rtp_test.o] Error 1
make[7]: *** Waiting for unfinished jobs....
In file included from ../drivers/gpu/drm/xe/xe_bo.c:1966:
../drivers/gpu/drm/xe/tests/xe_bo.c: In function ‘ccs_test_migrate’:
../drivers/gpu/drm/xe/tests/xe_bo.c:38:30: error: ‘struct xe_gt’ has no member named ‘migrate’
   38 |   fence = xe_migrate_clear(gt->migrate, bo, bo->ttm.resource);
      |                              ^~
../drivers/gpu/drm/xe/tests/xe_bo.c:93:33: error: ‘struct xe_gt’ has no member named ‘xe’
   93 |  offset = xe_device_ccs_bytes(gt->xe, bo->size);
      |                                 ^~
../drivers/gpu/drm/xe/tests/xe_bo.c: In function ‘evict_test_run_gt’:
../drivers/gpu/drm/xe/tests/xe_bo.c:177:41: error: ‘struct xe_device’ has no member named ‘gt’
  177 |  struct xe_vm *vm = xe_migrate_get_vm(xe->gt[0].migrate);
      |                                         ^~
make[6]: *** [../scripts/Makefile.build:252: drivers/gpu/drm/xe/xe_bo.o] Error 1
make[6]: *** Waiting for unfinished jobs....
make[6]: *** [../scripts/Makefile.build:494: drivers/gpu/drm/xe/tests] Error 2
make[5]: *** [../scripts/Makefile.build:494: drivers/gpu/drm/xe] Error 2
make[4]: *** [../scripts/Makefile.build:494: drivers/gpu/drm] Error 2
make[3]: *** [../scripts/Makefile.build:494: drivers/gpu] Error 2
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [../scripts/Makefile.build:494: drivers] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [/kernel/Makefile:2025: .] Error 2
make: *** [Makefile:226: __sub-make] Error 2

[03:51:12] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[03:51:16] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make ARCH=um O=.kunit --jobs=48
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel




* Re: [Intel-xe] [PATCH 02/26] drm/xe: Introduce xe_tile
  2023-05-11  3:46 ` [Intel-xe] [PATCH 02/26] drm/xe: Introduce xe_tile Matt Roper
@ 2023-05-11  5:46   ` Lucas De Marchi
  2023-05-12  5:33   ` Iddamsetty, Aravind
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-11  5:46 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:46:58PM -0700, Matt Roper wrote:
>Create a new xe_tile structure to begin separating the concept of "tile"
>from "GT."  A tile is effectively a complete GPU, and a GT is just one
>part of that.  On platforms like MTL, there's only a single full GPU
>(tile) which has its IP blocks provided by two GTs.  In contrast, a
>"multi-tile" platform like PVC is basically multiple complete GPUs
>packed behind a single PCI device.
>
>For now, just create xe_tile as a simple wrapper around xe_gt.  The
>items in xe_gt that are truly tied to the tile rather than the GT will
>be moved in future patches.  Support for multiple GTs per tile (i.e.,
>the MTL standalone media case) will also be re-introduced in a future
>patch.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/xe/xe_device.h       | 11 +++++---
> drivers/gpu/drm/xe/xe_device_types.h | 40 +++++++++++++++++++++++++---
> drivers/gpu/drm/xe/xe_gt_types.h     | 15 +++++++----
> drivers/gpu/drm/xe/xe_mmio.c         | 13 ++++-----
> drivers/gpu/drm/xe/xe_pci.c          |  5 +++-
> drivers/gpu/drm/xe/xe_vm.c           |  2 +-
> drivers/gpu/drm/xe/xe_vm_types.h     |  8 +++---
> 7 files changed, 71 insertions(+), 23 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
>index cbae480a2092..f7acaf51a1fc 100644
>--- a/drivers/gpu/drm/xe/xe_device.h
>+++ b/drivers/gpu/drm/xe/xe_device.h
>@@ -48,12 +48,17 @@ static inline struct xe_file *to_xe_file(const struct drm_file *file)
> 	return file->driver_priv;
> }
>
>+static inline struct xe_tile *xe_device_get_root_tile(struct xe_device *xe)
>+{
>+	return &xe->tiles[0];
>+}
>+
> static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
> {
> 	struct xe_gt *gt;
>
>-	XE_BUG_ON(gt_id > XE_MAX_GT);
>-	gt = xe->gt + gt_id;
>+	XE_BUG_ON(gt_id > XE_MAX_TILES_PER_DEVICE);
>+	gt = &xe->tiles[gt_id].primary_gt;
> 	XE_BUG_ON(gt->info.id != gt_id);
> 	XE_BUG_ON(gt->info.type == XE_GT_TYPE_UNINITIALIZED);
>
>@@ -65,7 +70,7 @@ static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
>  */
> static inline struct xe_gt *to_gt(struct xe_device *xe)
> {
>-	return xe->gt;
>+	return &xe_device_get_root_tile(xe)->primary_gt;
> }
>
> static inline bool xe_device_guc_submission_enabled(struct xe_device *xe)
>diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>index 6490a04614ce..5dcf1695925f 100644
>--- a/drivers/gpu/drm/xe/xe_device_types.h
>+++ b/drivers/gpu/drm/xe/xe_device_types.h
>@@ -34,7 +34,7 @@
>
> #define XE_GT0		0
> #define XE_GT1		1
>-#define XE_MAX_GT	(XE_GT1 + 1)
>+#define XE_MAX_TILES_PER_DEVICE	(XE_GT1 + 1)
>
> #define XE_MAX_ASID	(BIT(20))
>
>@@ -48,6 +48,40 @@
> 	 (_xe)->info.step.graphics >= (min_step) &&			\
> 	 (_xe)->info.step.graphics < (max_step))
>
>+#define tile_to_xe(tile__)								\
>+	_Generic(tile__,								\
>+		 const struct xe_tile *: (const struct xe_device *)((tile__)->xe),	\
>+		 struct xe_tile *: (tile__)->xe)
>+
>+/**
>+ * struct xe_tile - hardware tile structure
>+ *
>+ * From a driver perspective, a "tile" is effectively a complete GPU, containing
>+ * an SGunit, 1-2 GTs, and (for discrete platforms) VRAM.
>+ *
>+ * Multi-tile platforms effectively bundle multiple GPUs behind a single PCI
>+ * device and designate one "root" tile as being responsible for external PCI
>+ * communication.  PCI BAR0 exposes the GGTT and MMIO register space for each
>+ * tile in a stacked layout, and PCI BAR2 exposes the local memory associated
>+ * with each tile similarly.  Device-wide interrupts can be enabled/disabled
>+ * at the root tile, and the MSTR_TILE_INTR register will report which tiles
>+ * have interrupts that need servicing.
>+ */
>+struct xe_tile {
>+	/** @xe: Backpointer to tile's PCI device */
>+	struct xe_device *xe;
>+
>+	/** @id: ID of the tile */
>+	u8 id;
>+
>+	/**
>+	 * @primary_gt: Primary GT
>+	 */
>+	struct xe_gt primary_gt;
>+
>+	/* TODO: Add media GT here */
>+};
>+
> /**
>  * struct xe_device - Top level struct of XE device
>  */
>@@ -248,8 +282,8 @@ struct xe_device {
> 	/** @ordered_wq: used to serialize compute mode resume */
> 	struct workqueue_struct *ordered_wq;
>
>-	/** @gt: graphics tile */
>-	struct xe_gt gt[XE_MAX_GT];

kunit is now broken and needs to be fixed.

Other than that,


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

Lucas De Marchi

>+	/** @tiles: device tiles */
>+	struct xe_tile tiles[XE_MAX_TILES_PER_DEVICE];
>
> 	/**
> 	 * @mem_access: keep track of memory access in the device, possibly
>diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
>index 7c47d67aa8be..e0ed4508269b 100644
>--- a/drivers/gpu/drm/xe/xe_gt_types.h
>+++ b/drivers/gpu/drm/xe/xe_gt_types.h
>@@ -77,12 +77,17 @@ enum xe_steering_type {
> };
>
> /**
>- * struct xe_gt - Top level struct of a graphics tile
>+ * struct xe_gt - A "Graphics Technology" unit of the GPU
>  *
>- * A graphics tile may be a physical split (duplicate pieces of silicon,
>- * different GGTT + VRAM) or a virtual split (shared GGTT + VRAM). Either way
>- * this structure encapsulates of everything a GT is (MMIO, VRAM, memory
>- * management, microcontrols, and a hardware set of engines).
>+ * A GT ("Graphics Technology") is the subset of a GPU primarily responsible
>+ * for implementing the graphics and/or media IP.  It encapsulates the hardware
>+ * engines, programmable execution units, and GuC.   Each GT has its own
>+ * handling of power management (RC6+forcewake) and multicast register
>+ * steering.
>+ *
>+ * A GPU/tile may have a single GT that supplies all graphics and media
>+ * functionality, or the graphics and media may be split into separate GTs
>+ * within a tile.
>  */
> struct xe_gt {
> 	/** @xe: backpointer to XE device */
>diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
>index 4804616a3c44..254b4a63d901 100644
>--- a/drivers/gpu/drm/xe/xe_mmio.c
>+++ b/drivers/gpu/drm/xe/xe_mmio.c
>@@ -399,6 +399,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
> 		  struct drm_file *file)
> {
> 	struct xe_device *xe = to_xe_device(dev);
>+	struct xe_gt *gt = xe_device_get_gt(xe, 0);
> 	struct drm_xe_mmio *args = data;
> 	unsigned int bits_flag, bytes;
> 	struct xe_reg reg;
>@@ -440,7 +441,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
> 	 */
> 	reg = XE_REG(args->addr);
>
>-	xe_force_wake_get(gt_to_fw(&xe->gt[0]), XE_FORCEWAKE_ALL);
>+	xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>
> 	if (args->flags & DRM_XE_MMIO_WRITE) {
> 		switch (bits_flag) {
>@@ -449,10 +450,10 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
> 				ret = -EINVAL;
> 				goto exit;
> 			}
>-			xe_mmio_write32(to_gt(xe), reg, args->value);
>+			xe_mmio_write32(gt, reg, args->value);
> 			break;
> 		case DRM_XE_MMIO_64BIT:
>-			xe_mmio_write64(to_gt(xe), reg, args->value);
>+			xe_mmio_write64(gt, reg, args->value);
> 			break;
> 		default:
> 			drm_dbg(&xe->drm, "Invalid MMIO bit size");
>@@ -467,10 +468,10 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
> 	if (args->flags & DRM_XE_MMIO_READ) {
> 		switch (bits_flag) {
> 		case DRM_XE_MMIO_32BIT:
>-			args->value = xe_mmio_read32(to_gt(xe), reg);
>+			args->value = xe_mmio_read32(gt, reg);
> 			break;
> 		case DRM_XE_MMIO_64BIT:
>-			args->value = xe_mmio_read64(to_gt(xe), reg);
>+			args->value = xe_mmio_read64(gt, reg);
> 			break;
> 		default:
> 			drm_dbg(&xe->drm, "Invalid MMIO bit size");
>@@ -482,7 +483,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
> 	}
>
> exit:
>-	xe_force_wake_put(gt_to_fw(&xe->gt[0]), XE_FORCEWAKE_ALL);
>+	xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>
> 	return ret;
> }
>diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>index bf2c234c4f6e..e79b16d8bf7f 100644
>--- a/drivers/gpu/drm/xe/xe_pci.c
>+++ b/drivers/gpu/drm/xe/xe_pci.c
>@@ -525,7 +525,10 @@ static int xe_info_init(struct xe_device *xe,
> 	xe->info.step = xe_step_get(xe);
>
> 	for (id = 0; id < xe->info.tile_count; ++id) {
>-		gt = xe->gt + id;
>+		xe->tiles[id].xe = xe;
>+		xe->tiles[id].id = id;
>+
>+		gt = &xe->tiles[id].primary_gt;
> 		gt->info.id = id;
> 		gt->xe = xe;
>
>diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>index 0a4becdf4675..fe6abb6ed6ca 100644
>--- a/drivers/gpu/drm/xe/xe_vm.c
>+++ b/drivers/gpu/drm/xe/xe_vm.c
>@@ -3347,7 +3347,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
> 	struct xe_device *xe = vma->vm->xe;
> 	struct xe_gt *gt;
> 	u32 gt_needs_invalidate = 0;
>-	int seqno[XE_MAX_GT];
>+	int seqno[XE_MAX_TILES_PER_DEVICE];
> 	u8 id;
> 	int ret;
>
>diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
>index fada7896867f..203ba9d946b8 100644
>--- a/drivers/gpu/drm/xe/xe_vm_types.h
>+++ b/drivers/gpu/drm/xe/xe_vm_types.h
>@@ -159,7 +159,7 @@ struct xe_vm {
> 	struct kref refcount;
>
> 	/* engine used for (un)binding vma's */
>-	struct xe_engine *eng[XE_MAX_GT];
>+	struct xe_engine *eng[XE_MAX_TILES_PER_DEVICE];
>
> 	/** Protects @rebind_list and the page-table structures */
> 	struct dma_resv resv;
>@@ -167,9 +167,9 @@ struct xe_vm {
> 	u64 size;
> 	struct rb_root vmas;
>
>-	struct xe_pt *pt_root[XE_MAX_GT];
>-	struct xe_bo *scratch_bo[XE_MAX_GT];
>-	struct xe_pt *scratch_pt[XE_MAX_GT][XE_VM_MAX_LEVEL];
>+	struct xe_pt *pt_root[XE_MAX_TILES_PER_DEVICE];
>+	struct xe_bo *scratch_bo[XE_MAX_TILES_PER_DEVICE];
>+	struct xe_pt *scratch_pt[XE_MAX_TILES_PER_DEVICE][XE_VM_MAX_LEVEL];
>
> 	/** @flags: flags for this VM, statically setup a creation time */
> #define XE_VM_FLAGS_64K			BIT(0)
>-- 
>2.40.0
>


* [Intel-xe] ✓ CI.Patch_applied: success for Separate GT and tile (rev2)
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (27 preceding siblings ...)
  2023-05-11  3:51 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork
@ 2023-05-11  7:08 ` Patchwork
  2023-05-11  7:10 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork
                   ` (5 subsequent siblings)
  34 siblings, 0 replies; 75+ messages in thread
From: Patchwork @ 2023-05-11  7:08 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

== Series Details ==

Series: Separate GT and tile (rev2)
URL   : https://patchwork.freedesktop.org/series/117614/
State : success

== Summary ==

=== Applying kernel patches on branch 'drm-xe-next' with base: ===
Base commit: 6db883a16 drm/xe/adln: Enable ADL-N
=== git am output follows ===
Applying: drm/xe/mtl: Disable media GT
Applying: drm/xe: Introduce xe_tile
Applying: drm/xe: Add backpointer from gt to tile
Applying: drm/xe: Add for_each_tile iterator
Applying: drm/xe: Move register MMIO into xe_tile
Applying: drm/xe: Move VRAM from GT to tile
Applying: drm/xe: Memory allocations are tile-based, not GT-based
Applying: drm/xe: Move migration from GT to tile
Applying: drm/xe: Clarify 'gt' retrieval for primary tile
Applying: drm/xe: Drop vram_id
Applying: drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE
Applying: drm/xe: Allocate GT dynamically
Applying: drm/xe: Add media GT to tile
Applying: drm/xe: Move display IRQ postinstall out of GT function
Applying: drm/xe: Interrupts are delivered per-tile, not per-GT
Applying: drm/xe/irq: Handle ASLE backlight interrupts at same time as display
Applying: drm/xe/irq: Actually call xe_irq_postinstall()
Applying: drm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt mask
Applying: drm/xe/irq: Untangle postinstall functions
Applying: drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe
Applying: drm/xe: Invalidate TLB on all affected GTs during GGTT updates
Applying: drm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations
Applying: drm/xe: Allow GT looping and lookup on standalone media
Applying: drm/xe: Update query uapi to support standalone media
Applying: drm/xe: Reinstate media GT support
Applying: drm/xe: Clarify source of GT log messages




* [Intel-xe] ✗ CI.KUnit: failure for Separate GT and tile (rev2)
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (28 preceding siblings ...)
  2023-05-11  7:08 ` [Intel-xe] ✓ CI.Patch_applied: success for Separate GT and tile (rev2) Patchwork
@ 2023-05-11  7:10 ` Patchwork
  2023-05-12  7:21 ` [Intel-xe] ✓ CI.Patch_applied: success " Patchwork
                   ` (4 subsequent siblings)
  34 siblings, 0 replies; 75+ messages in thread
From: Patchwork @ 2023-05-11  7:10 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

== Series Details ==

Series: Separate GT and tile (rev2)
URL   : https://patchwork.freedesktop.org/series/117614/
State : failure

== Summary ==

+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
ERROR:root:make[1]: Entering directory '/kernel/.kunit'
***
*** The source tree is not clean, please run 'make ARCH=um mrproper'
*** in /kernel
***
make[1]: *** [/kernel/Makefile:646: outputmakefile] Error 1
make[1]: Leaving directory '/kernel/.kunit'
make: *** [Makefile:226: __sub-make] Error 2

[07:10:56] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel




* Re: [Intel-xe] [PATCH 15/26] drm/xe: Interrupts are delivered per-tile, not per-GT
  2023-05-11  3:47 ` [Intel-xe] [PATCH 15/26] drm/xe: Interrupts are delivered per-tile, not per-GT Matt Roper
@ 2023-05-11 12:14   ` Iddamsetty, Aravind
  2023-05-11 13:50     ` Matt Roper
  2023-05-18 18:30   ` Lucas De Marchi
  1 sibling, 1 reply; 75+ messages in thread
From: Iddamsetty, Aravind @ 2023-05-11 12:14 UTC (permalink / raw)
  To: Matt Roper, intel-xe



On 11-05-2023 09:17, Matt Roper wrote:
> IRQ delivery and handling needs to be handled on a per-tile basis.  Note
> that this is true even for the "GT interrupts" relating to engines and
> GuCs --- the interrupts relating to both GTs get raised through a single
> set of registers in the tile's sgunit range.
> 
> The (mis)use of struct xe_gt as a target for MMIO operations in the
> driver makes the code somewhat confusing since we wind up needing a GT
> pointer to handle programming that's unrelated to the GT.  To mitigate
> this confusion, all of the xe_gt structures used solely as an MMIO
> target in interrupt code are renamed to 'mmio.'  Reworking the driver's
> MMIO handling to not be dependent on xe_gt is planned as a future
> update.
> 
> Note that GT initialization code currently calls xe_gt_irq_postinstall()
> in an attempt to enable the HWE interrupts for the GT being initialized.
> Unfortunately xe_gt_irq_postinstall() doesn't really match its name and
> does a bunch of other stuff unrelated to the GT interrupts (such as
> enabling the top-level device interrupts).  That will be addressed in
> future patches.
> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_gt.c  |   2 +-
>  drivers/gpu/drm/xe/xe_irq.c | 334 ++++++++++++++++++++----------------
>  drivers/gpu/drm/xe/xe_irq.h |   4 +-
>  3 files changed, 187 insertions(+), 153 deletions(-)
> 


<snip>
>  static u32 dg1_intr_disable(struct xe_device *xe)
>  {
> -	struct xe_gt *gt = xe_primary_mmio_gt(xe);
> +	struct xe_gt *mmio = xe_primary_mmio_gt(xe);
>  	u32 val;
>  
>  	/* First disable interrupts */
> -	xe_mmio_write32(gt, DG1_MSTR_TILE_INTR, 0);
> +	xe_mmio_write32(mmio, DG1_MSTR_TILE_INTR, 0);

If sgunit registers are replicated per tile, why is DG1_MSTR_TILE_INTR
handled on the primary tile only? Is this not an sgunit register?

>  
>  	/* Get the indication levels and ack the master unit */
> -	val = xe_mmio_read32(gt, DG1_MSTR_TILE_INTR);
> +	val = xe_mmio_read32(mmio, DG1_MSTR_TILE_INTR);
>  	if (unlikely(!val))
>  		return 0;
>  
> -	xe_mmio_write32(gt, DG1_MSTR_TILE_INTR, val);
> +	xe_mmio_write32(mmio, DG1_MSTR_TILE_INTR, val);
>  
>  	return val;
>  }
>  
>  static void dg1_intr_enable(struct xe_device *xe, bool stall)
>  {
> -	struct xe_gt *gt = xe_primary_mmio_gt(xe);
> +	struct xe_gt *mmio = xe_primary_mmio_gt(xe);
>  
> -	xe_mmio_write32(gt, DG1_MSTR_TILE_INTR, DG1_MSTR_IRQ);
> +	xe_mmio_write32(mmio, DG1_MSTR_TILE_INTR, DG1_MSTR_IRQ);
>  	if (stall)
> -		xe_mmio_read32(gt, DG1_MSTR_TILE_INTR);
> +		xe_mmio_read32(mmio, DG1_MSTR_TILE_INTR);
>  }
>

Thanks,
Aravind.



* Re: [Intel-xe] [PATCH 05/26] drm/xe: Move register MMIO into xe_tile
  2023-05-11  3:47 ` [Intel-xe] [PATCH 05/26] drm/xe: Move register MMIO into xe_tile Matt Roper
@ 2023-05-11 12:20   ` Jani Nikula
  2023-05-11 22:01     ` Lucas De Marchi
  2023-05-13  5:53   ` Lucas De Marchi
  1 sibling, 1 reply; 75+ messages in thread
From: Jani Nikula @ 2023-05-11 12:20 UTC (permalink / raw)
  To: Matt Roper, intel-xe; +Cc: matthew.d.roper

On Wed, 10 May 2023, Matt Roper <matthew.d.roper@intel.com> wrote:
> Each tile has its own register region in the BAR, containing instances
> of all registers for the platform.  In contrast, the multiple GTs within
> a tile share the same MMIO space; there's just a small subset of
> registers (the GSI registers) which have multiple copies at different
> offsets (0x0 for primary GT, 0x380000 for media GT).  Move the register
> MMIO region size/pointers to the tile structure, leaving just the GSI
> offset information in the GT structure.
>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/xe/display/ext/i915_irq.c |  2 +-
>  drivers/gpu/drm/xe/xe_device_types.h      | 16 ++++++++++++++
>  drivers/gpu/drm/xe/xe_ggtt.c              |  3 ++-
>  drivers/gpu/drm/xe/xe_gt_types.h          |  9 +++-----
>  drivers/gpu/drm/xe/xe_mmio.c              | 26 ++++++++++++-----------
>  drivers/gpu/drm/xe/xe_mmio.h              | 21 +++++++++++++-----
>  6 files changed, 52 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/display/ext/i915_irq.c b/drivers/gpu/drm/xe/display/ext/i915_irq.c
> index afde97b6faa6..a9cbd7b59360 100644
> --- a/drivers/gpu/drm/xe/display/ext/i915_irq.c
> +++ b/drivers/gpu/drm/xe/display/ext/i915_irq.c
> @@ -920,7 +920,7 @@ gen8_de_irq_handler(struct drm_i915_private *dev_priv, u32 master_ctl)
>  
>  void gen11_display_irq_handler(struct drm_i915_private *i915)
>  {
> -	void __iomem * const regs = to_gt(i915)->mmio.regs;
> +	void __iomem * const regs = xe_device_get_root_tile(i915)->mmio.regs;

Side note, I'm hoping to merge [1] into i915, backport (or rebase) that
into xe, nuking ext/i915_irq.c completely.

IDK if that means adding new ifdefs in the display irq file, or how we
could abstract this in a way that doesn't require changes in i915
display code.

BR,
Jani.


[1] https://patchwork.freedesktop.org/series/117344/



>  	const u32 disp_ctl = raw_reg_read(regs, GEN11_DISPLAY_INT_CTL);
>  
>  	/*
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 5dcf1695925f..2481b2045284 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -80,6 +80,22 @@ struct xe_tile {
>  	struct xe_gt primary_gt;
>  
>  	/* TODO: Add media GT here */
> +
> +	/**
> +	 * @mmio: MMIO info for a tile.
> +	 *
> +	 * Each tile has its own 16MB space in BAR0, laid out as:
> +	 * * 0-4MB: registers
> +	 * * 4MB-8MB: reserved
> +	 * * 8MB-16MB: global GTT
> +	 */
> +	struct {
> +		/** @size: size of tile's MMIO space */
> +		size_t size;
> +
> +		/** @regs: pointer to tile's MMIO space (starting with registers) */
> +		void *regs;
> +	} mmio;
>  };
>  
>  /**
> diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
> index 546240261e0a..200976da3dc1 100644
> --- a/drivers/gpu/drm/xe/xe_ggtt.c
> +++ b/drivers/gpu/drm/xe/xe_ggtt.c
> @@ -93,6 +93,7 @@ static void ggtt_fini_noalloc(struct drm_device *drm, void *arg)
>  int xe_ggtt_init_noalloc(struct xe_gt *gt, struct xe_ggtt *ggtt)
>  {
>  	struct xe_device *xe = gt_to_xe(gt);
> +	struct xe_tile *tile = gt_to_tile(gt);
>  	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
>  	unsigned int gsm_size;
>  
> @@ -106,7 +107,7 @@ int xe_ggtt_init_noalloc(struct xe_gt *gt, struct xe_ggtt *ggtt)
>  		return -ENOMEM;
>  	}
>  
> -	ggtt->gsm = gt->mmio.regs + SZ_8M;
> +	ggtt->gsm = tile->mmio.regs + SZ_8M;
>  	ggtt->size = (gsm_size / 8) * (u64) XE_PAGE_SIZE;
>  
>  	if (IS_DGFX(xe) && xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
> diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
> index c4376d50786b..03dd625b2781 100644
> --- a/drivers/gpu/drm/xe/xe_gt_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_types.h
> @@ -124,14 +124,11 @@ struct xe_gt {
>  	} info;
>  
>  	/**
> -	 * @mmio: mmio info for GT, can be subset of the global device mmio
> -	 * space
> +	 * @mmio: mmio info for GT.  All GTs within a tile share the same
> +	 * register space, but have their own copy of GSI registers at a
> +	 * specific offset, as well as their own forcewake handling.
>  	 */
>  	struct {
> -		/** @size: size of MMIO space on GT */
> -		size_t size;
> -		/** @regs: pointer to MMIO space on GT */
> -		void *regs;
>  		/** @fw: force wake for GT */
>  		struct xe_force_wake fw;
>  		/**
> diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
> index 254b4a63d901..54fa1212fcd9 100644
> --- a/drivers/gpu/drm/xe/xe_mmio.c
> +++ b/drivers/gpu/drm/xe/xe_mmio.c
> @@ -307,6 +307,7 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
>  
>  	if (xe->info.tile_count > 1) {
>  		const int mmio_bar = 0;
> +		struct xe_tile *tile;
>  		size_t size;
>  		void *regs;
>  
> @@ -320,11 +321,11 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
>  		size = xe->mmio.size / adj_tile_count;
>  		regs = xe->mmio.regs;
>  
> -		for_each_gt(gt, xe, id) {
> -			if (id && !xe_gt_is_media_type(gt))
> -				regs += size;
> -			gt->mmio.size = size;
> -			gt->mmio.regs = regs;
> +		for_each_tile(tile, xe, id) {
> +			tile->mmio.size = size;
> +			tile->mmio.regs = regs;
> +
> +			regs += size;
>  		}
>  	}
>  }
> @@ -340,15 +341,16 @@ static void mmio_fini(struct drm_device *drm, void *arg)
>  
>  int xe_mmio_init(struct xe_device *xe)
>  {
> +	struct xe_tile *root_tile = xe_device_get_root_tile(xe);
>  	struct xe_gt *gt = xe_device_get_gt(xe, 0);
>  	const int mmio_bar = 0;
>  	int err;
>  
>  	/*
> -	 * Map the entire BAR, which includes registers (0-4MB), reserved space
> -	 * (4MB-8MB), and GGTT (8MB-16MB). Other parts of the driver (GTs,
> -	 * GGTTs) will derive the pointers they need from the mapping in the
> -	 * device structure.
> +	 * Map the first 16MB of the BAR, which includes the registers (0-4MB),
> +	 * reserved space (4MB-8MB), and GGTT (8MB-16MB) for a single tile.
> +	 * This will get remapped later if we determine that we're running
> +	 * on a multi-tile system.
>  	 */
>  	xe->mmio.size = SZ_16M;
>  	xe->mmio.regs = pci_iomap(to_pci_dev(xe->drm.dev), mmio_bar,
> @@ -362,9 +364,9 @@ int xe_mmio_init(struct xe_device *xe)
>  	if (err)
>  		return err;
>  
> -	/* 1 GT for now, 1 to 1 mapping, may change on multi-GT devices */
> -	gt->mmio.size = xe->mmio.size;
> -	gt->mmio.regs = xe->mmio.regs;
> +	/* Set up the first tile; other tiles (if present) will be set up later. */
> +	root_tile->mmio.size = xe->mmio.size;
> +	root_tile->mmio.regs = xe->mmio.regs;
>  
>  	/*
>  	 * The boot firmware initializes local memory and assesses its health.
> diff --git a/drivers/gpu/drm/xe/xe_mmio.h b/drivers/gpu/drm/xe/xe_mmio.h
> index 1407f1189b0d..acf0b18f3111 100644
> --- a/drivers/gpu/drm/xe/xe_mmio.h
> +++ b/drivers/gpu/drm/xe/xe_mmio.h
> @@ -10,6 +10,7 @@
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  
>  #include "regs/xe_reg_defs.h"
> +#include "xe_device_types.h"
>  #include "xe_gt_types.h"
>  
>  struct drm_device;
> @@ -20,27 +21,33 @@ int xe_mmio_init(struct xe_device *xe);
>  
>  static inline u8 xe_mmio_read8(struct xe_gt *gt, struct xe_reg reg)
>  {
> +	struct xe_tile *tile = gt_to_tile(gt);
> +
>  	if (reg.addr < gt->mmio.adj_limit)
>  		reg.addr += gt->mmio.adj_offset;
>  
> -	return readb(gt->mmio.regs + reg.addr);
> +	return readb(tile->mmio.regs + reg.addr);
>  }
>  
>  static inline void xe_mmio_write32(struct xe_gt *gt,
>  				   struct xe_reg reg, u32 val)
>  {
> +	struct xe_tile *tile = gt_to_tile(gt);
> +
>  	if (reg.addr < gt->mmio.adj_limit)
>  		reg.addr += gt->mmio.adj_offset;
>  
> -	writel(val, gt->mmio.regs + reg.addr);
> +	writel(val, tile->mmio.regs + reg.addr);
>  }
>  
>  static inline u32 xe_mmio_read32(struct xe_gt *gt, struct xe_reg reg)
>  {
> +	struct xe_tile *tile = gt_to_tile(gt);
> +
>  	if (reg.addr < gt->mmio.adj_limit)
>  		reg.addr += gt->mmio.adj_offset;
>  
> -	return readl(gt->mmio.regs + reg.addr);
> +	return readl(tile->mmio.regs + reg.addr);
>  }
>  
>  static inline u32 xe_mmio_rmw32(struct xe_gt *gt, struct xe_reg reg, u32 clr,
> @@ -58,18 +65,22 @@ static inline u32 xe_mmio_rmw32(struct xe_gt *gt, struct xe_reg reg, u32 clr,
>  static inline void xe_mmio_write64(struct xe_gt *gt,
>  				   struct xe_reg reg, u64 val)
>  {
> +	struct xe_tile *tile = gt_to_tile(gt);
> +
>  	if (reg.addr < gt->mmio.adj_limit)
>  		reg.addr += gt->mmio.adj_offset;
>  
> -	writeq(val, gt->mmio.regs + reg.addr);
> +	writeq(val, tile->mmio.regs + reg.addr);
>  }
>  
>  static inline u64 xe_mmio_read64(struct xe_gt *gt, struct xe_reg reg)
>  {
> +	struct xe_tile *tile = gt_to_tile(gt);
> +
>  	if (reg.addr < gt->mmio.adj_limit)
>  		reg.addr += gt->mmio.adj_offset;
>  
> -	return readq(gt->mmio.regs + reg.addr);
> +	return readq(tile->mmio.regs + reg.addr);
>  }
>  
>  static inline int xe_mmio_write32_and_verify(struct xe_gt *gt,
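
To illustrate the addressing scheme the commit message describes, here is a
minimal, self-contained sketch. It is not driver code: the sketch_* names are
hypothetical, the 0x40000 limit and 0x380000 offset are the media-GT values
from the xelpmp_gts table removed in patch 01, and the fixed 16MB stride
assumes every tile gets an equal-sized region as in the kernel-doc comment
for the tile's mmio struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TILE_SIZE ((size_t)16 << 20)	/* assumed 16MB of BAR0 per tile */

/* Hypothetical stand-ins for xe_tile / xe_gt; only the fields needed here. */
struct sketch_tile {
	size_t mmio_base;	/* byte offset of this tile's region within BAR0 */
};

struct sketch_gt {
	const struct sketch_tile *tile;
	uint32_t adj_limit;	/* registers below this are GSI registers */
	uint32_t adj_offset;	/* where this GT's copy of the GSI registers lives */
};

/* Byte offset within BAR0 of tile 'id', assuming equal-sized tile regions. */
static size_t sketch_tile_base(unsigned int id)
{
	return (size_t)id * SKETCH_TILE_SIZE;
}

/*
 * Resolve a register address for a GT: apply the GSI adjustment first,
 * then add the tile's base.  Same shape as the xe_mmio_read32() hunk.
 */
static size_t sketch_reg_offset(const struct sketch_gt *gt, uint32_t addr)
{
	assert(gt && gt->tile);

	if (addr < gt->adj_limit)
		addr += gt->adj_offset;

	return gt->tile->mmio_base + addr;
}
```

Under these assumptions, register 0x1000 on the media GT of tile 1 resolves
to BAR offset 0x1381000, while the same register on that tile's primary GT
(limit and offset both 0) resolves to 0x1001000.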

-- 
Jani Nikula, Intel Open Source Graphics Center


* Re: [Intel-xe] [PATCH 15/26] drm/xe: Interrupts are delivered per-tile, not per-GT
  2023-05-11 12:14   ` Iddamsetty, Aravind
@ 2023-05-11 13:50     ` Matt Roper
  0 siblings, 0 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-11 13:50 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: intel-xe

On Thu, May 11, 2023 at 05:44:18PM +0530, Iddamsetty, Aravind wrote:
> 
> 
> On 11-05-2023 09:17, Matt Roper wrote:
> > IRQ delivery and handling needs to be handled on a per-tile basis.  Note
> > that this is true even for the "GT interrupts" relating to engines and
> > GuCs --- the interrupts relating to both GTs get raised through a single
> > set of registers in the tile's sgunit range.
> > 
> > The (mis)use of struct xe_gt as a target for MMIO operations in the
> > driver makes the code somewhat confusing since we wind up needing a GT
> > pointer to handle programming that's unrelated to the GT.  To mitigate
> > this confusion, all of the xe_gt structures used solely as an MMIO
> > target in interrupt code are renamed to 'mmio.'  Reworking the driver's
> > MMIO handling to not be dependent on xe_gt is planned as a future
> > update.
> > 
> > Note that GT initialization code currently calls xe_gt_irq_postinstall()
> > in an attempt to enable the HWE interrupts for the GT being initialized.
> > Unfortunately xe_gt_irq_postinstall() doesn't really match its name and
> > does a bunch of other stuff unrelated to the GT interrupts (such as
> > enabling the top-level device interrupts).  That will be addressed in
> > future patches.
> > 
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_gt.c  |   2 +-
> >  drivers/gpu/drm/xe/xe_irq.c | 334 ++++++++++++++++++++----------------
> >  drivers/gpu/drm/xe/xe_irq.h |   4 +-
> >  3 files changed, 187 insertions(+), 153 deletions(-)
> > 
> 
> 
> <snip>
> >  static u32 dg1_intr_disable(struct xe_device *xe)
> >  {
> > -	struct xe_gt *gt = xe_primary_mmio_gt(xe);
> > +	struct xe_gt *mmio = xe_primary_mmio_gt(xe);
> >  	u32 val;
> >  
> >  	/* First disable interrupts */
> > -	xe_mmio_write32(gt, DG1_MSTR_TILE_INTR, 0);
> > +	xe_mmio_write32(mmio, DG1_MSTR_TILE_INTR, 0);
> 
> > If sgunit registers are replicated per tile, why is DG1_MSTR_TILE_INTR
> > handled on the primary tile only? Is this not an sgunit register?

Right, this is an sgunit register, so there's technically a copy at each
tile.  However this specific register is the one responsible for showing
the interrupts that were forwarded to the root tile from the remote
tiles, and for enabling/disabling reporting of interrupts to the OS, so
only the copy in the root tile is relevant.  I should probably add some
additional explanation of that in the commit message.
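
A rough sketch of what that means for a handler: read the master register
once, from the root tile only, then fan out to whichever tiles it flags.
This is an illustration rather than the xe_irq.c code; the sketch_* names
are hypothetical, and the bit layout (one indicator bit per tile, master
enable in bit 31) is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout, for illustration only. */
#define SKETCH_MSTR_IRQ		(UINT32_C(1) << 31)	/* master enable */
#define SKETCH_MSTR_TILE(t)	(UINT32_C(1) << (t))	/* tile t has work pending */

/*
 * Given the value read from the root tile's master register, collect the
 * ids of the tiles that reported an interrupt.  Returns how many were found.
 */
static unsigned int sketch_pending_tiles(uint32_t master_tile_ctl,
					 unsigned int tile_count,
					 unsigned int *ids)
{
	unsigned int n = 0;

	/* Keep the tile bits clear of the assumed master-enable bit. */
	assert(ids && tile_count < 31);

	for (unsigned int t = 0; t < tile_count; t++) {
		if (master_tile_ctl & SKETCH_MSTR_TILE(t))
			ids[n++] = t;
	}

	return n;
}
```

The remote tiles' copies of the register never enter the picture; per-tile
registers are only consulted after the root copy says a tile has something
pending.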


Matt

> 
> >  
> >  	/* Get the indication levels and ack the master unit */
> > -	val = xe_mmio_read32(gt, DG1_MSTR_TILE_INTR);
> > +	val = xe_mmio_read32(mmio, DG1_MSTR_TILE_INTR);
> >  	if (unlikely(!val))
> >  		return 0;
> >  
> > -	xe_mmio_write32(gt, DG1_MSTR_TILE_INTR, val);
> > +	xe_mmio_write32(mmio, DG1_MSTR_TILE_INTR, val);
> >  
> >  	return val;
> >  }
> >  
> >  static void dg1_intr_enable(struct xe_device *xe, bool stall)
> >  {
> > -	struct xe_gt *gt = xe_primary_mmio_gt(xe);
> > +	struct xe_gt *mmio = xe_primary_mmio_gt(xe);
> >  
> > -	xe_mmio_write32(gt, DG1_MSTR_TILE_INTR, DG1_MSTR_IRQ);
> > +	xe_mmio_write32(mmio, DG1_MSTR_TILE_INTR, DG1_MSTR_IRQ);
> >  	if (stall)
> > -		xe_mmio_read32(gt, DG1_MSTR_TILE_INTR);
> > +		xe_mmio_read32(mmio, DG1_MSTR_TILE_INTR);
> >  }
> >
> 
> Thanks,
> Aravind.
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


* Re: [Intel-xe] [PATCH 01/26] drm/xe/mtl: Disable media GT
  2023-05-11  3:46 ` [Intel-xe] [PATCH 01/26] drm/xe/mtl: Disable media GT Matt Roper
@ 2023-05-11 20:50   ` Matt Atwood
  2023-05-11 23:29   ` Lucas De Marchi
  1 sibling, 0 replies; 75+ messages in thread
From: Matt Atwood @ 2023-05-11 20:50 UTC (permalink / raw)
  To: Matt Roper, intel-xe; +Cc: intel-xe

On Wed, May 10, 2023 at 08:46:57PM -0700, Matt Roper wrote:
> Xe incorrectly conflates the concept of 'tile' and 'GT.'  Since MTL's
> media support is not yet functioning properly, let's just disable it
> completely for now while we fix the fundamental driver design.  Support
> for media GTs on platforms like MTL will be re-added later.
> 
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_gt_mcr.c |  2 +-
>  drivers/gpu/drm/xe/xe_mmio.c   |  2 --
>  drivers/gpu/drm/xe/xe_pci.c    | 15 ++-------------
>  3 files changed, 3 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_gt_mcr.c b/drivers/gpu/drm/xe/xe_gt_mcr.c
> index 3db550c85e32..be80fdc4b5a2 100644
> --- a/drivers/gpu/drm/xe/xe_gt_mcr.c
> +++ b/drivers/gpu/drm/xe/xe_gt_mcr.c
> @@ -293,7 +293,7 @@ void xe_gt_mcr_init(struct xe_gt *gt)
>  
>  	spin_lock_init(&gt->mcr_lock);
>  
> -	if (gt->info.type == XE_GT_TYPE_MEDIA) {
> +	if (xe_gt_is_media_type(gt)) {
>  		drm_WARN_ON(&xe->drm, MEDIA_VER(xe) < 13);
>  
>  		gt->steering[OADDRM].ranges = xelpmp_oaddrm_steering_table;
> diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
> index c7fbb1cc1f64..4804616a3c44 100644
> --- a/drivers/gpu/drm/xe/xe_mmio.c
> +++ b/drivers/gpu/drm/xe/xe_mmio.c
> @@ -301,8 +301,6 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
>  	mtcfg = xe_mmio_read64(gt, XEHP_MTCFG_ADDR);
>  	adj_tile_count = xe->info.tile_count =
>  		REG_FIELD_GET(TILE_COUNT, mtcfg) + 1;
> -	if (xe->info.media_verx100 >= 1300)
> -		xe->info.tile_count *= 2;
>  
>  	drm_info(&xe->drm, "tile_count: %d, adj_tile_count %d\n",
>  		 xe->info.tile_count, adj_tile_count);
> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> index a6858fc7fe8d..bf2c234c4f6e 100644
> --- a/drivers/gpu/drm/xe/xe_pci.c
> +++ b/drivers/gpu/drm/xe/xe_pci.c
> @@ -19,6 +19,7 @@
>  #include "xe_device.h"
>  #include "xe_display.h"
>  #include "xe_drv.h"
> +#include "xe_gt.h"
>  #include "xe_macros.h"
>  #include "xe_module.h"
>  #include "xe_pci_types.h"
> @@ -271,20 +272,10 @@ static const struct xe_device_desc pvc_desc = {
>  	.extra_gts = pvc_gts,
>  };
>  
> -static const struct xe_gt_desc xelpmp_gts[] = {
> -	{
> -		.type = XE_GT_TYPE_MEDIA,
> -		.vram_id = 0,
> -		.mmio_adj_limit = 0x40000,
> -		.mmio_adj_offset = 0x380000,
> -	},
> -};
> -
>  static const struct xe_device_desc mtl_desc = {
>  	/* .graphics and .media determined via GMD_ID */
>  	.require_force_probe = true,
>  	PLATFORM(XE_METEORLAKE),
> -	.extra_gts = xelpmp_gts,
>  };
>  
>  #undef PLATFORM
> @@ -528,8 +519,6 @@ static int xe_info_init(struct xe_device *xe,
>  	 * treats it as the number of GTs rather than just the number of tiles.
>  	 */
>  	xe->info.tile_count = 1 + graphics_desc->max_remote_tiles;
> -	if (MEDIA_VER(xe) >= 13)
> -		xe->info.tile_count++;
>  
>  	xe->info.subplatform = subplatform_desc ?
>  		subplatform_desc->subplatform : XE_SUBPLATFORM_NONE;
> @@ -553,7 +542,7 @@ static int xe_info_init(struct xe_device *xe,
>  		} else {
>  			gt->info.type = desc->extra_gts[id - 1].type;
>  			gt->info.vram_id = desc->extra_gts[id - 1].vram_id;
> -			gt->info.__engine_mask = (gt->info.type == XE_GT_TYPE_MEDIA) ?
> +			gt->info.__engine_mask = xe_gt_is_media_type(gt) ?
>  				media_desc->hw_engine_mask :
>  				graphics_desc->hw_engine_mask;
>  			gt->mmio.adj_limit =
> -- 
> 2.40.0
> 


* Re: [Intel-xe] [PATCH 03/26] drm/xe: Add backpointer from gt to tile
  2023-05-11  3:46 ` [Intel-xe] [PATCH 03/26] drm/xe: Add backpointer from gt to tile Matt Roper
@ 2023-05-11 21:10   ` Matt Atwood
  2023-05-12  0:07   ` Lucas De Marchi
  1 sibling, 0 replies; 75+ messages in thread
From: Matt Atwood @ 2023-05-11 21:10 UTC (permalink / raw)
  To: Matt Roper, intel-xe; +Cc: intel-xe

On Wed, May 10, 2023 at 08:46:59PM -0700, Matt Roper wrote:
> Rather than a backpointer to the xe_device, a GT should have a
> backpointer to its tile (which can then be used to lookup the device if
> necessary).
> 
> The gt_to_xe() helper macro (which moves from xe_gt.h to xe_gt_types.h)
> can and should still be used to jump directly from an xe_gt to
> xe_device.
> 
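For readers unfamiliar with the pattern, the chained backpointer lookup can
be sketched standalone as below. The sketch_* types are hypothetical
stand-ins for xe_gt, xe_tile and xe_device; the macros follow the shape of
the gt_to_tile()/gt_to_xe() definitions in the xe_gt_types.h hunk further
down, where _Generic keeps the result const when the GT pointer is const.

```c
#include <assert.h>

/* Hypothetical stand-ins for xe_device / xe_tile / xe_gt. */
struct sketch_device { int id; };
struct sketch_tile { struct sketch_device *xe; };
struct sketch_gt { struct sketch_tile *tile; };

/* GT -> tile; a const GT pointer yields a const tile pointer. */
#define sketch_gt_to_tile(gt__)							\
	_Generic(gt__,								\
		 const struct sketch_gt *: (const struct sketch_tile *)((gt__)->tile), \
		 struct sketch_gt *: (gt__)->tile)

/* GT -> device, going through the tile instead of a direct backpointer. */
#define sketch_gt_to_xe(gt__)							\
	_Generic(gt__,								\
		 const struct sketch_gt *:					\
			(const struct sketch_device *)(sketch_gt_to_tile(gt__)->xe), \
		 struct sketch_gt *: sketch_gt_to_tile(gt__)->xe)

/* Read-only helper: works on a const GT because constness is preserved. */
static inline int sketch_gt_device_id(const struct sketch_gt *gt)
{
	assert(gt && gt->tile && gt->tile->xe);

	return sketch_gt_to_xe(gt)->id;
}
```

Callers keep writing gt_to_xe(gt) as before; only the path taken to reach
the device changes.
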
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_bb.c                  |  2 +-
>  drivers/gpu/drm/xe/xe_gt.h                  |  5 -----
>  drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |  4 ++--
>  drivers/gpu/drm/xe/xe_gt_types.h            | 14 ++++++++++++--
>  drivers/gpu/drm/xe/xe_mocs.c                | 14 +++++++-------
>  drivers/gpu/drm/xe/xe_pci.c                 | 11 +++++++----
>  drivers/gpu/drm/xe/xe_pt.c                  |  2 +-
>  7 files changed, 30 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_bb.c b/drivers/gpu/drm/xe/xe_bb.c
> index 3deb2d55f421..bf7c94b769d7 100644
> --- a/drivers/gpu/drm/xe/xe_bb.c
> +++ b/drivers/gpu/drm/xe/xe_bb.c
> @@ -16,7 +16,7 @@
>  
>  static int bb_prefetch(struct xe_gt *gt)
>  {
> -	struct xe_device *xe = gt->xe;
> +	struct xe_device *xe = gt_to_xe(gt);
>  
>  	if (GRAPHICS_VERx100(xe) >= 1250 && !xe_gt_is_media_type(gt))
>  		/*
> diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
> index 086369f7ee6d..f4e98f499b36 100644
> --- a/drivers/gpu/drm/xe/xe_gt.h
> +++ b/drivers/gpu/drm/xe/xe_gt.h
> @@ -49,11 +49,6 @@ static inline bool xe_gt_is_media_type(struct xe_gt *gt)
>  	return gt->info.type == XE_GT_TYPE_MEDIA;
>  }
>  
> -#define gt_to_xe(gt__)								\
> -	_Generic(gt__,								\
> -		 const struct xe_gt *: (const struct xe_device *)((gt__)->xe),	\
> -		 struct xe_gt *: (gt__)->xe)
> -
>  static inline bool xe_gt_is_usm_hwe(struct xe_gt *gt, struct xe_hw_engine *hwe)
>  {
>  	struct xe_device *xe = gt_to_xe(gt);
> diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> index c815a42e2cdb..c9e8825c02aa 100644
> --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> @@ -322,8 +322,8 @@ int xe_guc_tlb_invalidation_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
>  		TLB_INVALIDATION_SEQNO_MAX;
>  	if (!expected_seqno)
>  		expected_seqno = 1;
> -	if (drm_WARN_ON(&gt->xe->drm, expected_seqno != msg[0])) {
> -		drm_err(&gt->xe->drm, "TLB expected_seqno(%d) != msg(%u)\n",
> +	if (drm_WARN_ON(&gt_to_xe(gt)->drm, expected_seqno != msg[0])) {
> +		drm_err(&gt_to_xe(gt)->drm, "TLB expected_seqno(%d) != msg(%u)\n",
>  			expected_seqno, msg[0]);
>  	}
>  
> diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
> index e0ed4508269b..c4376d50786b 100644
> --- a/drivers/gpu/drm/xe/xe_gt_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_types.h
> @@ -76,6 +76,16 @@ enum xe_steering_type {
>  	NUM_STEERING_TYPES
>  };
>  
> +#define gt_to_tile(gt__)							\
> +	_Generic(gt__,								\
> +		 const struct xe_gt *: (const struct xe_tile *)((gt__)->tile),	\
> +		 struct xe_gt *: (gt__)->tile)
> +
> +#define gt_to_xe(gt__)										\
> +	_Generic(gt__,										\
> +		 const struct xe_gt *: (const struct xe_device *)(gt_to_tile(gt__)->xe),	\
> +		 struct xe_gt *: gt_to_tile(gt__)->xe)
> +
>  /**
>   * struct xe_gt - A "Graphics Technology" unit of the GPU
>   *
> @@ -90,8 +100,8 @@ enum xe_steering_type {
>   * within a tile.
>   */
>  struct xe_gt {
> -	/** @xe: backpointer to XE device */
> -	struct xe_device *xe;
> +	/** @tile: Backpointer to GT's tile */
> +	struct xe_tile *tile;
>  
>  	/** @info: GT info */
>  	struct {
> diff --git a/drivers/gpu/drm/xe/xe_mocs.c b/drivers/gpu/drm/xe/xe_mocs.c
> index 817afd301d52..d57fbf16a3ef 100644
> --- a/drivers/gpu/drm/xe/xe_mocs.c
> +++ b/drivers/gpu/drm/xe/xe_mocs.c
> @@ -471,7 +471,7 @@ static void __init_mocs_table(struct xe_gt *gt,
>  	unsigned int i;
>  	u32 mocs;
>  
> -	mocs_dbg(&gt->xe->drm, "entries:%d\n", info->n_entries);
> +	mocs_dbg(&gt_to_xe(gt)->drm, "entries:%d\n", info->n_entries);
>  	drm_WARN_ONCE(&xe->drm, !info->unused_entries_index,
>  		      "Unused entries index should have been defined\n");
>  	for (i = 0;
> @@ -479,7 +479,7 @@ static void __init_mocs_table(struct xe_gt *gt,
>  	     i++) {
>  		struct xe_reg reg = XE_REG(addr + i * 4);
>  
> -		mocs_dbg(&gt->xe->drm, "%d 0x%x 0x%x\n", i, reg.addr, mocs);
> +		mocs_dbg(&gt_to_xe(gt)->drm, "%d 0x%x 0x%x\n", i, reg.addr, mocs);
>  		xe_mmio_write32(gt, reg, mocs);
>  	}
>  }
> @@ -508,13 +508,13 @@ static void init_l3cc_table(struct xe_gt *gt,
>  	unsigned int i;
>  	u32 l3cc;
>  
> -	mocs_dbg(&gt->xe->drm, "entries:%d\n", info->n_entries);
> +	mocs_dbg(&gt_to_xe(gt)->drm, "entries:%d\n", info->n_entries);
>  	for (i = 0;
>  	     i < (info->n_entries + 1) / 2 ?
>  	     (l3cc = l3cc_combine(get_entry_l3cc(info, 2 * i),
>  				  get_entry_l3cc(info, 2 * i + 1))), 1 : 0;
>  	     i++) {
> -		mocs_dbg(&gt->xe->drm, "%d 0x%x 0x%x\n", i, LNCFCMOCS(i).addr,
> +		mocs_dbg(&gt_to_xe(gt)->drm, "%d 0x%x 0x%x\n", i, LNCFCMOCS(i).addr,
>  			 l3cc);
>  		xe_mmio_write32(gt, LNCFCMOCS(i), l3cc);
>  	}
> @@ -524,7 +524,7 @@ void xe_mocs_init_early(struct xe_gt *gt)
>  {
>  	struct xe_mocs_info table;
>  
> -	get_mocs_settings(gt->xe, &table);
> +	get_mocs_settings(gt_to_xe(gt), &table);
>  	gt->mocs.uc_index = table.uc_index;
>  	gt->mocs.wb_index = table.wb_index;
>  }
> @@ -537,8 +537,8 @@ void xe_mocs_init(struct xe_gt *gt)
>  	/*
>  	 * LLC and eDRAM control values are not applicable to dgfx
>  	 */
> -	flags = get_mocs_settings(gt->xe, &table);
> -	mocs_dbg(&gt->xe->drm, "flag:0x%x\n", flags);
> +	flags = get_mocs_settings(gt_to_xe(gt), &table);
> +	mocs_dbg(&gt_to_xe(gt)->drm, "flag:0x%x\n", flags);
>  
>  	if (flags & HAS_GLOBAL_MOCS)
>  		__init_mocs_table(gt, &table, GLOBAL_MOCS(0).addr);
> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> index e79b16d8bf7f..87c328106aca 100644
> --- a/drivers/gpu/drm/xe/xe_pci.c
> +++ b/drivers/gpu/drm/xe/xe_pci.c
> @@ -471,6 +471,7 @@ static int xe_info_init(struct xe_device *xe,
>  {
>  	const struct xe_graphics_desc *graphics_desc = NULL;
>  	const struct xe_media_desc *media_desc = NULL;
> +	struct xe_tile *tile;
>  	struct xe_gt *gt;
>  	u8 id;
>  
> @@ -525,13 +526,15 @@ static int xe_info_init(struct xe_device *xe,
>  	xe->info.step = xe_step_get(xe);
>  
>  	for (id = 0; id < xe->info.tile_count; ++id) {
> -		xe->tiles[id].xe = xe;
> -		xe->tiles[id].id = id;
> +		tile = &xe->tiles[id];
> +		tile->xe = xe;
> +		tile->id = id;
>  
> -		gt = &xe->tiles[id].primary_gt;
> +		gt = &tile->primary_gt;
>  		gt->info.id = id;
> -		gt->xe = xe;
> +		gt->tile = tile;
>  
> +		gt->info.id = id;
>  		if (id == 0) {
>  			gt->info.type = XE_GT_TYPE_MAIN;
>  			gt->info.vram_id = id;
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index f15282996c3b..61126cefe0b5 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -695,7 +695,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
>  		 * TODO: Suballocate the pt bo to avoid wasting a lot of
>  		 * memory.
>  		 */
> -		if (GRAPHICS_VERx100(xe_walk->gt->xe) >= 1250 && level == 1 &&
> +		if (GRAPHICS_VERx100(gt_to_xe(xe_walk->gt)) >= 1250 && level == 1 &&
>  		    covers && xe_pt_scan_64K(addr, next, xe_walk)) {
>  			walk->shifts = xe_compact_pt_shifts;
>  			flags |= XE_PDE_64K;
> -- 
> 2.40.0
> 

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Intel-xe] [PATCH 05/26] drm/xe: Move register MMIO into xe_tile
  2023-05-11 12:20   ` Jani Nikula
@ 2023-05-11 22:01     ` Lucas De Marchi
  0 siblings, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-11 22:01 UTC (permalink / raw)
  To: Jani Nikula; +Cc: Matt Roper, intel-xe

On Thu, May 11, 2023 at 03:20:35PM +0300, Jani Nikula wrote:
>On Wed, 10 May 2023, Matt Roper <matthew.d.roper@intel.com> wrote:
>> Each tile has its own register region in the BAR, containing instances
>> of all registers for the platform.  In contrast, the multiple GTs within
>> a tile share the same MMIO space; there's just a small subset of
>> registers (the GSI registers) which have multiple copies at different
>> offsets (0x0 for primary GT, 0x380000 for media GT).  Move the register
>> MMIO region size/pointers to the tile structure, leaving just the GSI
>> offset information in the GT structure.
>>
>> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>> ---
>>  drivers/gpu/drm/xe/display/ext/i915_irq.c |  2 +-
>>  drivers/gpu/drm/xe/xe_device_types.h      | 16 ++++++++++++++
>>  drivers/gpu/drm/xe/xe_ggtt.c              |  3 ++-
>>  drivers/gpu/drm/xe/xe_gt_types.h          |  9 +++-----
>>  drivers/gpu/drm/xe/xe_mmio.c              | 26 ++++++++++++-----------
>>  drivers/gpu/drm/xe/xe_mmio.h              | 21 +++++++++++++-----
>>  6 files changed, 52 insertions(+), 25 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/display/ext/i915_irq.c b/drivers/gpu/drm/xe/display/ext/i915_irq.c
>> index afde97b6faa6..a9cbd7b59360 100644
>> --- a/drivers/gpu/drm/xe/display/ext/i915_irq.c
>> +++ b/drivers/gpu/drm/xe/display/ext/i915_irq.c
>> @@ -920,7 +920,7 @@ gen8_de_irq_handler(struct drm_i915_private *dev_priv, u32 master_ctl)
>>
>>  void gen11_display_irq_handler(struct drm_i915_private *i915)
>>  {
>> -	void __iomem * const regs = to_gt(i915)->mmio.regs;
>> +	void __iomem * const regs = xe_device_get_root_tile(i915)->mmio.regs;
>
>Side note, I'm hoping to merge [1] into i915, backport (or rebase) that
>into xe, nuking ext/i915_irq.c completely.
>
>IDK if that means adding new ifdefs in the display irq file, or how we
>could abstract this in a way that doesn't require changes in i915
>display code.
>
>BR,
>Jani.
>
>
>[1] https://patchwork.freedesktop.org/series/117344/

Looking at drivers/gpu/drm/i915/display/intel_display_irq.h, it seems
that would be the interface to xe, so the call chain would be:

xe_irq.c -> xe_display.c -> i915-display,

with xe_display.c probably using an ops-style table, filled in on init:

	display_irq->enable = ...
	display_irq->disable = ...
	display_irq->reset = ...
	display_irq->handler = ...


so in xe_display.c we'd have:


void xe_display_irq_handler(struct xe_device *xe, u32 master_ctl)
{
         if (!xe->info.enable_display)
                 return;

         if (master_ctl & DISPLAY_IRQ)
                 xe->display_irq->handler(xe);	/* argument list TBD */
}

... and so on


Or even keeping the same xe_display.c we have today, but pointing the
functions to the right place.
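
To make the ops-style idea a bit more concrete, here is a minimal
userspace sketch. Everything in it is an assumption for illustration:
the xe_display_irq_ops struct name, the handler signature, the value of
DISPLAY_IRQ, and the cut-down struct xe_device are all hypothetical and
not taken from the driver or from i915-display.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position; the real DISPLAY_IRQ value is not given here. */
#define DISPLAY_IRQ (1u << 16)

struct xe_device;

/* Hypothetical ops table, populated once at init time. */
struct xe_display_irq_ops {
	void (*enable)(struct xe_device *xe);
	void (*disable)(struct xe_device *xe);
	void (*reset)(struct xe_device *xe);
	void (*handler)(struct xe_device *xe);
};

/* Cut-down stand-in for the real struct xe_device. */
struct xe_device {
	struct {
		bool enable_display;
	} info;
	const struct xe_display_irq_ops *display_irq;
	unsigned int display_irqs_seen;	/* demo-only counter */
};

/* Stand-in for whatever i915-display entry point would really be called. */
static void demo_display_handler(struct xe_device *xe)
{
	xe->display_irqs_seen++;
}

static const struct xe_display_irq_ops demo_display_irq_ops = {
	.handler = demo_display_handler,
};

void xe_display_irq_handler(struct xe_device *xe, uint32_t master_ctl)
{
	if (!xe->info.enable_display)
		return;

	/* Dispatch only when the display bit is set and a handler is wired up. */
	if ((master_ctl & DISPLAY_IRQ) && xe->display_irq &&
	    xe->display_irq->handler)
		xe->display_irq->handler(xe);
}
```

The point of the indirection is that xe_irq.c only ever calls
xe_display_irq_handler(), so the i915-display side could change without
touching the xe interrupt code.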

Lucas De Marchi

>
>
>
>>  	const u32 disp_ctl = raw_reg_read(regs, GEN11_DISPLAY_INT_CTL);
>>
>>  	/*
>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>> index 5dcf1695925f..2481b2045284 100644
>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>> @@ -80,6 +80,22 @@ struct xe_tile {
>>  	struct xe_gt primary_gt;
>>
>>  	/* TODO: Add media GT here */
>> +
>> +	/**
>> +	 * @mmio: MMIO info for a tile.
>> +	 *
>> +	 * Each tile has its own 16MB space in BAR0, laid out as:
>> +	 * * 0-4MB: registers
>> +	 * * 4MB-8MB: reserved
>> +	 * * 8MB-16MB: global GTT
>> +	 */
>> +	struct {
>> +		/** @size: size of tile's MMIO space */
>> +		size_t size;
>> +
>> +		/** @regs: pointer to tile's MMIO space (starting with registers) */
>> +		void *regs;
>> +	} mmio;
>>  };
>>
>>  /**
>> diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
>> index 546240261e0a..200976da3dc1 100644
>> --- a/drivers/gpu/drm/xe/xe_ggtt.c
>> +++ b/drivers/gpu/drm/xe/xe_ggtt.c
>> @@ -93,6 +93,7 @@ static void ggtt_fini_noalloc(struct drm_device *drm, void *arg)
>>  int xe_ggtt_init_noalloc(struct xe_gt *gt, struct xe_ggtt *ggtt)
>>  {
>>  	struct xe_device *xe = gt_to_xe(gt);
>> +	struct xe_tile *tile = gt_to_tile(gt);
>>  	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
>>  	unsigned int gsm_size;
>>
>> @@ -106,7 +107,7 @@ int xe_ggtt_init_noalloc(struct xe_gt *gt, struct xe_ggtt *ggtt)
>>  		return -ENOMEM;
>>  	}
>>
>> -	ggtt->gsm = gt->mmio.regs + SZ_8M;
>> +	ggtt->gsm = tile->mmio.regs + SZ_8M;
>>  	ggtt->size = (gsm_size / 8) * (u64) XE_PAGE_SIZE;
>>
>>  	if (IS_DGFX(xe) && xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
>> diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
>> index c4376d50786b..03dd625b2781 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_types.h
>> +++ b/drivers/gpu/drm/xe/xe_gt_types.h
>> @@ -124,14 +124,11 @@ struct xe_gt {
>>  	} info;
>>
>>  	/**
>> -	 * @mmio: mmio info for GT, can be subset of the global device mmio
>> -	 * space
>> +	 * @mmio: mmio info for GT.  All GTs within a tile share the same
>> +	 * register space, but have their own copy of GSI registers at a
>> +	 * specific offset, as well as their own forcewake handling.
>>  	 */
>>  	struct {
>> -		/** @size: size of MMIO space on GT */
>> -		size_t size;
>> -		/** @regs: pointer to MMIO space on GT */
>> -		void *regs;
>>  		/** @fw: force wake for GT */
>>  		struct xe_force_wake fw;
>>  		/**
>> diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
>> index 254b4a63d901..54fa1212fcd9 100644
>> --- a/drivers/gpu/drm/xe/xe_mmio.c
>> +++ b/drivers/gpu/drm/xe/xe_mmio.c
>> @@ -307,6 +307,7 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
>>
>>  	if (xe->info.tile_count > 1) {
>>  		const int mmio_bar = 0;
>> +		struct xe_tile *tile;
>>  		size_t size;
>>  		void *regs;
>>
>> @@ -320,11 +321,11 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
>>  		size = xe->mmio.size / adj_tile_count;
>>  		regs = xe->mmio.regs;
>>
>> -		for_each_gt(gt, xe, id) {
>> -			if (id && !xe_gt_is_media_type(gt))
>> -				regs += size;
>> -			gt->mmio.size = size;
>> -			gt->mmio.regs = regs;
>> +		for_each_tile(tile, xe, id) {
>> +			tile->mmio.size = size;
>> +			tile->mmio.regs = regs;
>> +
>> +			regs += size;
>>  		}
>>  	}
>>  }
>> @@ -340,15 +341,16 @@ static void mmio_fini(struct drm_device *drm, void *arg)
>>
>>  int xe_mmio_init(struct xe_device *xe)
>>  {
>> +	struct xe_tile *root_tile = xe_device_get_root_tile(xe);
>>  	struct xe_gt *gt = xe_device_get_gt(xe, 0);
>>  	const int mmio_bar = 0;
>>  	int err;
>>
>>  	/*
>> -	 * Map the entire BAR, which includes registers (0-4MB), reserved space
>> -	 * (4MB-8MB), and GGTT (8MB-16MB). Other parts of the driver (GTs,
>> -	 * GGTTs) will derive the pointers they need from the mapping in the
>> -	 * device structure.
>> +	 * Map the first 16MB of the BAR, which includes the registers (0-4MB),
>> +	 * reserved space (4MB-8MB), and GGTT (8MB-16MB) for a single tile.
>> +	 * This will get remapped later if we determine that we're running
>> +	 * on a multi-tile system.
>>  	 */
>>  	xe->mmio.size = SZ_16M;
>>  	xe->mmio.regs = pci_iomap(to_pci_dev(xe->drm.dev), mmio_bar,
>> @@ -362,9 +364,9 @@ int xe_mmio_init(struct xe_device *xe)
>>  	if (err)
>>  		return err;
>>
>> -	/* 1 GT for now, 1 to 1 mapping, may change on multi-GT devices */
>> -	gt->mmio.size = xe->mmio.size;
>> -	gt->mmio.regs = xe->mmio.regs;
>> +	/* Setup first tile; other tiles (if present) will be setup later. */
>> +	root_tile->mmio.size = xe->mmio.size;
>> +	root_tile->mmio.regs = xe->mmio.regs;
>>
>>  	/*
>>  	 * The boot firmware initializes local memory and assesses its health.
>> diff --git a/drivers/gpu/drm/xe/xe_mmio.h b/drivers/gpu/drm/xe/xe_mmio.h
>> index 1407f1189b0d..acf0b18f3111 100644
>> --- a/drivers/gpu/drm/xe/xe_mmio.h
>> +++ b/drivers/gpu/drm/xe/xe_mmio.h
>> @@ -10,6 +10,7 @@
>>  #include <linux/io-64-nonatomic-lo-hi.h>
>>
>>  #include "regs/xe_reg_defs.h"
>> +#include "xe_device_types.h"
>>  #include "xe_gt_types.h"
>>
>>  struct drm_device;
>> @@ -20,27 +21,33 @@ int xe_mmio_init(struct xe_device *xe);
>>
>>  static inline u8 xe_mmio_read8(struct xe_gt *gt, struct xe_reg reg)
>>  {
>> +	struct xe_tile *tile = gt_to_tile(gt);
>> +
>>  	if (reg.addr < gt->mmio.adj_limit)
>>  		reg.addr += gt->mmio.adj_offset;
>>
>> -	return readb(gt->mmio.regs + reg.addr);
>> +	return readb(tile->mmio.regs + reg.addr);
>>  }
>>
>>  static inline void xe_mmio_write32(struct xe_gt *gt,
>>  				   struct xe_reg reg, u32 val)
>>  {
>> +	struct xe_tile *tile = gt_to_tile(gt);
>> +
>>  	if (reg.addr < gt->mmio.adj_limit)
>>  		reg.addr += gt->mmio.adj_offset;
>>
>> -	writel(val, gt->mmio.regs + reg.addr);
>> +	writel(val, tile->mmio.regs + reg.addr);
>>  }
>>
>>  static inline u32 xe_mmio_read32(struct xe_gt *gt, struct xe_reg reg)
>>  {
>> +	struct xe_tile *tile = gt_to_tile(gt);
>> +
>>  	if (reg.addr < gt->mmio.adj_limit)
>>  		reg.addr += gt->mmio.adj_offset;
>>
>> -	return readl(gt->mmio.regs + reg.addr);
>> +	return readl(tile->mmio.regs + reg.addr);
>>  }
>>
>>  static inline u32 xe_mmio_rmw32(struct xe_gt *gt, struct xe_reg reg, u32 clr,
>> @@ -58,18 +65,22 @@ static inline u32 xe_mmio_rmw32(struct xe_gt *gt, struct xe_reg reg, u32 clr,
>>  static inline void xe_mmio_write64(struct xe_gt *gt,
>>  				   struct xe_reg reg, u64 val)
>>  {
>> +	struct xe_tile *tile = gt_to_tile(gt);
>> +
>>  	if (reg.addr < gt->mmio.adj_limit)
>>  		reg.addr += gt->mmio.adj_offset;
>>
>> -	writeq(val, gt->mmio.regs + reg.addr);
>> +	writeq(val, tile->mmio.regs + reg.addr);
>>  }
>>
>>  static inline u64 xe_mmio_read64(struct xe_gt *gt, struct xe_reg reg)
>>  {
>> +	struct xe_tile *tile = gt_to_tile(gt);
>> +
>>  	if (reg.addr < gt->mmio.adj_limit)
>>  		reg.addr += gt->mmio.adj_offset;
>>
>> -	return readq(gt->mmio.regs + reg.addr);
>> +	return readq(tile->mmio.regs + reg.addr);
>>  }
>>
>>  static inline int xe_mmio_write32_and_verify(struct xe_gt *gt,
>
>-- 
>Jani Nikula, Intel Open Source Graphics Center
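
For reference, the addressing scheme described in the quoted patch (one
stacked register slice per tile, plus a per-GT GSI offset) can be
sketched in userspace C as follows. This is only an illustration under
stated assumptions: the struct and function names (tile_mmio, gt_mmio,
probe_tiles, gt_reg_addr) are hypothetical, and addresses are plain
integers rather than __iomem pointers. The 0x40000 limit and 0x380000
offset for the media GT are the values quoted elsewhere in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define SZ_16M ((size_t)16 << 20)

/* Per-tile view of BAR0; addresses are plain integers in this sketch. */
struct tile_mmio {
	size_t size;
	uintptr_t regs;
};

/* Per-GT GSI adjustment, mirroring gt->mmio.adj_limit / adj_offset. */
struct gt_mmio {
	uint32_t adj_limit;
	uint32_t adj_offset;
};

/* Carve an evenly stacked BAR mapping into one slice per tile. */
static void probe_tiles(uintptr_t bar, size_t bar_size,
			struct tile_mmio *tiles, unsigned int tile_count)
{
	size_t size = bar_size / tile_count;
	uintptr_t regs = bar;
	unsigned int id;

	for (id = 0; id < tile_count; id++) {
		tiles[id].size = size;
		tiles[id].regs = regs;
		regs += size;
	}
}

/* Resolve a register: tile base plus (possibly GSI-adjusted) offset. */
static uintptr_t gt_reg_addr(const struct tile_mmio *tile,
			     const struct gt_mmio *gt, uint32_t addr)
{
	if (addr < gt->adj_limit)
		addr += gt->adj_offset;

	return tile->regs + addr;
}
```

With two tiles, a media-GT access to register 0x1000 on tile 1 would
resolve to tiles[1].regs + 0x381000, while any address at or above the
0x40000 limit is left unadjusted.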

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Intel-xe] [PATCH 04/26] drm/xe: Add for_each_tile iterator
  2023-05-11  3:47 ` [Intel-xe] [PATCH 04/26] drm/xe: Add for_each_tile iterator Matt Roper
@ 2023-05-11 23:23   ` Lucas De Marchi
  2023-05-12  5:45   ` Iddamsetty, Aravind
  1 sibling, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-11 23:23 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:00PM -0700, Matt Roper wrote:
>As we start splitting tile handling out from GT handling, we'll need to
>be able to iterate over tiles separately from GTs.  This iterator will
>be used in upcoming patches.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

Lucas De Marchi

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Intel-xe] [PATCH 01/26] drm/xe/mtl: Disable media GT
  2023-05-11  3:46 ` [Intel-xe] [PATCH 01/26] drm/xe/mtl: Disable media GT Matt Roper
  2023-05-11 20:50   ` Matt Atwood
@ 2023-05-11 23:29   ` Lucas De Marchi
  2023-05-12 15:38     ` Matt Roper
  1 sibling, 1 reply; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-11 23:29 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:46:57PM -0700, Matt Roper wrote:
>Xe incorrectly conflates the concept of 'tile' and 'GT.'  Since MTL's
>media support is not yet functioning properly, let's just disable it
>completely for now while we fix the fundamental driver design.  Support
>for media GTs on platforms like MTL will be re-added later.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/xe/xe_gt_mcr.c |  2 +-
> drivers/gpu/drm/xe/xe_mmio.c   |  2 --
> drivers/gpu/drm/xe/xe_pci.c    | 15 ++-------------
> 3 files changed, 3 insertions(+), 16 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_gt_mcr.c b/drivers/gpu/drm/xe/xe_gt_mcr.c
>index 3db550c85e32..be80fdc4b5a2 100644
>--- a/drivers/gpu/drm/xe/xe_gt_mcr.c
>+++ b/drivers/gpu/drm/xe/xe_gt_mcr.c
>@@ -293,7 +293,7 @@ void xe_gt_mcr_init(struct xe_gt *gt)
>
> 	spin_lock_init(&gt->mcr_lock);
>
>-	if (gt->info.type == XE_GT_TYPE_MEDIA) {
>+	if (xe_gt_is_media_type(gt)) {

Was there a squashing issue here?  If the media GT is being removed,
why replace the open-coded gt type check with xe_gt_is_media_type()?
Shouldn't you then remove the xe_gt_is_media_type() check and this whole
branch, since it's only for MEDIA_VER(xe) >= 13?

Or just leave this branch as is...

> 		drm_WARN_ON(&xe->drm, MEDIA_VER(xe) < 13);
>
> 		gt->steering[OADDRM].ranges = xelpmp_oaddrm_steering_table;
>diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
>index c7fbb1cc1f64..4804616a3c44 100644
>--- a/drivers/gpu/drm/xe/xe_mmio.c
>+++ b/drivers/gpu/drm/xe/xe_mmio.c
>@@ -301,8 +301,6 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
> 	mtcfg = xe_mmio_read64(gt, XEHP_MTCFG_ADDR);
> 	adj_tile_count = xe->info.tile_count =
> 		REG_FIELD_GET(TILE_COUNT, mtcfg) + 1;
>-	if (xe->info.media_verx100 >= 1300)
>-		xe->info.tile_count *= 2;
>
> 	drm_info(&xe->drm, "tile_count: %d, adj_tile_count %d\n",
> 		 xe->info.tile_count, adj_tile_count);
>diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>index a6858fc7fe8d..bf2c234c4f6e 100644
>--- a/drivers/gpu/drm/xe/xe_pci.c
>+++ b/drivers/gpu/drm/xe/xe_pci.c
>@@ -19,6 +19,7 @@
> #include "xe_device.h"
> #include "xe_display.h"
> #include "xe_drv.h"
>+#include "xe_gt.h"
> #include "xe_macros.h"
> #include "xe_module.h"
> #include "xe_pci_types.h"
>@@ -271,20 +272,10 @@ static const struct xe_device_desc pvc_desc = {
> 	.extra_gts = pvc_gts,
> };
>
>-static const struct xe_gt_desc xelpmp_gts[] = {
>-	{
>-		.type = XE_GT_TYPE_MEDIA,
>-		.vram_id = 0,
>-		.mmio_adj_limit = 0x40000,
>-		.mmio_adj_offset = 0x380000,
>-	},
>-};
>-
> static const struct xe_device_desc mtl_desc = {
> 	/* .graphics and .media determined via GMD_ID */
> 	.require_force_probe = true,
> 	PLATFORM(XE_METEORLAKE),
>-	.extra_gts = xelpmp_gts,
> };
>
> #undef PLATFORM
>@@ -528,8 +519,6 @@ static int xe_info_init(struct xe_device *xe,
> 	 * treats it as the number of GTs rather than just the number of tiles.
> 	 */
> 	xe->info.tile_count = 1 + graphics_desc->max_remote_tiles;
>-	if (MEDIA_VER(xe) >= 13)
>-		xe->info.tile_count++;
>
> 	xe->info.subplatform = subplatform_desc ?
> 		subplatform_desc->subplatform : XE_SUBPLATFORM_NONE;
>@@ -553,7 +542,7 @@ static int xe_info_init(struct xe_device *xe,
> 		} else {
> 			gt->info.type = desc->extra_gts[id - 1].type;
> 			gt->info.vram_id = desc->extra_gts[id - 1].vram_id;
>-			gt->info.__engine_mask = (gt->info.type == XE_GT_TYPE_MEDIA) ?
>+			gt->info.__engine_mask = xe_gt_is_media_type(gt) ?

same thing here. /me confused

The rest of the patch makes sense and we could leave these lines alone.
It seems the main goal of this patch is actually to free up tile_count
so that later patches can use it for actual tiles rather than GTs.

Lucas De Marchi

> 				media_desc->hw_engine_mask :
> 				graphics_desc->hw_engine_mask;
> 			gt->mmio.adj_limit =
>-- 
>2.40.0
>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Intel-xe] [PATCH 03/26] drm/xe: Add backpointer from gt to tile
  2023-05-11  3:46 ` [Intel-xe] [PATCH 03/26] drm/xe: Add backpointer from gt to tile Matt Roper
  2023-05-11 21:10   ` Matt Atwood
@ 2023-05-12  0:07   ` Lucas De Marchi
  2023-05-12 16:20     ` Matt Roper
  1 sibling, 1 reply; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-12  0:07 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:46:59PM -0700, Matt Roper wrote:
>Rather than a backpointer to the xe_device, a GT should have a
>backpointer to its tile (which can then be used to lookup the device if
>necessary).
>
>The gt_to_xe() helper macro (which moves from xe_gt.h to xe_gt_types.h)
>can and should still be used to jump directly from an xe_gt to
>xe_device.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/xe/xe_bb.c                  |  2 +-
> drivers/gpu/drm/xe/xe_gt.h                  |  5 -----
> drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |  4 ++--
> drivers/gpu/drm/xe/xe_gt_types.h            | 14 ++++++++++++--
> drivers/gpu/drm/xe/xe_mocs.c                | 14 +++++++-------
> drivers/gpu/drm/xe/xe_pci.c                 | 11 +++++++----
> drivers/gpu/drm/xe/xe_pt.c                  |  2 +-
> 7 files changed, 30 insertions(+), 22 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_bb.c b/drivers/gpu/drm/xe/xe_bb.c
>index 3deb2d55f421..bf7c94b769d7 100644
>--- a/drivers/gpu/drm/xe/xe_bb.c
>+++ b/drivers/gpu/drm/xe/xe_bb.c
>@@ -16,7 +16,7 @@
>
> static int bb_prefetch(struct xe_gt *gt)
> {
>-	struct xe_device *xe = gt->xe;
>+	struct xe_device *xe = gt_to_xe(gt);
>
> 	if (GRAPHICS_VERx100(xe) >= 1250 && !xe_gt_is_media_type(gt))
> 		/*
>diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
>index 086369f7ee6d..f4e98f499b36 100644
>--- a/drivers/gpu/drm/xe/xe_gt.h
>+++ b/drivers/gpu/drm/xe/xe_gt.h
>@@ -49,11 +49,6 @@ static inline bool xe_gt_is_media_type(struct xe_gt *gt)
> 	return gt->info.type == XE_GT_TYPE_MEDIA;
> }
>
>-#define gt_to_xe(gt__)								\
>-	_Generic(gt__,								\
>-		 const struct xe_gt *: (const struct xe_device *)((gt__)->xe),	\
>-		 struct xe_gt *: (gt__)->xe)
>-
> static inline bool xe_gt_is_usm_hwe(struct xe_gt *gt, struct xe_hw_engine *hwe)
> {
> 	struct xe_device *xe = gt_to_xe(gt);
>diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
>index c815a42e2cdb..c9e8825c02aa 100644
>--- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
>+++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
>@@ -322,8 +322,8 @@ int xe_guc_tlb_invalidation_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
> 		TLB_INVALIDATION_SEQNO_MAX;
> 	if (!expected_seqno)
> 		expected_seqno = 1;
>-	if (drm_WARN_ON(&gt->xe->drm, expected_seqno != msg[0])) {
>-		drm_err(&gt->xe->drm, "TLB expected_seqno(%d) != msg(%u)\n",
>+	if (drm_WARN_ON(&gt_to_xe(gt)->drm, expected_seqno != msg[0])) {
>+		drm_err(&gt_to_xe(gt)->drm, "TLB expected_seqno(%d) != msg(%u)\n",
> 			expected_seqno, msg[0]);
> 	}
>
>diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
>index e0ed4508269b..c4376d50786b 100644
>--- a/drivers/gpu/drm/xe/xe_gt_types.h
>+++ b/drivers/gpu/drm/xe/xe_gt_types.h
>@@ -76,6 +76,16 @@ enum xe_steering_type {
> 	NUM_STEERING_TYPES
> };
>
>+#define gt_to_tile(gt__)							\
>+	_Generic(gt__,								\
>+		 const struct xe_gt *: (const struct xe_tile *)((gt__)->tile),	\
>+		 struct xe_gt *: (gt__)->tile)
>+
>+#define gt_to_xe(gt__)										\
>+	_Generic(gt__,										\
>+		 const struct xe_gt *: (const struct xe_device *)(gt_to_tile(gt__)->xe),	\
>+		 struct xe_gt *: gt_to_tile(gt__)->xe)
>+
> /**
>  * struct xe_gt - A "Graphics Technology" unit of the GPU
>  *
>@@ -90,8 +100,8 @@ enum xe_steering_type {
>  * within a tile.
>  */
> struct xe_gt {
>-	/** @xe: backpointer to XE device */
>-	struct xe_device *xe;
>+	/** @tile: Backpointer to GT's tile */
>+	struct xe_tile *tile;
>
> 	/** @info: GT info */
> 	struct {
>diff --git a/drivers/gpu/drm/xe/xe_mocs.c b/drivers/gpu/drm/xe/xe_mocs.c
>index 817afd301d52..d57fbf16a3ef 100644
>--- a/drivers/gpu/drm/xe/xe_mocs.c
>+++ b/drivers/gpu/drm/xe/xe_mocs.c
>@@ -471,7 +471,7 @@ static void __init_mocs_table(struct xe_gt *gt,
> 	unsigned int i;
> 	u32 mocs;
>
>-	mocs_dbg(&gt->xe->drm, "entries:%d\n", info->n_entries);
>+	mocs_dbg(&gt_to_xe(gt)->drm, "entries:%d\n", info->n_entries);
> 	drm_WARN_ONCE(&xe->drm, !info->unused_entries_index,
> 		      "Unused entries index should have been defined\n");
> 	for (i = 0;
>@@ -479,7 +479,7 @@ static void __init_mocs_table(struct xe_gt *gt,
> 	     i++) {
> 		struct xe_reg reg = XE_REG(addr + i * 4);
>
>-		mocs_dbg(&gt->xe->drm, "%d 0x%x 0x%x\n", i, reg.addr, mocs);
>+		mocs_dbg(&gt_to_xe(gt)->drm, "%d 0x%x 0x%x\n", i, reg.addr, mocs);
> 		xe_mmio_write32(gt, reg, mocs);
> 	}
> }
>@@ -508,13 +508,13 @@ static void init_l3cc_table(struct xe_gt *gt,
> 	unsigned int i;
> 	u32 l3cc;
>
>-	mocs_dbg(&gt->xe->drm, "entries:%d\n", info->n_entries);
>+	mocs_dbg(&gt_to_xe(gt)->drm, "entries:%d\n", info->n_entries);
> 	for (i = 0;
> 	     i < (info->n_entries + 1) / 2 ?
> 	     (l3cc = l3cc_combine(get_entry_l3cc(info, 2 * i),
> 				  get_entry_l3cc(info, 2 * i + 1))), 1 : 0;
> 	     i++) {
>-		mocs_dbg(&gt->xe->drm, "%d 0x%x 0x%x\n", i, LNCFCMOCS(i).addr,
>+		mocs_dbg(&gt_to_xe(gt)->drm, "%d 0x%x 0x%x\n", i, LNCFCMOCS(i).addr,
> 			 l3cc);
> 		xe_mmio_write32(gt, LNCFCMOCS(i), l3cc);
> 	}
>@@ -524,7 +524,7 @@ void xe_mocs_init_early(struct xe_gt *gt)
> {
> 	struct xe_mocs_info table;
>
>-	get_mocs_settings(gt->xe, &table);
>+	get_mocs_settings(gt_to_xe(gt), &table);
> 	gt->mocs.uc_index = table.uc_index;
> 	gt->mocs.wb_index = table.wb_index;
> }
>@@ -537,8 +537,8 @@ void xe_mocs_init(struct xe_gt *gt)
> 	/*
> 	 * LLC and eDRAM control values are not applicable to dgfx
> 	 */
>-	flags = get_mocs_settings(gt->xe, &table);
>-	mocs_dbg(&gt->xe->drm, "flag:0x%x\n", flags);
>+	flags = get_mocs_settings(gt_to_xe(gt), &table);
>+	mocs_dbg(&gt_to_xe(gt)->drm, "flag:0x%x\n", flags);
>
> 	if (flags & HAS_GLOBAL_MOCS)
> 		__init_mocs_table(gt, &table, GLOBAL_MOCS(0).addr);
>diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>index e79b16d8bf7f..87c328106aca 100644
>--- a/drivers/gpu/drm/xe/xe_pci.c
>+++ b/drivers/gpu/drm/xe/xe_pci.c
>@@ -471,6 +471,7 @@ static int xe_info_init(struct xe_device *xe,
> {
> 	const struct xe_graphics_desc *graphics_desc = NULL;
> 	const struct xe_media_desc *media_desc = NULL;
>+	struct xe_tile *tile;
> 	struct xe_gt *gt;
> 	u8 id;
>
>@@ -525,13 +526,15 @@ static int xe_info_init(struct xe_device *xe,
> 	xe->info.step = xe_step_get(xe);
>
> 	for (id = 0; id < xe->info.tile_count; ++id) {
>-		xe->tiles[id].xe = xe;
>-		xe->tiles[id].id = id;
>+		tile = &xe->tiles[id];
>+		tile->xe = xe;
>+		tile->id = id;
>
>-		gt = &xe->tiles[id].primary_gt;
>+		gt = &tile->primary_gt;

These seem to have been squashed into the wrong patch: they change the
style from using xe->tiles[i].* to declaring a local var. Maybe squash
them into the previous patch to avoid the back and forth here?

otherwise,


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

> 		gt->info.id = id;
>-		gt->xe = xe;
>+		gt->tile = tile;
>
>+		gt->info.id = id;

So the GT then has the ID of the tile? Because right now there can only
be a primary_gt? Looks odd, but it's a consequence of removing the only
platform with 1 tile and multiple GTs.
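
As a point of reference, the backpointer chain and the const-preserving
_Generic helpers from this patch can be exercised in a small userspace
sketch. The cut-down structs and the init_tiles() helper below are
assumptions made for illustration only; the two macros follow the shape
of the ones in the quoted diff.

```c
#include <assert.h>
#include <stdint.h>

#define XE_MAX_TILES_PER_DEVICE 2

struct xe_device;
struct xe_tile;

/* Cut-down stand-ins for the real driver structures. */
struct xe_gt {
	struct xe_tile *tile;	/* backpointer to the owning tile */
	uint8_t id;
};

struct xe_tile {
	struct xe_device *xe;	/* backpointer to the device */
	uint8_t id;
	struct xe_gt primary_gt;
};

struct xe_device {
	uint8_t tile_count;
	struct xe_tile tiles[XE_MAX_TILES_PER_DEVICE];
};

/* A const GT yields a const tile/device; a non-const GT a non-const one. */
#define gt_to_tile(gt__)							\
	_Generic(gt__,								\
		 const struct xe_gt *: (const struct xe_tile *)((gt__)->tile),	\
		 struct xe_gt *: (gt__)->tile)

#define gt_to_xe(gt__)										\
	_Generic(gt__,										\
		 const struct xe_gt *: (const struct xe_device *)(gt_to_tile(gt__)->xe),	\
		 struct xe_gt *: gt_to_tile(gt__)->xe)

/* Hypothetical helper mirroring the loop in xe_info_init(). */
static void init_tiles(struct xe_device *xe, uint8_t tile_count)
{
	uint8_t id;

	assert(tile_count <= XE_MAX_TILES_PER_DEVICE);
	xe->tile_count = tile_count;

	for (id = 0; id < tile_count; ++id) {
		struct xe_tile *tile = &xe->tiles[id];

		tile->xe = xe;
		tile->id = id;
		tile->primary_gt.tile = tile;
		/* One GT per tile for now, so the GT ID equals the tile ID. */
		tile->primary_gt.id = id;
	}
}
```

The GT never stores the device directly; gt_to_xe() always goes through
the tile, which is why the GT ID and tile ID coincide only as long as
each tile has a single GT.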

Lucas De Marchi

> 		if (id == 0) {
> 			gt->info.type = XE_GT_TYPE_MAIN;
> 			gt->info.vram_id = id;
>diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
>index f15282996c3b..61126cefe0b5 100644
>--- a/drivers/gpu/drm/xe/xe_pt.c
>+++ b/drivers/gpu/drm/xe/xe_pt.c
>@@ -695,7 +695,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
> 		 * TODO: Suballocate the pt bo to avoid wasting a lot of
> 		 * memory.
> 		 */
>-		if (GRAPHICS_VERx100(xe_walk->gt->xe) >= 1250 && level == 1 &&
>+		if (GRAPHICS_VERx100(gt_to_xe(xe_walk->gt)) >= 1250 && level == 1 &&
> 		    covers && xe_pt_scan_64K(addr, next, xe_walk)) {
> 			walk->shifts = xe_compact_pt_shifts;
> 			flags |= XE_PDE_64K;
>-- 
>2.40.0
>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [Intel-xe] [PATCH 02/26] drm/xe: Introduce xe_tile
  2023-05-11  3:46 ` [Intel-xe] [PATCH 02/26] drm/xe: Introduce xe_tile Matt Roper
  2023-05-11  5:46   ` Lucas De Marchi
@ 2023-05-12  5:33   ` Iddamsetty, Aravind
  2023-05-12 16:27     ` Matt Roper
  2023-05-12  5:45   ` Iddamsetty, Aravind
  2023-05-18 17:35   ` Rodrigo Vivi
  3 siblings, 1 reply; 75+ messages in thread
From: Iddamsetty, Aravind @ 2023-05-12  5:33 UTC (permalink / raw)
  To: Matt Roper, intel-xe



On 11-05-2023 09:16, Matt Roper wrote:
> Create a new xe_tile structure to begin separating the concept of "tile"
> from "GT."  A tile is effectively a complete GPU, and a GT is just one
> part of that.  On platforms like MTL, there's only a single full GPU
> (tile) which has its IP blocks provided by two GTs.  In contrast, a
> "multi-tile" platform like PVC is basically multiple complete GPUs
> packed behind a single PCI device.
> 
> For now, just create xe_tile as a simple wrapper around xe_gt.  The
> items in xe_gt that are truly tied to the tile rather than the GT will
> be moved in future patches.  Support for multiple GTs per tile (i.e.,
> the MTL standalone media case) will also be re-introduced in a future
> patch.
> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_device.h       | 11 +++++---
>  drivers/gpu/drm/xe/xe_device_types.h | 40 +++++++++++++++++++++++++---
>  drivers/gpu/drm/xe/xe_gt_types.h     | 15 +++++++----
>  drivers/gpu/drm/xe/xe_mmio.c         | 13 ++++-----
>  drivers/gpu/drm/xe/xe_pci.c          |  5 +++-
>  drivers/gpu/drm/xe/xe_vm.c           |  2 +-
>  drivers/gpu/drm/xe/xe_vm_types.h     |  8 +++---
>  7 files changed, 71 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index cbae480a2092..f7acaf51a1fc 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -48,12 +48,17 @@ static inline struct xe_file *to_xe_file(const struct drm_file *file)
>  	return file->driver_priv;
>  }
>  
> +static inline struct xe_tile *xe_device_get_root_tile(struct xe_device *xe)
> +{
> +	return &xe->tiles[0];
> +}
> +
>  static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
>  {
>  	struct xe_gt *gt;
>  
> -	XE_BUG_ON(gt_id > XE_MAX_GT);
> -	gt = xe->gt + gt_id;
> +	XE_BUG_ON(gt_id > XE_MAX_TILES_PER_DEVICE);

Why do we expect the number of GTs to be less than the number of tiles?
> +	gt = &xe->tiles[gt_id].primary_gt;

Somehow this doesn't look correct to me: if GTs are per tile, using the
GT ID to index the tiles array might not be right.

Also, this routine always returns primary_gt, but the GT ID can
correspond to other GTs as well.

please check below.
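
To illustrate the concern, here is a small userspace sketch of one
possible lookup that separates the GT ID from the tile index. This is
purely hypothetical and is not what the patch does: the sketch_* names,
the fixed two-GTs-per-tile layout, and the numbering rule
(gt_id = tile_id * MAX_GT_PER_TILE + slot) are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TILES	2
#define MAX_GT_PER_TILE	2	/* assumption: primary + media */

struct sketch_gt {
	uint8_t id;
	int present;
};

struct sketch_tile {
	uint8_t id;
	struct sketch_gt gts[MAX_GT_PER_TILE];
};

struct sketch_device {
	struct sketch_tile tiles[MAX_TILES];
};

/* Register a GT in a given slot of a tile and assign its device-wide ID. */
static void sketch_add_gt(struct sketch_device *dev, uint8_t tile_id,
			  uint8_t slot)
{
	struct sketch_gt *gt = &dev->tiles[tile_id].gts[slot];

	gt->id = (uint8_t)(tile_id * MAX_GT_PER_TILE + slot);
	gt->present = 1;
}

/* Map a device-wide GT ID back to its tile and slot; NULL if absent. */
static struct sketch_gt *sketch_get_gt(struct sketch_device *dev,
				       uint8_t gt_id)
{
	struct sketch_gt *gt;

	/* Note the >= here: an ID equal to the array size is out of range. */
	if (gt_id >= MAX_TILES * MAX_GT_PER_TILE)
		return NULL;

	gt = &dev->tiles[gt_id / MAX_GT_PER_TILE].gts[gt_id % MAX_GT_PER_TILE];
	return gt->present ? gt : NULL;
}
```

With such a scheme, GT ID 1 would be the second GT of tile 0 rather than
the primary GT of tile 1, which is the distinction the lookup in the
patch cannot express.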

>  	XE_BUG_ON(gt->info.id != gt_id);
>  	XE_BUG_ON(gt->info.type == XE_GT_TYPE_UNINITIALIZED);
>  
> @@ -65,7 +70,7 @@ static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
>   */
>  static inline struct xe_gt *to_gt(struct xe_device *xe)
>  {
> -	return xe->gt;
> +	return &xe_device_get_root_tile(xe)->primary_gt;
>  }
>  
>  static inline bool xe_device_guc_submission_enabled(struct xe_device *xe)
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 6490a04614ce..5dcf1695925f 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -34,7 +34,7 @@
>  
>  #define XE_GT0		0
>  #define XE_GT1		1
> -#define XE_MAX_GT	(XE_GT1 + 1)
> +#define XE_MAX_TILES_PER_DEVICE	(XE_GT1 + 1)
>  
>  #define XE_MAX_ASID	(BIT(20))
>  
> @@ -48,6 +48,40 @@
>  	 (_xe)->info.step.graphics >= (min_step) &&			\
>  	 (_xe)->info.step.graphics < (max_step))
>  
> +#define tile_to_xe(tile__)								\
> +	_Generic(tile__,								\
> +		 const struct xe_tile *: (const struct xe_device *)((tile__)->xe),	\
> +		 struct xe_tile *: (tile__)->xe)
> +
> +/**
> + * struct xe_tile - hardware tile structure
> + *
> + * From a driver perspective, a "tile" is effectively a complete GPU, containing
> + * an SGunit, 1-2 GTs, and (for discrete platforms) VRAM.
> + *
> + * Multi-tile platforms effectively bundle multiple GPUs behind a single PCI
> + * device and designate one "root" tile as being responsible for external PCI
> + * communication.  PCI BAR0 exposes the GGTT and MMIO register space for each
> + * tile in a stacked layout, and PCI BAR2 exposes the local memory associated
> + * with each tile similarly.  Device-wide interrupts can be enabled/disabled
> + * at the root tile, and the MSTR_TILE_INTR register will report which tiles
> + * have interrupts that need servicing.
> + */
> +struct xe_tile {
> +	/** @xe: Backpointer to tile's PCI device */
> +	struct xe_device *xe;
> +
> +	/** @id: ID of the tile */
> +	u8 id;
> +
> +	/**
> +	 * @primary_gt: Primary GT
> +	 */
> +	struct xe_gt primary_gt;
> +
> +	/* TODO: Add media GT here */
> +};
> +
>  /**
>   * struct xe_device - Top level struct of XE device
>   */
> @@ -248,8 +282,8 @@ struct xe_device {
>  	/** @ordered_wq: used to serialize compute mode resume */
>  	struct workqueue_struct *ordered_wq;
>  
> -	/** @gt: graphics tile */
> -	struct xe_gt gt[XE_MAX_GT];
> +	/** @tiles: device tiles */
> +	struct xe_tile tiles[XE_MAX_TILES_PER_DEVICE];
>  
>  	/**
>  	 * @mem_access: keep track of memory access in the device, possibly
> diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
> index 7c47d67aa8be..e0ed4508269b 100644
> --- a/drivers/gpu/drm/xe/xe_gt_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_types.h
> @@ -77,12 +77,17 @@ enum xe_steering_type {
>  };
>  
>  /**
> - * struct xe_gt - Top level struct of a graphics tile
> + * struct xe_gt - A "Graphics Technology" unit of the GPU
>   *
> - * A graphics tile may be a physical split (duplicate pieces of silicon,
> - * different GGTT + VRAM) or a virtual split (shared GGTT + VRAM). Either way
> - * this structure encapsulates of everything a GT is (MMIO, VRAM, memory
> - * management, microcontrols, and a hardware set of engines).
> + * A GT ("Graphics Technology") is the subset of a GPU primarily responsible
> + * for implementing the graphics and/or media IP.  It encapsulates the hardware
> + * engines, programmable execution units, and GuC.   Each GT has its own
> + * handling of power management (RC6+forcewake) and multicast register
> + * steering.
> + *
> + * A GPU/tile may have a single GT that supplies all graphics and media
> + * functionality, or the graphics and media may be split into separate GTs
> + * within a tile.
>   */
>  struct xe_gt {
>  	/** @xe: backpointer to XE device */
> diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
> index 4804616a3c44..254b4a63d901 100644
> --- a/drivers/gpu/drm/xe/xe_mmio.c
> +++ b/drivers/gpu/drm/xe/xe_mmio.c
> @@ -399,6 +399,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
>  		  struct drm_file *file)
>  {
>  	struct xe_device *xe = to_xe_device(dev);
> +	struct xe_gt *gt = xe_device_get_gt(xe, 0);
>  	struct drm_xe_mmio *args = data;
>  	unsigned int bits_flag, bytes;
>  	struct xe_reg reg;
> @@ -440,7 +441,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
>  	 */
>  	reg = XE_REG(args->addr);
>  
> -	xe_force_wake_get(gt_to_fw(&xe->gt[0]), XE_FORCEWAKE_ALL);
> +	xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>  
>  	if (args->flags & DRM_XE_MMIO_WRITE) {
>  		switch (bits_flag) {
> @@ -449,10 +450,10 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
>  				ret = -EINVAL;
>  				goto exit;
>  			}
> -			xe_mmio_write32(to_gt(xe), reg, args->value);
> +			xe_mmio_write32(gt, reg, args->value);
>  			break;
>  		case DRM_XE_MMIO_64BIT:
> -			xe_mmio_write64(to_gt(xe), reg, args->value);
> +			xe_mmio_write64(gt, reg, args->value);
>  			break;
>  		default:
>  			drm_dbg(&xe->drm, "Invalid MMIO bit size");
> @@ -467,10 +468,10 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
>  	if (args->flags & DRM_XE_MMIO_READ) {
>  		switch (bits_flag) {
>  		case DRM_XE_MMIO_32BIT:
> -			args->value = xe_mmio_read32(to_gt(xe), reg);
> +			args->value = xe_mmio_read32(gt, reg);
>  			break;
>  		case DRM_XE_MMIO_64BIT:
> -			args->value = xe_mmio_read64(to_gt(xe), reg);
> +			args->value = xe_mmio_read64(gt, reg);
>  			break;
>  		default:
>  			drm_dbg(&xe->drm, "Invalid MMIO bit size");
> @@ -482,7 +483,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
>  	}
>  
>  exit:
> -	xe_force_wake_put(gt_to_fw(&xe->gt[0]), XE_FORCEWAKE_ALL);
> +	xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>  
>  	return ret;
>  }
> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> index bf2c234c4f6e..e79b16d8bf7f 100644
> --- a/drivers/gpu/drm/xe/xe_pci.c
> +++ b/drivers/gpu/drm/xe/xe_pci.c
> @@ -525,7 +525,10 @@ static int xe_info_init(struct xe_device *xe,
>  	xe->info.step = xe_step_get(xe);
>  
>  	for (id = 0; id < xe->info.tile_count; ++id) {
> -		gt = xe->gt + id;
> +		xe->tiles[id].xe = xe;
> +		xe->tiles[id].id = id;
> +
> +		gt = &xe->tiles[id].primary_gt;
>  		gt->info.id = id;

Since GT is per tile, shouldn't its numbering also start from 0?

Thanks,
Aravind.
>  		gt->xe = xe;
>  
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 0a4becdf4675..fe6abb6ed6ca 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -3347,7 +3347,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
>  	struct xe_device *xe = vma->vm->xe;
>  	struct xe_gt *gt;
>  	u32 gt_needs_invalidate = 0;
> -	int seqno[XE_MAX_GT];
> +	int seqno[XE_MAX_TILES_PER_DEVICE];
>  	u8 id;
>  	int ret;
>  
> diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
> index fada7896867f..203ba9d946b8 100644
> --- a/drivers/gpu/drm/xe/xe_vm_types.h
> +++ b/drivers/gpu/drm/xe/xe_vm_types.h
> @@ -159,7 +159,7 @@ struct xe_vm {
>  	struct kref refcount;
>  
>  	/* engine used for (un)binding vma's */
> -	struct xe_engine *eng[XE_MAX_GT];
> +	struct xe_engine *eng[XE_MAX_TILES_PER_DEVICE];
>  
>  	/** Protects @rebind_list and the page-table structures */
>  	struct dma_resv resv;
> @@ -167,9 +167,9 @@ struct xe_vm {
>  	u64 size;
>  	struct rb_root vmas;
>  
> -	struct xe_pt *pt_root[XE_MAX_GT];
> -	struct xe_bo *scratch_bo[XE_MAX_GT];
> -	struct xe_pt *scratch_pt[XE_MAX_GT][XE_VM_MAX_LEVEL];
> +	struct xe_pt *pt_root[XE_MAX_TILES_PER_DEVICE];
> +	struct xe_bo *scratch_bo[XE_MAX_TILES_PER_DEVICE];
> +	struct xe_pt *scratch_pt[XE_MAX_TILES_PER_DEVICE][XE_VM_MAX_LEVEL];
>  
>  	/** @flags: flags for this VM, statically setup a creation time */
>  #define XE_VM_FLAGS_64K			BIT(0)


* Re: [Intel-xe] [PATCH 04/26] drm/xe: Add for_each_tile iterator
  2023-05-11  3:47 ` [Intel-xe] [PATCH 04/26] drm/xe: Add for_each_tile iterator Matt Roper
  2023-05-11 23:23   ` Lucas De Marchi
@ 2023-05-12  5:45   ` Iddamsetty, Aravind
  2023-05-12 16:28     ` Matt Roper
  1 sibling, 1 reply; 75+ messages in thread
From: Iddamsetty, Aravind @ 2023-05-12  5:45 UTC (permalink / raw)
  To: Matt Roper, intel-xe



On 11-05-2023 09:17, Matt Roper wrote:
> As we start splitting tile handling out from GT handling, we'll need to
> be able to iterate over tiles separately from GTs.  This iterator will
> be used in upcoming patches.
> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_device.h | 4 ++++
>  drivers/gpu/drm/xe/xe_pci.c    | 3 +--
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index f7acaf51a1fc..745dbb16d417 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -83,6 +83,10 @@ static inline void xe_device_guc_submission_disable(struct xe_device *xe)
>  	xe->info.enable_guc = false;
>  }
>  
> +#define for_each_tile(tile__, xe__, id__) \
> +	for ((id__) = 0; (id__) < (xe__)->info.tile_count; (id__++)) \
> +		for_each_if ((tile__) = &(xe__)->tiles[(id__)])
> +
>  #define for_each_gt(gt__, xe__, id__) \
>  	for ((id__) = 0; (id__) < (xe__)->info.tile_count; (id__++)) \
>  		for_each_if ((gt__) = xe_device_get_gt((xe__), (id__)))

As mentioned in the earlier patch, this looks like it always returns only
the primary GT.

Thanks,
Aravind.
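
To make the point above concrete, here is a minimal standalone sketch of the
two iterators as they stand at this point in the series.  The structures are
trimmed-down stand-ins and get_gt() / count_primary_gts() are hypothetical
names used only for illustration; for_each_if() is written out with the same
shape as the DRM helper so the snippet builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the driver structures. */
#define MAX_TILES 2

struct xe_gt {
	unsigned char id;
};

struct xe_tile {
	unsigned char id;
	struct xe_gt primary_gt;
};

struct xe_device {
	struct {
		unsigned char tile_count;
	} info;
	struct xe_tile tiles[MAX_TILES];
};

/* Same shape as the DRM helper: keeps the macro safe against dangling else. */
#define for_each_if(condition) if (!(condition)) {} else

#define for_each_tile(tile__, xe__, id__) \
	for ((id__) = 0; (id__) < (xe__)->info.tile_count; (id__)++) \
		for_each_if((tile__) = &(xe__)->tiles[(id__)])

/* GT lookup as in patch 02: the GT ID indexes the tile array directly. */
static struct xe_gt *get_gt(struct xe_device *xe, unsigned char gt_id)
{
	return &xe->tiles[gt_id].primary_gt;
}

#define for_each_gt(gt__, xe__, id__) \
	for ((id__) = 0; (id__) < (xe__)->info.tile_count; (id__)++) \
		for_each_if((gt__) = get_gt((xe__), (id__)))

/* Counts the GTs visited by for_each_gt that are some tile's primary GT. */
static int count_primary_gts(struct xe_device *xe)
{
	struct xe_tile *tile;
	struct xe_gt *gt;
	unsigned char gt_id, tile_id;
	int n = 0;

	for_each_gt(gt, xe, gt_id)
		for_each_tile(tile, xe, tile_id)
			if (gt == &tile->primary_gt)
				n++;

	return n;
}
```

With two tiles, count_primary_gts() equals the tile count, i.e. every GT the
loop visits is a primary GT and nothing else is reachable through it.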
> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> index 87c328106aca..bef65d3a440e 100644
> --- a/drivers/gpu/drm/xe/xe_pci.c
> +++ b/drivers/gpu/drm/xe/xe_pci.c
> @@ -525,8 +525,7 @@ static int xe_info_init(struct xe_device *xe,
>  		subplatform_desc->subplatform : XE_SUBPLATFORM_NONE;
>  	xe->info.step = xe_step_get(xe);
>  
> -	for (id = 0; id < xe->info.tile_count; ++id) {
> -		tile = &xe->tiles[id];
> +	for_each_tile(tile, xe, id) {
>  		tile->xe = xe;
>  		tile->id = id;
>  


* Re: [Intel-xe] [PATCH 02/26] drm/xe: Introduce xe_tile
  2023-05-11  3:46 ` [Intel-xe] [PATCH 02/26] drm/xe: Introduce xe_tile Matt Roper
  2023-05-11  5:46   ` Lucas De Marchi
  2023-05-12  5:33   ` Iddamsetty, Aravind
@ 2023-05-12  5:45   ` Iddamsetty, Aravind
  2023-05-18 17:35   ` Rodrigo Vivi
  3 siblings, 0 replies; 75+ messages in thread
From: Iddamsetty, Aravind @ 2023-05-12  5:45 UTC (permalink / raw)
  To: Matt Roper, intel-xe



On 11-05-2023 09:16, Matt Roper wrote:
> Create a new xe_tile structure to begin separating the concept of "tile"
> from "GT."  A tile is effectively a complete GPU, and a GT is just one
> part of that.  On platforms like MTL, there's only a single full GPU
> (tile) which has its IP blocks provided by two GTs.  In contrast, a
> "multi-tile" platform like PVC is basically multiple complete GPUs
> packed behind a single PCI device.
> 
> For now, just create xe_tile as a simple wrapper around xe_gt.  The
> items in xe_gt that are truly tied to the tile rather than the GT will
> be moved in future patches.  Support for multiple GTs per tile (i.e.,
> the MTL standalone media case) will also be re-introduced in a future
> patch.
> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_device.h       | 11 +++++---
>  drivers/gpu/drm/xe/xe_device_types.h | 40 +++++++++++++++++++++++++---
>  drivers/gpu/drm/xe/xe_gt_types.h     | 15 +++++++----
>  drivers/gpu/drm/xe/xe_mmio.c         | 13 ++++-----
>  drivers/gpu/drm/xe/xe_pci.c          |  5 +++-
>  drivers/gpu/drm/xe/xe_vm.c           |  2 +-
>  drivers/gpu/drm/xe/xe_vm_types.h     |  8 +++---
>  7 files changed, 71 insertions(+), 23 deletions(-)
> 
> 

<snip>
> +struct xe_tile {
> +	/** @xe: Backpointer to tile's PCI device */
> +	struct xe_device *xe;
> +
> +	/** @id: ID of the tile */
> +	u8 id;
> +
> +	/**
> +	 * @primary_gt: Primary GT
> +	 */
> +	struct xe_gt primary_gt;

Can we have an array of GTs, with the primary GT always at index 0 and the
others stored by ID?  That way we could properly retrieve a GT from a tile
using its GT ID, assuming we keep GT IDs local to the tile.
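
A rough sketch of that layout, purely for illustration: MAX_GTS_PER_TILE, the
gt_type enum, and the tile_get_gt() / tile_get_primary_gt() helpers are
hypothetical names, not anything from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit: one primary GT plus one media GT per tile. */
#define MAX_GTS_PER_TILE 2

enum gt_type {
	GT_TYPE_UNUSED = 0,	/* slot not populated on this platform */
	GT_TYPE_MAIN,
	GT_TYPE_MEDIA,
};

struct xe_gt {
	enum gt_type type;
	unsigned char id;	/* tile-local ID in this sketch */
};

struct xe_tile {
	unsigned char id;
	/* Slot 0 is always the primary GT; other slots are indexed by GT ID. */
	struct xe_gt gts[MAX_GTS_PER_TILE];
};

/* Returns the GT with the given tile-local ID, or NULL if it is absent. */
static struct xe_gt *tile_get_gt(struct xe_tile *tile, unsigned char gt_id)
{
	if (gt_id >= MAX_GTS_PER_TILE)
		return NULL;
	if (tile->gts[gt_id].type == GT_TYPE_UNUSED)
		return NULL;

	return &tile->gts[gt_id];
}

/* The primary GT is simply the GT with tile-local ID 0. */
static struct xe_gt *tile_get_primary_gt(struct xe_tile *tile)
{
	return tile_get_gt(tile, 0);
}
```

The lookup then never has to reinterpret a GT ID as a tile index, and a
missing media GT shows up as NULL rather than as the wrong GT.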

Thanks,
Aravind.
> +
> +	/* TODO: Add media GT here */
> +};
> +


* [Intel-xe] ✓ CI.Patch_applied: success for Separate GT and tile (rev2)
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (29 preceding siblings ...)
  2023-05-11  7:10 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork
@ 2023-05-12  7:21 ` Patchwork
  2023-05-12  7:23 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork
                   ` (3 subsequent siblings)
  34 siblings, 0 replies; 75+ messages in thread
From: Patchwork @ 2023-05-12  7:21 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

== Series Details ==

Series: Separate GT and tile (rev2)
URL   : https://patchwork.freedesktop.org/series/117614/
State : success

== Summary ==

=== Applying kernel patches on branch 'drm-xe-next' with base: ===
Base commit: bf565615b fixup! drm/xe/display: Implement display support
=== git am output follows ===
Applying: drm/xe/mtl: Disable media GT
Applying: drm/xe: Introduce xe_tile
Applying: drm/xe: Add backpointer from gt to tile
Applying: drm/xe: Add for_each_tile iterator
Applying: drm/xe: Move register MMIO into xe_tile
Applying: drm/xe: Move VRAM from GT to tile
Applying: drm/xe: Memory allocations are tile-based, not GT-based
Applying: drm/xe: Move migration from GT to tile
Applying: drm/xe: Clarify 'gt' retrieval for primary tile
Applying: drm/xe: Drop vram_id
Applying: drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE
Applying: drm/xe: Allocate GT dynamically
Applying: drm/xe: Add media GT to tile
Applying: drm/xe: Move display IRQ postinstall out of GT function
Applying: drm/xe: Interrupts are delivered per-tile, not per-GT
Applying: drm/xe/irq: Handle ASLE backlight interrupts at same time as display
Applying: drm/xe/irq: Actually call xe_irq_postinstall()
Applying: drm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt mask
Applying: drm/xe/irq: Untangle postinstall functions
Applying: drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe
Applying: drm/xe: Invalidate TLB on all affected GTs during GGTT updates
Applying: drm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations
Applying: drm/xe: Allow GT looping and lookup on standalone media
Applying: drm/xe: Update query uapi to support standalone media
Applying: drm/xe: Reinstate media GT support
Applying: drm/xe: Clarify source of GT log messages




* [Intel-xe] ✗ CI.KUnit: failure for Separate GT and tile (rev2)
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (30 preceding siblings ...)
  2023-05-12  7:21 ` [Intel-xe] ✓ CI.Patch_applied: success " Patchwork
@ 2023-05-12  7:23 ` Patchwork
  2023-05-15 13:08 ` [Intel-xe] [PATCH 00/26] Separate GT and tile Thomas Hellström
                   ` (2 subsequent siblings)
  34 siblings, 0 replies; 75+ messages in thread
From: Patchwork @ 2023-05-12  7:23 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

== Series Details ==

Series: Separate GT and tile (rev2)
URL   : https://patchwork.freedesktop.org/series/117614/
State : failure

== Summary ==

+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
ERROR:root:../drivers/gpu/drm/xe/tests/xe_rtp_test.c: In function ‘xe_rtp_process_tests’:
../drivers/gpu/drm/xe/tests/xe_rtp_test.c:239:32: error: ‘struct xe_device’ has no member named ‘gt’
  239 |  struct xe_reg_sr *reg_sr = &xe->gt[0].reg_sr;
      |                                ^~
../drivers/gpu/drm/xe/tests/xe_rtp_test.c:244:44: error: ‘struct xe_device’ has no member named ‘gt’
  244 |  xe_rtp_process(param->entries, reg_sr, &xe->gt[0], NULL);
      |                                            ^~
make[7]: *** [../scripts/Makefile.build:252: drivers/gpu/drm/xe/tests/xe_rtp_test.o] Error 1
make[7]: *** Waiting for unfinished jobs....
make[6]: *** [../scripts/Makefile.build:494: drivers/gpu/drm/xe/tests] Error 2
make[6]: *** Waiting for unfinished jobs....
make[5]: *** [../scripts/Makefile.build:494: drivers/gpu/drm/xe] Error 2
make[5]: *** Waiting for unfinished jobs....
make[4]: *** [../scripts/Makefile.build:494: drivers/gpu/drm] Error 2
make[3]: *** [../scripts/Makefile.build:494: drivers/gpu] Error 2
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [../scripts/Makefile.build:494: drivers] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [/kernel/Makefile:2025: .] Error 2
make: *** [Makefile:226: __sub-make] Error 2

[07:23:24] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[07:23:28] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make ARCH=um O=.kunit --jobs=48
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel
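
The two compile errors above come from the test still dereferencing the
removed xe->gt[] array.  The thread does not show how the test was eventually
adapted; the sketch below is only one plausible shape, assuming the test wants
the root tile's primary GT.  The structures are trimmed-down stand-ins and
test_get_gt() / test_get_reg_sr() are hypothetical helpers;
xe_device_get_root_tile() mirrors the helper added in patch 02.

```c
#include <assert.h>
#include <stddef.h>

/* Trimmed-down stand-ins for the driver structures (illustration only). */
struct xe_reg_sr {
	int placeholder;
};

struct xe_gt {
	struct xe_reg_sr reg_sr;
};

struct xe_tile {
	struct xe_gt primary_gt;
};

struct xe_device {
	struct xe_tile tiles[2];
};

/* Mirrors the helper introduced in patch 02 of the series. */
static struct xe_tile *xe_device_get_root_tile(struct xe_device *xe)
{
	return &xe->tiles[0];
}

/* Hypothetical test-side lookup replacing "&xe->gt[0]". */
static struct xe_gt *test_get_gt(struct xe_device *xe)
{
	return &xe_device_get_root_tile(xe)->primary_gt;
}

/* Hypothetical test-side lookup replacing "&xe->gt[0].reg_sr". */
static struct xe_reg_sr *test_get_reg_sr(struct xe_device *xe)
{
	return &test_get_gt(xe)->reg_sr;
}
```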




* Re: [Intel-xe] [PATCH 01/26] drm/xe/mtl: Disable media GT
  2023-05-11 23:29   ` Lucas De Marchi
@ 2023-05-12 15:38     ` Matt Roper
  0 siblings, 0 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-12 15:38 UTC (permalink / raw)
  To: Lucas De Marchi; +Cc: intel-xe

On Thu, May 11, 2023 at 04:29:55PM -0700, Lucas De Marchi wrote:
> On Wed, May 10, 2023 at 08:46:57PM -0700, Matt Roper wrote:
> > Xe incorrectly conflates the concept of 'tile' and 'GT.'  Since MTL's
> > media support is not yet functioning properly, let's just disable it
> > completely for now while we fix the fundamental driver design.  Support
> > for media GTs on platforms like MTL will be re-added later.
> > 
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_mcr.c |  2 +-
> > drivers/gpu/drm/xe/xe_mmio.c   |  2 --
> > drivers/gpu/drm/xe/xe_pci.c    | 15 ++-------------
> > 3 files changed, 3 insertions(+), 16 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_gt_mcr.c b/drivers/gpu/drm/xe/xe_gt_mcr.c
> > index 3db550c85e32..be80fdc4b5a2 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_mcr.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_mcr.c
> > @@ -293,7 +293,7 @@ void xe_gt_mcr_init(struct xe_gt *gt)
> > 
> > 	spin_lock_init(&gt->mcr_lock);
> > 
> > -	if (gt->info.type == XE_GT_TYPE_MEDIA) {
> > +	if (xe_gt_is_media_type(gt)) {
> 
> was there a squashing issue here?  if media GT is being removed,
> why are you replacing the gt type check xe_gt_is_media_type()?
> Shouldn't you then remove xe_gt_is_media_type() is all this branch here
> since it's for MEDIA_VER(xe) >= 13?
> 
> or just leave this branch as is...

Yeah, I think it's more a matter of me changing directions while I was
working on this and not completely removing all traces of the original
attempt.  In the next version I'll drop out the unnecessary changes (and
maybe add them back as a later patch, since they are a legitimate
cleanup on their own).


Matt

> 
> > 		drm_WARN_ON(&xe->drm, MEDIA_VER(xe) < 13);
> > 
> > 		gt->steering[OADDRM].ranges = xelpmp_oaddrm_steering_table;
> > diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
> > index c7fbb1cc1f64..4804616a3c44 100644
> > --- a/drivers/gpu/drm/xe/xe_mmio.c
> > +++ b/drivers/gpu/drm/xe/xe_mmio.c
> > @@ -301,8 +301,6 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
> > 	mtcfg = xe_mmio_read64(gt, XEHP_MTCFG_ADDR);
> > 	adj_tile_count = xe->info.tile_count =
> > 		REG_FIELD_GET(TILE_COUNT, mtcfg) + 1;
> > -	if (xe->info.media_verx100 >= 1300)
> > -		xe->info.tile_count *= 2;
> > 
> > 	drm_info(&xe->drm, "tile_count: %d, adj_tile_count %d\n",
> > 		 xe->info.tile_count, adj_tile_count);
> > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > index a6858fc7fe8d..bf2c234c4f6e 100644
> > --- a/drivers/gpu/drm/xe/xe_pci.c
> > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > @@ -19,6 +19,7 @@
> > #include "xe_device.h"
> > #include "xe_display.h"
> > #include "xe_drv.h"
> > +#include "xe_gt.h"
> > #include "xe_macros.h"
> > #include "xe_module.h"
> > #include "xe_pci_types.h"
> > @@ -271,20 +272,10 @@ static const struct xe_device_desc pvc_desc = {
> > 	.extra_gts = pvc_gts,
> > };
> > 
> > -static const struct xe_gt_desc xelpmp_gts[] = {
> > -	{
> > -		.type = XE_GT_TYPE_MEDIA,
> > -		.vram_id = 0,
> > -		.mmio_adj_limit = 0x40000,
> > -		.mmio_adj_offset = 0x380000,
> > -	},
> > -};
> > -
> > static const struct xe_device_desc mtl_desc = {
> > 	/* .graphics and .media determined via GMD_ID */
> > 	.require_force_probe = true,
> > 	PLATFORM(XE_METEORLAKE),
> > -	.extra_gts = xelpmp_gts,
> > };
> > 
> > #undef PLATFORM
> > @@ -528,8 +519,6 @@ static int xe_info_init(struct xe_device *xe,
> > 	 * treats it as the number of GTs rather than just the number of tiles.
> > 	 */
> > 	xe->info.tile_count = 1 + graphics_desc->max_remote_tiles;
> > -	if (MEDIA_VER(xe) >= 13)
> > -		xe->info.tile_count++;
> > 
> > 	xe->info.subplatform = subplatform_desc ?
> > 		subplatform_desc->subplatform : XE_SUBPLATFORM_NONE;
> > @@ -553,7 +542,7 @@ static int xe_info_init(struct xe_device *xe,
> > 		} else {
> > 			gt->info.type = desc->extra_gts[id - 1].type;
> > 			gt->info.vram_id = desc->extra_gts[id - 1].vram_id;
> > -			gt->info.__engine_mask = (gt->info.type == XE_GT_TYPE_MEDIA) ?
> > +			gt->info.__engine_mask = xe_gt_is_media_type(gt) ?
> 
> same thing here. /me confused
> 
> The rest of the patch makes sense and we could leave these lines alone.
> It seems the main goal of this patch is actually to free up the
> tile_count to be used with proper tile in later patches rather than gt.
> 
> Lucas De Marchi
> 
> > 				media_desc->hw_engine_mask :
> > 				graphics_desc->hw_engine_mask;
> > 			gt->mmio.adj_limit =
> > -- 
> > 2.40.0
> > 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


* Re: [Intel-xe] [PATCH 03/26] drm/xe: Add backpointer from gt to tile
  2023-05-12  0:07   ` Lucas De Marchi
@ 2023-05-12 16:20     ` Matt Roper
  2023-05-12 16:31       ` Matt Atwood
  0 siblings, 1 reply; 75+ messages in thread
From: Matt Roper @ 2023-05-12 16:20 UTC (permalink / raw)
  To: Lucas De Marchi; +Cc: intel-xe

On Thu, May 11, 2023 at 05:07:46PM -0700, Lucas De Marchi wrote:
> On Wed, May 10, 2023 at 08:46:59PM -0700, Matt Roper wrote:
> > Rather than a backpointer to the xe_device, a GT should have a
> > backpointer to its tile (which can then be used to lookup the device if
> > necessary).
> > 
> > The gt_to_xe() helper macro (which moves from xe_gt.h to xe_gt_types.h)
> > can and should still be used to jump directly from an xe_gt to
> > xe_device.
> > 
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_bb.c                  |  2 +-
> > drivers/gpu/drm/xe/xe_gt.h                  |  5 -----
> > drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |  4 ++--
> > drivers/gpu/drm/xe/xe_gt_types.h            | 14 ++++++++++++--
> > drivers/gpu/drm/xe/xe_mocs.c                | 14 +++++++-------
> > drivers/gpu/drm/xe/xe_pci.c                 | 11 +++++++----
> > drivers/gpu/drm/xe/xe_pt.c                  |  2 +-
> > 7 files changed, 30 insertions(+), 22 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_bb.c b/drivers/gpu/drm/xe/xe_bb.c
> > index 3deb2d55f421..bf7c94b769d7 100644
> > --- a/drivers/gpu/drm/xe/xe_bb.c
> > +++ b/drivers/gpu/drm/xe/xe_bb.c
> > @@ -16,7 +16,7 @@
> > 
> > static int bb_prefetch(struct xe_gt *gt)
> > {
> > -	struct xe_device *xe = gt->xe;
> > +	struct xe_device *xe = gt_to_xe(gt);
> > 
> > 	if (GRAPHICS_VERx100(xe) >= 1250 && !xe_gt_is_media_type(gt))
> > 		/*
> > diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
> > index 086369f7ee6d..f4e98f499b36 100644
> > --- a/drivers/gpu/drm/xe/xe_gt.h
> > +++ b/drivers/gpu/drm/xe/xe_gt.h
> > @@ -49,11 +49,6 @@ static inline bool xe_gt_is_media_type(struct xe_gt *gt)
> > 	return gt->info.type == XE_GT_TYPE_MEDIA;
> > }
> > 
> > -#define gt_to_xe(gt__)								\
> > -	_Generic(gt__,								\
> > -		 const struct xe_gt *: (const struct xe_device *)((gt__)->xe),	\
> > -		 struct xe_gt *: (gt__)->xe)
> > -
> > static inline bool xe_gt_is_usm_hwe(struct xe_gt *gt, struct xe_hw_engine *hwe)
> > {
> > 	struct xe_device *xe = gt_to_xe(gt);
> > diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > index c815a42e2cdb..c9e8825c02aa 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > @@ -322,8 +322,8 @@ int xe_guc_tlb_invalidation_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
> > 		TLB_INVALIDATION_SEQNO_MAX;
> > 	if (!expected_seqno)
> > 		expected_seqno = 1;
> > -	if (drm_WARN_ON(&gt->xe->drm, expected_seqno != msg[0])) {
> > -		drm_err(&gt->xe->drm, "TLB expected_seqno(%d) != msg(%u)\n",
> > +	if (drm_WARN_ON(&gt_to_xe(gt)->drm, expected_seqno != msg[0])) {
> > +		drm_err(&gt_to_xe(gt)->drm, "TLB expected_seqno(%d) != msg(%u)\n",
> > 			expected_seqno, msg[0]);
> > 	}
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
> > index e0ed4508269b..c4376d50786b 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_types.h
> > @@ -76,6 +76,16 @@ enum xe_steering_type {
> > 	NUM_STEERING_TYPES
> > };
> > 
> > +#define gt_to_tile(gt__)							\
> > +	_Generic(gt__,								\
> > +		 const struct xe_gt *: (const struct xe_tile *)((gt__)->tile),	\
> > +		 struct xe_gt *: (gt__)->tile)
> > +
> > +#define gt_to_xe(gt__)										\
> > +	_Generic(gt__,										\
> > +		 const struct xe_gt *: (const struct xe_device *)(gt_to_tile(gt__)->xe),	\
> > +		 struct xe_gt *: gt_to_tile(gt__)->xe)
> > +
> > /**
> >  * struct xe_gt - A "Graphics Technology" unit of the GPU
> >  *
> > @@ -90,8 +100,8 @@ enum xe_steering_type {
> >  * within a tile.
> >  */
> > struct xe_gt {
> > -	/** @xe: backpointer to XE device */
> > -	struct xe_device *xe;
> > +	/** @tile: Backpointer to GT's tile */
> > +	struct xe_tile *tile;
> > 
> > 	/** @info: GT info */
> > 	struct {
> > diff --git a/drivers/gpu/drm/xe/xe_mocs.c b/drivers/gpu/drm/xe/xe_mocs.c
> > index 817afd301d52..d57fbf16a3ef 100644
> > --- a/drivers/gpu/drm/xe/xe_mocs.c
> > +++ b/drivers/gpu/drm/xe/xe_mocs.c
> > @@ -471,7 +471,7 @@ static void __init_mocs_table(struct xe_gt *gt,
> > 	unsigned int i;
> > 	u32 mocs;
> > 
> > -	mocs_dbg(&gt->xe->drm, "entries:%d\n", info->n_entries);
> > +	mocs_dbg(&gt_to_xe(gt)->drm, "entries:%d\n", info->n_entries);
> > 	drm_WARN_ONCE(&xe->drm, !info->unused_entries_index,
> > 		      "Unused entries index should have been defined\n");
> > 	for (i = 0;
> > @@ -479,7 +479,7 @@ static void __init_mocs_table(struct xe_gt *gt,
> > 	     i++) {
> > 		struct xe_reg reg = XE_REG(addr + i * 4);
> > 
> > -		mocs_dbg(&gt->xe->drm, "%d 0x%x 0x%x\n", i, reg.addr, mocs);
> > +		mocs_dbg(&gt_to_xe(gt)->drm, "%d 0x%x 0x%x\n", i, reg.addr, mocs);
> > 		xe_mmio_write32(gt, reg, mocs);
> > 	}
> > }
> > @@ -508,13 +508,13 @@ static void init_l3cc_table(struct xe_gt *gt,
> > 	unsigned int i;
> > 	u32 l3cc;
> > 
> > -	mocs_dbg(&gt->xe->drm, "entries:%d\n", info->n_entries);
> > +	mocs_dbg(&gt_to_xe(gt)->drm, "entries:%d\n", info->n_entries);
> > 	for (i = 0;
> > 	     i < (info->n_entries + 1) / 2 ?
> > 	     (l3cc = l3cc_combine(get_entry_l3cc(info, 2 * i),
> > 				  get_entry_l3cc(info, 2 * i + 1))), 1 : 0;
> > 	     i++) {
> > -		mocs_dbg(&gt->xe->drm, "%d 0x%x 0x%x\n", i, LNCFCMOCS(i).addr,
> > +		mocs_dbg(&gt_to_xe(gt)->drm, "%d 0x%x 0x%x\n", i, LNCFCMOCS(i).addr,
> > 			 l3cc);
> > 		xe_mmio_write32(gt, LNCFCMOCS(i), l3cc);
> > 	}
> > @@ -524,7 +524,7 @@ void xe_mocs_init_early(struct xe_gt *gt)
> > {
> > 	struct xe_mocs_info table;
> > 
> > -	get_mocs_settings(gt->xe, &table);
> > +	get_mocs_settings(gt_to_xe(gt), &table);
> > 	gt->mocs.uc_index = table.uc_index;
> > 	gt->mocs.wb_index = table.wb_index;
> > }
> > @@ -537,8 +537,8 @@ void xe_mocs_init(struct xe_gt *gt)
> > 	/*
> > 	 * LLC and eDRAM control values are not applicable to dgfx
> > 	 */
> > -	flags = get_mocs_settings(gt->xe, &table);
> > -	mocs_dbg(&gt->xe->drm, "flag:0x%x\n", flags);
> > +	flags = get_mocs_settings(gt_to_xe(gt), &table);
> > +	mocs_dbg(&gt_to_xe(gt)->drm, "flag:0x%x\n", flags);
> > 
> > 	if (flags & HAS_GLOBAL_MOCS)
> > 		__init_mocs_table(gt, &table, GLOBAL_MOCS(0).addr);
> > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > index e79b16d8bf7f..87c328106aca 100644
> > --- a/drivers/gpu/drm/xe/xe_pci.c
> > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > @@ -471,6 +471,7 @@ static int xe_info_init(struct xe_device *xe,
> > {
> > 	const struct xe_graphics_desc *graphics_desc = NULL;
> > 	const struct xe_media_desc *media_desc = NULL;
> > +	struct xe_tile *tile;
> > 	struct xe_gt *gt;
> > 	u8 id;
> > 
> > @@ -525,13 +526,15 @@ static int xe_info_init(struct xe_device *xe,
> > 	xe->info.step = xe_step_get(xe);
> > 
> > 	for (id = 0; id < xe->info.tile_count; ++id) {
> > -		xe->tiles[id].xe = xe;
> > -		xe->tiles[id].id = id;
> > +		tile = &xe->tiles[id];
> > +		tile->xe = xe;
> > +		tile->id = id;
> > 
> > -		gt = &xe->tiles[id].primary_gt;
> > +		gt = &tile->primary_gt;
> 
> these seem to have been squash in the wrong patch, changing the style
> from using xe->tiles[i].* to declaring a local var. Maybe squash it in
> the previous one to avoid the back and forth here?
> 
> otherwise,
> 
> 
> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
> 
> > 		gt->info.id = id;
> > -		gt->xe = xe;
> > +		gt->tile = tile;
> > 
> > +		gt->info.id = id;
> 
> the gt than has the id of the tile? Because right now there can only be
> primary_gt? Looks odd, but it's a consequence of removing the only
> platform with 1 tile and multiple gts.

Yeah, deciding how we want to number GTs is a bit of an open question.
Right now at the end of this series, the numbering is

 * PVC:  0 = root tile primary, 1 = remote tile primary
 * MTL:  0 = root tile primary, 1 = root tile media
 * everything else:  only has GT0

Some day we may have a platform with multiple tiles and separate
graphics/media GTs within each tile.  In that case would all graphics
GTs get even numbers and all media GTs get odd?  Or would all primary
GTs get gt_id = tile_id and all media GTs get gt_id = tile_id +
tile_count?

The bigger open question is whether uapi should still expose GTs (like
it does today) or whether it should be converted to just expose tiles
and keep the graphics vs media separation an internal implementation
detail.  If uapi changes to only expose tiles, then we probably don't
even need IDs for the GTs...
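
For reference, the two candidate numbering schemes could be written down
roughly as follows.  This is only a sketch of the arithmetic; the function
names are made up for illustration and nothing like this exists in the
series.

```c
#include <assert.h>

/*
 * Scheme A (hypothetical): interleaved numbering.
 * Primary GTs get even IDs, media GTs get odd IDs.
 */
static unsigned int gt_id_interleaved(unsigned int tile_id, int is_media)
{
	return 2 * tile_id + (is_media ? 1 : 0);
}

/*
 * Scheme B (hypothetical): grouped numbering.
 * A primary GT's ID equals its tile ID; media GT IDs follow all primaries.
 */
static unsigned int gt_id_grouped(unsigned int tile_id,
				  unsigned int tile_count, int is_media)
{
	return tile_id + (is_media ? tile_count : 0);
}
```

Both schemes give MTL (one tile) primary = 0 and media = 1.  They differ for
a two-tile device: grouped numbering keeps the remote tile's primary GT at 1,
as listed above for PVC, while interleaved numbering would put it at 2.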


Matt

> 
> Lucas De Marchi
> 
> > 		if (id == 0) {
> > 			gt->info.type = XE_GT_TYPE_MAIN;
> > 			gt->info.vram_id = id;
> > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > index f15282996c3b..61126cefe0b5 100644
> > --- a/drivers/gpu/drm/xe/xe_pt.c
> > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > @@ -695,7 +695,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
> > 		 * TODO: Suballocate the pt bo to avoid wasting a lot of
> > 		 * memory.
> > 		 */
> > -		if (GRAPHICS_VERx100(xe_walk->gt->xe) >= 1250 && level == 1 &&
> > +		if (GRAPHICS_VERx100(gt_to_xe(xe_walk->gt)) >= 1250 && level == 1 &&
> > 		    covers && xe_pt_scan_64K(addr, next, xe_walk)) {
> > 			walk->shifts = xe_compact_pt_shifts;
> > 			flags |= XE_PDE_64K;
> > -- 
> > 2.40.0
> > 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


* Re: [Intel-xe] [PATCH 02/26] drm/xe: Introduce xe_tile
  2023-05-12  5:33   ` Iddamsetty, Aravind
@ 2023-05-12 16:27     ` Matt Roper
  0 siblings, 0 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-12 16:27 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: intel-xe

On Fri, May 12, 2023 at 11:03:22AM +0530, Iddamsetty, Aravind wrote:
> 
> 
> On 11-05-2023 09:16, Matt Roper wrote:
> > Create a new xe_tile structure to begin separating the concept of "tile"
> > from "GT."  A tile is effectively a complete GPU, and a GT is just one
> > part of that.  On platforms like MTL, there's only a single full GPU
> > (tile) which has its IP blocks provided by two GTs.  In contrast, a
> > "multi-tile" platform like PVC is basically multiple complete GPUs
> > packed behind a single PCI device.
> > 
> > For now, just create xe_tile as a simple wrapper around xe_gt.  The
> > items in xe_gt that are truly tied to the tile rather than the GT will
> > be moved in future patches.  Support for multiple GTs per tile (i.e.,
> > the MTL standalone media case) will also be re-introduced in a future
> > patch.
> > 
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_device.h       | 11 +++++---
> >  drivers/gpu/drm/xe/xe_device_types.h | 40 +++++++++++++++++++++++++---
> >  drivers/gpu/drm/xe/xe_gt_types.h     | 15 +++++++----
> >  drivers/gpu/drm/xe/xe_mmio.c         | 13 ++++-----
> >  drivers/gpu/drm/xe/xe_pci.c          |  5 +++-
> >  drivers/gpu/drm/xe/xe_vm.c           |  2 +-
> >  drivers/gpu/drm/xe/xe_vm_types.h     |  8 +++---
> >  7 files changed, 71 insertions(+), 23 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> > index cbae480a2092..f7acaf51a1fc 100644
> > --- a/drivers/gpu/drm/xe/xe_device.h
> > +++ b/drivers/gpu/drm/xe/xe_device.h
> > @@ -48,12 +48,17 @@ static inline struct xe_file *to_xe_file(const struct drm_file *file)
> >  	return file->driver_priv;
> >  }
> >  
> > +static inline struct xe_tile *xe_device_get_root_tile(struct xe_device *xe)
> > +{
> > +	return &xe->tiles[0];
> > +}
> > +
> >  static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
> >  {
> >  	struct xe_gt *gt;
> >  
> > -	XE_BUG_ON(gt_id > XE_MAX_GT);
> > -	gt = xe->gt + gt_id;
> > +	XE_BUG_ON(gt_id > XE_MAX_TILES_PER_DEVICE);
> 
> Why do we expect the number of GTs to be less than the number of tiles?

Patch #1 of the series completely disabled media GT support for MTL
since it didn't work at all.  Easier to clear out the rubble, do a bunch
of fundamental refactoring, and then re-add the media GT support at the
end of the series once all of the prerequisite design is in place.

> > +	gt = &xe->tiles[gt_id].primary_gt;
> 
> Somehow this doesn't look correct to me; if GT is per tile, using GT_ID
> to reference the tile might not be right.
> 
> also through this routine we always return primary_gt but the GT ID can
> correspond to other GTs as well.

So at this point in the series, there's only a single primary GT per
tile.  This code will change later as other refactors land and we
eventually re-add media GT support.

> 
> please check below.
> 
...
> > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > index bf2c234c4f6e..e79b16d8bf7f 100644
> > --- a/drivers/gpu/drm/xe/xe_pci.c
> > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > @@ -525,7 +525,10 @@ static int xe_info_init(struct xe_device *xe,
> >  	xe->info.step = xe_step_get(xe);
> >  
> >  	for (id = 0; id < xe->info.tile_count; ++id) {
> > -		gt = xe->gt + id;
> > +		xe->tiles[id].xe = xe;
> > +		xe->tiles[id].id = id;
> > +
> > +		gt = &xe->tiles[id].primary_gt;
> >  		gt->info.id = id;
> 
> since GT is per tile, shouldn't its numbering also start from 0?
> 
> Thanks,
> Aravind.

At the moment GTs are still exposed at the UAPI level, so each GT still
needs a unique ID.  At the end of this series you'd have:

 * PVC:  0 = root tile primary, 1 = remote tile primary
 * MTL:  0 = root tile primary, 1 = root tile media
 * everything else:  only has GT0

We might have platforms in the future that are both multi-tile and have
separate graphics/media GTs, so it's undecided how we'll number those
cases.  We could go primary-media-primary-media or
primary-primary-media-media.  Although there's a larger discussion here
too --- do we actually still want to expose GTs through the uapi or
should the UAPI be updated to only expose tiles (and keep the graphics
vs media separation an internal kernel driver detail that userspace
doesn't need to worry about)?  If userspace stops seeing the individual
GTs and only sees the tiles, we may be able to eliminate GT IDs
entirely.
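
To make the two candidate orderings concrete, here is a rough sketch.
It is illustrative only and not code from the series; gt_kind,
gt_id_interleaved() and gt_id_grouped() are made-up names.

```c
#include <assert.h>

/* Hypothetical GT kinds within one tile; not the driver's enum. */
enum gt_kind { GT_PRIMARY = 0, GT_MEDIA = 1 };

/*
 * Scheme A, "primary-media-primary-media": the GTs of a tile are
 * adjacent, so primary GTs get even IDs and media GTs get odd IDs.
 */
static unsigned int gt_id_interleaved(unsigned int tile_id, enum gt_kind kind)
{
	return 2 * tile_id + (unsigned int)kind;
}

/*
 * Scheme B, "primary-primary-media-media": a primary GT's ID equals its
 * tile ID, and the media GT IDs follow after all of the primaries.
 */
static unsigned int gt_id_grouped(unsigned int tile_id, enum gt_kind kind,
				  unsigned int tile_count)
{
	assert(tile_id < tile_count);
	return tile_id + (kind == GT_MEDIA ? tile_count : 0);
}
```

Both schemes give 0 and 1 for a single-tile device with a media GT,
which matches the MTL row above; they only diverge once there is more
than one tile.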


Matt

> >  		gt->xe = xe;
> >  
> > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > index 0a4becdf4675..fe6abb6ed6ca 100644
> > --- a/drivers/gpu/drm/xe/xe_vm.c
> > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > @@ -3347,7 +3347,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
> >  	struct xe_device *xe = vma->vm->xe;
> >  	struct xe_gt *gt;
> >  	u32 gt_needs_invalidate = 0;
> > -	int seqno[XE_MAX_GT];
> > +	int seqno[XE_MAX_TILES_PER_DEVICE];
> >  	u8 id;
> >  	int ret;
> >  
> > diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
> > index fada7896867f..203ba9d946b8 100644
> > --- a/drivers/gpu/drm/xe/xe_vm_types.h
> > +++ b/drivers/gpu/drm/xe/xe_vm_types.h
> > @@ -159,7 +159,7 @@ struct xe_vm {
> >  	struct kref refcount;
> >  
> >  	/* engine used for (un)binding vma's */
> > -	struct xe_engine *eng[XE_MAX_GT];
> > +	struct xe_engine *eng[XE_MAX_TILES_PER_DEVICE];
> >  
> >  	/** Protects @rebind_list and the page-table structures */
> >  	struct dma_resv resv;
> > @@ -167,9 +167,9 @@ struct xe_vm {
> >  	u64 size;
> >  	struct rb_root vmas;
> >  
> > -	struct xe_pt *pt_root[XE_MAX_GT];
> > -	struct xe_bo *scratch_bo[XE_MAX_GT];
> > -	struct xe_pt *scratch_pt[XE_MAX_GT][XE_VM_MAX_LEVEL];
> > +	struct xe_pt *pt_root[XE_MAX_TILES_PER_DEVICE];
> > +	struct xe_bo *scratch_bo[XE_MAX_TILES_PER_DEVICE];
> > +	struct xe_pt *scratch_pt[XE_MAX_TILES_PER_DEVICE][XE_VM_MAX_LEVEL];
> >  
> >  	/** @flags: flags for this VM, statically setup a creation time */
> >  #define XE_VM_FLAGS_64K			BIT(0)

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

* Re: [Intel-xe] [PATCH 04/26] drm/xe: Add for_each_tile iterator
  2023-05-12  5:45   ` Iddamsetty, Aravind
@ 2023-05-12 16:28     ` Matt Roper
  0 siblings, 0 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-12 16:28 UTC (permalink / raw)
  To: Iddamsetty, Aravind; +Cc: intel-xe

On Fri, May 12, 2023 at 11:15:08AM +0530, Iddamsetty, Aravind wrote:
> 
> 
> On 11-05-2023 09:17, Matt Roper wrote:
> > As we start splitting tile handling out from GT handling, we'll need to
> > be able to iterate over tiles separately from GTs.  This iterator will
> > be used in upcoming patches.
> > 
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_device.h | 4 ++++
> >  drivers/gpu/drm/xe/xe_pci.c    | 3 +--
> >  2 files changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> > index f7acaf51a1fc..745dbb16d417 100644
> > --- a/drivers/gpu/drm/xe/xe_device.h
> > +++ b/drivers/gpu/drm/xe/xe_device.h
> > @@ -83,6 +83,10 @@ static inline void xe_device_guc_submission_disable(struct xe_device *xe)
> >  	xe->info.enable_guc = false;
> >  }
> >  
> > +#define for_each_tile(tile__, xe__, id__) \
> > +	for ((id__) = 0; (id__) < (xe__)->info.tile_count; (id__++)) \
> > +		for_each_if ((tile__) = &(xe__)->tiles[(id__)])
> > +
> >  #define for_each_gt(gt__, xe__, id__) \
> >  	for ((id__) = 0; (id__) < (xe__)->info.tile_count; (id__++)) \
> >  		for_each_if ((gt__) = xe_device_get_gt((xe__), (id__)))
> 
> as mentioned in the earlier patch, this looks like it always returns the primary GT only

Correct; primary GT is the only GT that exists at this point in the
series.  Patch #23 of the series updates this code to prepare for the
return of media GTs, and patch #25 finally re-adds the media GT support.
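
For anyone who wants to try the iterator pattern outside the kernel,
below is a rough standalone sketch with mock types.  mock_device,
mock_tile, MAX_TILES and mock_init_tiles() are made up for illustration,
and for_each_if is redefined locally with the same shape as the DRM
helper so the block builds on its own.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TILES 2	/* stand-in for XE_MAX_TILES_PER_DEVICE */

struct mock_device;

/* Mock types carrying only the fields the iterator touches. */
struct mock_tile {
	struct mock_device *xe;
	unsigned char id;
};

struct mock_device {
	struct { unsigned char tile_count; } info;
	struct mock_tile tiles[MAX_TILES];
};

/* Same shape as the DRM helper: run the body only if condition holds. */
#define for_each_if(condition) if (!(condition)) {} else

#define for_each_tile(tile__, xe__, id__) \
	for ((id__) = 0; (id__) < (xe__)->info.tile_count; (id__)++) \
		for_each_if((tile__) = &(xe__)->tiles[(id__)])

/* Mirrors the xe_info_init() loop: set each tile's backpointer and ID. */
static unsigned int mock_init_tiles(struct mock_device *xe)
{
	struct mock_tile *tile;
	unsigned char id;
	unsigned int visited = 0;

	for_each_tile(tile, xe, id) {
		tile->xe = xe;
		tile->id = id;
		visited++;
	}

	return visited;
}
```

The for_each_if wrapper is what lets the macro be followed by a normal
braced body (or a single statement) without dangling-else surprises.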


Matt

> 
> Thanks,
> Aravind.
> > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > index 87c328106aca..bef65d3a440e 100644
> > --- a/drivers/gpu/drm/xe/xe_pci.c
> > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > @@ -525,8 +525,7 @@ static int xe_info_init(struct xe_device *xe,
> >  		subplatform_desc->subplatform : XE_SUBPLATFORM_NONE;
> >  	xe->info.step = xe_step_get(xe);
> >  
> > -	for (id = 0; id < xe->info.tile_count; ++id) {
> > -		tile = &xe->tiles[id];
> > +	for_each_tile(tile, xe, id) {
> >  		tile->xe = xe;
> >  		tile->id = id;
> >  

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

* Re: [Intel-xe] [PATCH 03/26] drm/xe: Add backpointer from gt to tile
  2023-05-12 16:20     ` Matt Roper
@ 2023-05-12 16:31       ` Matt Atwood
  2023-05-12 17:00         ` Matt Roper
  0 siblings, 1 reply; 75+ messages in thread
From: Matt Atwood @ 2023-05-12 16:31 UTC (permalink / raw)
  To: Matt Roper, lucas.demarchi, intel-xe; +Cc: Lucas De Marchi, intel-xe

On Fri, May 12, 2023 at 09:20:02AM -0700, Matt Roper wrote:
> On Thu, May 11, 2023 at 05:07:46PM -0700, Lucas De Marchi wrote:
> > On Wed, May 10, 2023 at 08:46:59PM -0700, Matt Roper wrote:
> > > Rather than a backpointer to the xe_device, a GT should have a
> > > backpointer to its tile (which can then be used to lookup the device if
> > > necessary).
> > > 
> > > The gt_to_xe() helper macro (which moves from xe_gt.h to xe_gt_types.h)
> > > can and should still be used to jump directly from an xe_gt to
> > > xe_device.
> > > 
> > > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > > ---
> > > drivers/gpu/drm/xe/xe_bb.c                  |  2 +-
> > > drivers/gpu/drm/xe/xe_gt.h                  |  5 -----
> > > drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |  4 ++--
> > > drivers/gpu/drm/xe/xe_gt_types.h            | 14 ++++++++++++--
> > > drivers/gpu/drm/xe/xe_mocs.c                | 14 +++++++-------
> > > drivers/gpu/drm/xe/xe_pci.c                 | 11 +++++++----
> > > drivers/gpu/drm/xe/xe_pt.c                  |  2 +-
> > > 7 files changed, 30 insertions(+), 22 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_bb.c b/drivers/gpu/drm/xe/xe_bb.c
> > > index 3deb2d55f421..bf7c94b769d7 100644
> > > --- a/drivers/gpu/drm/xe/xe_bb.c
> > > +++ b/drivers/gpu/drm/xe/xe_bb.c
> > > @@ -16,7 +16,7 @@
> > > 
> > > static int bb_prefetch(struct xe_gt *gt)
> > > {
> > > -	struct xe_device *xe = gt->xe;
> > > +	struct xe_device *xe = gt_to_xe(gt);
> > > 
> > > 	if (GRAPHICS_VERx100(xe) >= 1250 && !xe_gt_is_media_type(gt))
> > > 		/*
> > > diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
> > > index 086369f7ee6d..f4e98f499b36 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt.h
> > > +++ b/drivers/gpu/drm/xe/xe_gt.h
> > > @@ -49,11 +49,6 @@ static inline bool xe_gt_is_media_type(struct xe_gt *gt)
> > > 	return gt->info.type == XE_GT_TYPE_MEDIA;
> > > }
> > > 
> > > -#define gt_to_xe(gt__)								\
> > > -	_Generic(gt__,								\
> > > -		 const struct xe_gt *: (const struct xe_device *)((gt__)->xe),	\
> > > -		 struct xe_gt *: (gt__)->xe)
> > > -
> > > static inline bool xe_gt_is_usm_hwe(struct xe_gt *gt, struct xe_hw_engine *hwe)
> > > {
> > > 	struct xe_device *xe = gt_to_xe(gt);
> > > diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > > index c815a42e2cdb..c9e8825c02aa 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > > +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > > @@ -322,8 +322,8 @@ int xe_guc_tlb_invalidation_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
> > > 		TLB_INVALIDATION_SEQNO_MAX;
> > > 	if (!expected_seqno)
> > > 		expected_seqno = 1;
> > > -	if (drm_WARN_ON(&gt->xe->drm, expected_seqno != msg[0])) {
> > > -		drm_err(&gt->xe->drm, "TLB expected_seqno(%d) != msg(%u)\n",
> > > +	if (drm_WARN_ON(&gt_to_xe(gt)->drm, expected_seqno != msg[0])) {
> > > +		drm_err(&gt_to_xe(gt)->drm, "TLB expected_seqno(%d) != msg(%u)\n",
> > > 			expected_seqno, msg[0]);
> > > 	}
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
> > > index e0ed4508269b..c4376d50786b 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_gt_types.h
> > > @@ -76,6 +76,16 @@ enum xe_steering_type {
> > > 	NUM_STEERING_TYPES
> > > };
> > > 
> > > +#define gt_to_tile(gt__)							\
> > > +	_Generic(gt__,								\
> > > +		 const struct xe_gt *: (const struct xe_tile *)((gt__)->tile),	\
> > > +		 struct xe_gt *: (gt__)->tile)
> > > +
> > > +#define gt_to_xe(gt__)										\
> > > +	_Generic(gt__,										\
> > > +		 const struct xe_gt *: (const struct xe_device *)(gt_to_tile(gt__)->xe),	\
> > > +		 struct xe_gt *: gt_to_tile(gt__)->xe)
> > > +
> > > /**
> > >  * struct xe_gt - A "Graphics Technology" unit of the GPU
> > >  *
> > > @@ -90,8 +100,8 @@ enum xe_steering_type {
> > >  * within a tile.
> > >  */
> > > struct xe_gt {
> > > -	/** @xe: backpointer to XE device */
> > > -	struct xe_device *xe;
> > > +	/** @tile: Backpointer to GT's tile */
> > > +	struct xe_tile *tile;
> > > 
> > > 	/** @info: GT info */
> > > 	struct {
> > > diff --git a/drivers/gpu/drm/xe/xe_mocs.c b/drivers/gpu/drm/xe/xe_mocs.c
> > > index 817afd301d52..d57fbf16a3ef 100644
> > > --- a/drivers/gpu/drm/xe/xe_mocs.c
> > > +++ b/drivers/gpu/drm/xe/xe_mocs.c
> > > @@ -471,7 +471,7 @@ static void __init_mocs_table(struct xe_gt *gt,
> > > 	unsigned int i;
> > > 	u32 mocs;
> > > 
> > > -	mocs_dbg(&gt->xe->drm, "entries:%d\n", info->n_entries);
> > > +	mocs_dbg(&gt_to_xe(gt)->drm, "entries:%d\n", info->n_entries);
> > > 	drm_WARN_ONCE(&xe->drm, !info->unused_entries_index,
> > > 		      "Unused entries index should have been defined\n");
> > > 	for (i = 0;
> > > @@ -479,7 +479,7 @@ static void __init_mocs_table(struct xe_gt *gt,
> > > 	     i++) {
> > > 		struct xe_reg reg = XE_REG(addr + i * 4);
> > > 
> > > -		mocs_dbg(&gt->xe->drm, "%d 0x%x 0x%x\n", i, reg.addr, mocs);
> > > +		mocs_dbg(&gt_to_xe(gt)->drm, "%d 0x%x 0x%x\n", i, reg.addr, mocs);
> > > 		xe_mmio_write32(gt, reg, mocs);
> > > 	}
> > > }
> > > @@ -508,13 +508,13 @@ static void init_l3cc_table(struct xe_gt *gt,
> > > 	unsigned int i;
> > > 	u32 l3cc;
> > > 
> > > -	mocs_dbg(&gt->xe->drm, "entries:%d\n", info->n_entries);
> > > +	mocs_dbg(&gt_to_xe(gt)->drm, "entries:%d\n", info->n_entries);
> > > 	for (i = 0;
> > > 	     i < (info->n_entries + 1) / 2 ?
> > > 	     (l3cc = l3cc_combine(get_entry_l3cc(info, 2 * i),
> > > 				  get_entry_l3cc(info, 2 * i + 1))), 1 : 0;
> > > 	     i++) {
> > > -		mocs_dbg(&gt->xe->drm, "%d 0x%x 0x%x\n", i, LNCFCMOCS(i).addr,
> > > +		mocs_dbg(&gt_to_xe(gt)->drm, "%d 0x%x 0x%x\n", i, LNCFCMOCS(i).addr,
> > > 			 l3cc);
> > > 		xe_mmio_write32(gt, LNCFCMOCS(i), l3cc);
> > > 	}
> > > @@ -524,7 +524,7 @@ void xe_mocs_init_early(struct xe_gt *gt)
> > > {
> > > 	struct xe_mocs_info table;
> > > 
> > > -	get_mocs_settings(gt->xe, &table);
> > > +	get_mocs_settings(gt_to_xe(gt), &table);
> > > 	gt->mocs.uc_index = table.uc_index;
> > > 	gt->mocs.wb_index = table.wb_index;
> > > }
> > > @@ -537,8 +537,8 @@ void xe_mocs_init(struct xe_gt *gt)
> > > 	/*
> > > 	 * LLC and eDRAM control values are not applicable to dgfx
> > > 	 */
> > > -	flags = get_mocs_settings(gt->xe, &table);
> > > -	mocs_dbg(&gt->xe->drm, "flag:0x%x\n", flags);
> > > +	flags = get_mocs_settings(gt_to_xe(gt), &table);
> > > +	mocs_dbg(&gt_to_xe(gt)->drm, "flag:0x%x\n", flags);
> > > 
> > > 	if (flags & HAS_GLOBAL_MOCS)
> > > 		__init_mocs_table(gt, &table, GLOBAL_MOCS(0).addr);
> > > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > > index e79b16d8bf7f..87c328106aca 100644
> > > --- a/drivers/gpu/drm/xe/xe_pci.c
> > > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > > @@ -471,6 +471,7 @@ static int xe_info_init(struct xe_device *xe,
> > > {
> > > 	const struct xe_graphics_desc *graphics_desc = NULL;
> > > 	const struct xe_media_desc *media_desc = NULL;
> > > +	struct xe_tile *tile;
> > > 	struct xe_gt *gt;
> > > 	u8 id;
> > > 
> > > @@ -525,13 +526,15 @@ static int xe_info_init(struct xe_device *xe,
> > > 	xe->info.step = xe_step_get(xe);
> > > 
> > > 	for (id = 0; id < xe->info.tile_count; ++id) {
> > > -		xe->tiles[id].xe = xe;
> > > -		xe->tiles[id].id = id;
> > > +		tile = &xe->tiles[id];
> > > +		tile->xe = xe;
> > > +		tile->id = id;
> > > 
> > > -		gt = &xe->tiles[id].primary_gt;
> > > +		gt = &tile->primary_gt;
> > 
> > these seem to have been squashed into the wrong patch, changing the
> > style from using xe->tiles[i].* to declaring a local var. Maybe squash
> > them into the previous one to avoid the back and forth here?
> > 
> > otherwise,
> > 
> > 
> > Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
> > 
> > > 		gt->info.id = id;
> > > -		gt->xe = xe;
> > > +		gt->tile = tile;
> > > 
> > > +		gt->info.id = id;
> > 
> > the gt then has the id of the tile? Because right now there can only be
> > primary_gt? Looks odd, but it's a consequence of removing the only
> > platform with 1 tile and multiple gts.
> 
> Yeah, deciding how we want to number GTs is a bit of an open question.
> Right now at the end of this series, the numbering is
> 
>  * PVC:  0 = root tile primary, 1 = remote tile primary
>  * MTL:  0 = root tile primary, 1 = root tile media
>  * everything else:  only has GT0
> 
> Some day we may have a platform with multiple tiles and separate
> graphics/media GTs within each tile.  In that case would all graphics
> GTs get even numbers and all media GTs get odd?  Or would all primary
> GTs get gt_id = tile_id and all media GTs get gt_id = tile_id +
> tile_count?
> 
> The bigger open question is whether uapi should still expose GTs (like
> it does today) or whether it should be converted to just expose tiles
> and keep the graphics vs media separation an internal implementation
> detail.  If uapi changes to only expose tiles, then we probably don't
> even need IDs for the GTs...
I recall from doing the uapi changes for topology that UMD seemed really
bound to the idea of interacting with "GTs". Exposing only tiles would
likely require a lot of changes in implementation and methodology in the UMD.
MattA
> 
> 
> Matt
> 
> > 
> > Lucas De Marchi
> > 
> > > 		if (id == 0) {
> > > 			gt->info.type = XE_GT_TYPE_MAIN;
> > > 			gt->info.vram_id = id;
> > > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > > index f15282996c3b..61126cefe0b5 100644
> > > --- a/drivers/gpu/drm/xe/xe_pt.c
> > > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > > @@ -695,7 +695,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
> > > 		 * TODO: Suballocate the pt bo to avoid wasting a lot of
> > > 		 * memory.
> > > 		 */
> > > -		if (GRAPHICS_VERx100(xe_walk->gt->xe) >= 1250 && level == 1 &&
> > > +		if (GRAPHICS_VERx100(gt_to_xe(xe_walk->gt)) >= 1250 && level == 1 &&
> > > 		    covers && xe_pt_scan_64K(addr, next, xe_walk)) {
> > > 			walk->shifts = xe_compact_pt_shifts;
> > > 			flags |= XE_PDE_64K;
> > > -- 
> > > 2.40.0
> > > 
> 
> -- 
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation

* Re: [Intel-xe] [PATCH 03/26] drm/xe: Add backpointer from gt to tile
  2023-05-12 16:31       ` Matt Atwood
@ 2023-05-12 17:00         ` Matt Roper
  0 siblings, 0 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-12 17:00 UTC (permalink / raw)
  To: Matt Atwood; +Cc: lucas.demarchi, intel-xe

On Fri, May 12, 2023 at 09:31:27AM -0700, Matt Atwood wrote:
> On Fri, May 12, 2023 at 09:20:02AM -0700, Matt Roper wrote:
> > On Thu, May 11, 2023 at 05:07:46PM -0700, Lucas De Marchi wrote:
> > > On Wed, May 10, 2023 at 08:46:59PM -0700, Matt Roper wrote:
> > > > Rather than a backpointer to the xe_device, a GT should have a
> > > > backpointer to its tile (which can then be used to lookup the device if
> > > > necessary).
> > > > 
> > > > The gt_to_xe() helper macro (which moves from xe_gt.h to xe_gt_types.h)
> > > > can and should still be used to jump directly from an xe_gt to
> > > > xe_device.
> > > > 
> > > > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > > > ---
> > > > drivers/gpu/drm/xe/xe_bb.c                  |  2 +-
> > > > drivers/gpu/drm/xe/xe_gt.h                  |  5 -----
> > > > drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |  4 ++--
> > > > drivers/gpu/drm/xe/xe_gt_types.h            | 14 ++++++++++++--
> > > > drivers/gpu/drm/xe/xe_mocs.c                | 14 +++++++-------
> > > > drivers/gpu/drm/xe/xe_pci.c                 | 11 +++++++----
> > > > drivers/gpu/drm/xe/xe_pt.c                  |  2 +-
> > > > 7 files changed, 30 insertions(+), 22 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/xe/xe_bb.c b/drivers/gpu/drm/xe/xe_bb.c
> > > > index 3deb2d55f421..bf7c94b769d7 100644
> > > > --- a/drivers/gpu/drm/xe/xe_bb.c
> > > > +++ b/drivers/gpu/drm/xe/xe_bb.c
> > > > @@ -16,7 +16,7 @@
> > > > 
> > > > static int bb_prefetch(struct xe_gt *gt)
> > > > {
> > > > -	struct xe_device *xe = gt->xe;
> > > > +	struct xe_device *xe = gt_to_xe(gt);
> > > > 
> > > > 	if (GRAPHICS_VERx100(xe) >= 1250 && !xe_gt_is_media_type(gt))
> > > > 		/*
> > > > diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
> > > > index 086369f7ee6d..f4e98f499b36 100644
> > > > --- a/drivers/gpu/drm/xe/xe_gt.h
> > > > +++ b/drivers/gpu/drm/xe/xe_gt.h
> > > > @@ -49,11 +49,6 @@ static inline bool xe_gt_is_media_type(struct xe_gt *gt)
> > > > 	return gt->info.type == XE_GT_TYPE_MEDIA;
> > > > }
> > > > 
> > > > -#define gt_to_xe(gt__)								\
> > > > -	_Generic(gt__,								\
> > > > -		 const struct xe_gt *: (const struct xe_device *)((gt__)->xe),	\
> > > > -		 struct xe_gt *: (gt__)->xe)
> > > > -
> > > > static inline bool xe_gt_is_usm_hwe(struct xe_gt *gt, struct xe_hw_engine *hwe)
> > > > {
> > > > 	struct xe_device *xe = gt_to_xe(gt);
> > > > diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > > > index c815a42e2cdb..c9e8825c02aa 100644
> > > > --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > > > +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > > > @@ -322,8 +322,8 @@ int xe_guc_tlb_invalidation_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
> > > > 		TLB_INVALIDATION_SEQNO_MAX;
> > > > 	if (!expected_seqno)
> > > > 		expected_seqno = 1;
> > > > -	if (drm_WARN_ON(&gt->xe->drm, expected_seqno != msg[0])) {
> > > > -		drm_err(&gt->xe->drm, "TLB expected_seqno(%d) != msg(%u)\n",
> > > > +	if (drm_WARN_ON(&gt_to_xe(gt)->drm, expected_seqno != msg[0])) {
> > > > +		drm_err(&gt_to_xe(gt)->drm, "TLB expected_seqno(%d) != msg(%u)\n",
> > > > 			expected_seqno, msg[0]);
> > > > 	}
> > > > 
> > > > diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
> > > > index e0ed4508269b..c4376d50786b 100644
> > > > --- a/drivers/gpu/drm/xe/xe_gt_types.h
> > > > +++ b/drivers/gpu/drm/xe/xe_gt_types.h
> > > > @@ -76,6 +76,16 @@ enum xe_steering_type {
> > > > 	NUM_STEERING_TYPES
> > > > };
> > > > 
> > > > +#define gt_to_tile(gt__)							\
> > > > +	_Generic(gt__,								\
> > > > +		 const struct xe_gt *: (const struct xe_tile *)((gt__)->tile),	\
> > > > +		 struct xe_gt *: (gt__)->tile)
> > > > +
> > > > +#define gt_to_xe(gt__)										\
> > > > +	_Generic(gt__,										\
> > > > +		 const struct xe_gt *: (const struct xe_device *)(gt_to_tile(gt__)->xe),	\
> > > > +		 struct xe_gt *: gt_to_tile(gt__)->xe)
> > > > +
> > > > /**
> > > >  * struct xe_gt - A "Graphics Technology" unit of the GPU
> > > >  *
> > > > @@ -90,8 +100,8 @@ enum xe_steering_type {
> > > >  * within a tile.
> > > >  */
> > > > struct xe_gt {
> > > > -	/** @xe: backpointer to XE device */
> > > > -	struct xe_device *xe;
> > > > +	/** @tile: Backpointer to GT's tile */
> > > > +	struct xe_tile *tile;
> > > > 
> > > > 	/** @info: GT info */
> > > > 	struct {
> > > > diff --git a/drivers/gpu/drm/xe/xe_mocs.c b/drivers/gpu/drm/xe/xe_mocs.c
> > > > index 817afd301d52..d57fbf16a3ef 100644
> > > > --- a/drivers/gpu/drm/xe/xe_mocs.c
> > > > +++ b/drivers/gpu/drm/xe/xe_mocs.c
> > > > @@ -471,7 +471,7 @@ static void __init_mocs_table(struct xe_gt *gt,
> > > > 	unsigned int i;
> > > > 	u32 mocs;
> > > > 
> > > > -	mocs_dbg(&gt->xe->drm, "entries:%d\n", info->n_entries);
> > > > +	mocs_dbg(&gt_to_xe(gt)->drm, "entries:%d\n", info->n_entries);
> > > > 	drm_WARN_ONCE(&xe->drm, !info->unused_entries_index,
> > > > 		      "Unused entries index should have been defined\n");
> > > > 	for (i = 0;
> > > > @@ -479,7 +479,7 @@ static void __init_mocs_table(struct xe_gt *gt,
> > > > 	     i++) {
> > > > 		struct xe_reg reg = XE_REG(addr + i * 4);
> > > > 
> > > > -		mocs_dbg(&gt->xe->drm, "%d 0x%x 0x%x\n", i, reg.addr, mocs);
> > > > +		mocs_dbg(&gt_to_xe(gt)->drm, "%d 0x%x 0x%x\n", i, reg.addr, mocs);
> > > > 		xe_mmio_write32(gt, reg, mocs);
> > > > 	}
> > > > }
> > > > @@ -508,13 +508,13 @@ static void init_l3cc_table(struct xe_gt *gt,
> > > > 	unsigned int i;
> > > > 	u32 l3cc;
> > > > 
> > > > -	mocs_dbg(&gt->xe->drm, "entries:%d\n", info->n_entries);
> > > > +	mocs_dbg(&gt_to_xe(gt)->drm, "entries:%d\n", info->n_entries);
> > > > 	for (i = 0;
> > > > 	     i < (info->n_entries + 1) / 2 ?
> > > > 	     (l3cc = l3cc_combine(get_entry_l3cc(info, 2 * i),
> > > > 				  get_entry_l3cc(info, 2 * i + 1))), 1 : 0;
> > > > 	     i++) {
> > > > -		mocs_dbg(&gt->xe->drm, "%d 0x%x 0x%x\n", i, LNCFCMOCS(i).addr,
> > > > +		mocs_dbg(&gt_to_xe(gt)->drm, "%d 0x%x 0x%x\n", i, LNCFCMOCS(i).addr,
> > > > 			 l3cc);
> > > > 		xe_mmio_write32(gt, LNCFCMOCS(i), l3cc);
> > > > 	}
> > > > @@ -524,7 +524,7 @@ void xe_mocs_init_early(struct xe_gt *gt)
> > > > {
> > > > 	struct xe_mocs_info table;
> > > > 
> > > > -	get_mocs_settings(gt->xe, &table);
> > > > +	get_mocs_settings(gt_to_xe(gt), &table);
> > > > 	gt->mocs.uc_index = table.uc_index;
> > > > 	gt->mocs.wb_index = table.wb_index;
> > > > }
> > > > @@ -537,8 +537,8 @@ void xe_mocs_init(struct xe_gt *gt)
> > > > 	/*
> > > > 	 * LLC and eDRAM control values are not applicable to dgfx
> > > > 	 */
> > > > -	flags = get_mocs_settings(gt->xe, &table);
> > > > -	mocs_dbg(&gt->xe->drm, "flag:0x%x\n", flags);
> > > > +	flags = get_mocs_settings(gt_to_xe(gt), &table);
> > > > +	mocs_dbg(&gt_to_xe(gt)->drm, "flag:0x%x\n", flags);
> > > > 
> > > > 	if (flags & HAS_GLOBAL_MOCS)
> > > > 		__init_mocs_table(gt, &table, GLOBAL_MOCS(0).addr);
> > > > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > > > index e79b16d8bf7f..87c328106aca 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pci.c
> > > > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > > > @@ -471,6 +471,7 @@ static int xe_info_init(struct xe_device *xe,
> > > > {
> > > > 	const struct xe_graphics_desc *graphics_desc = NULL;
> > > > 	const struct xe_media_desc *media_desc = NULL;
> > > > +	struct xe_tile *tile;
> > > > 	struct xe_gt *gt;
> > > > 	u8 id;
> > > > 
> > > > @@ -525,13 +526,15 @@ static int xe_info_init(struct xe_device *xe,
> > > > 	xe->info.step = xe_step_get(xe);
> > > > 
> > > > 	for (id = 0; id < xe->info.tile_count; ++id) {
> > > > -		xe->tiles[id].xe = xe;
> > > > -		xe->tiles[id].id = id;
> > > > +		tile = &xe->tiles[id];
> > > > +		tile->xe = xe;
> > > > +		tile->id = id;
> > > > 
> > > > -		gt = &xe->tiles[id].primary_gt;
> > > > +		gt = &tile->primary_gt;
> > > 
> > > these seem to have been squashed into the wrong patch, changing the
> > > style from using xe->tiles[i].* to declaring a local var. Maybe squash
> > > them into the previous one to avoid the back and forth here?
> > > 
> > > otherwise,
> > > 
> > > 
> > > Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
> > > 
> > > > 		gt->info.id = id;
> > > > -		gt->xe = xe;
> > > > +		gt->tile = tile;
> > > > 
> > > > +		gt->info.id = id;
> > > 
> > > the gt then has the id of the tile? Because right now there can only be
> > > primary_gt? Looks odd, but it's a consequence of removing the only
> > > platform with 1 tile and multiple gts.
> > 
> > Yeah, deciding how we want to number GTs is a bit of an open question.
> > Right now at the end of this series, the numbering is
> > 
> >  * PVC:  0 = root tile primary, 1 = remote tile primary
> >  * MTL:  0 = root tile primary, 1 = root tile media
> >  * everything else:  only has GT0
> > 
> > Some day we may have a platform with multiple tiles and separate
> > graphics/media GTs within each tile.  In that case would all graphics
> > GTs get even numbers and all media GTs get odd?  Or would all primary
> > GTs get gt_id = tile_id and all media GTs get gt_id = tile_id +
> > tile_count?
> > 
> > The bigger open question is whether uapi should still expose GTs (like
> > it does today) or whether it should be converted to just expose tiles
> > and keep the graphics vs media separation an internal implementation
> > detail.  If uapi changes to only expose tiles, then we probably don't
> > even need IDs for the GTs...
> I recall from doing the uapi changes for topology that UMD seemed really
> bound to the idea of interacting with "GTs". Exposing only tiles would
> likely require a lot of changes in implementation and methodology in the UMD.
> MattA

On i915, neither tiles nor GTs are exposed to userspace.  Exposing GTs
as a first-class concept is a large design change in the Xe driver.

I think there's even been talk about splitting this out at the /dev/dri
level, with separate render nodes being exposed per tile.  So there may
be even larger design changes coming before we settle on exactly how Xe
should work.


Matt

> > 
> > 
> > Matt
> > 
> > > 
> > > Lucas De Marchi
> > > 
> > > > 		if (id == 0) {
> > > > 			gt->info.type = XE_GT_TYPE_MAIN;
> > > > 			gt->info.vram_id = id;
> > > > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > > > index f15282996c3b..61126cefe0b5 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pt.c
> > > > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > > > @@ -695,7 +695,7 @@ xe_pt_stage_bind_entry(struct xe_ptw *parent, pgoff_t offset,
> > > > 		 * TODO: Suballocate the pt bo to avoid wasting a lot of
> > > > 		 * memory.
> > > > 		 */
> > > > -		if (GRAPHICS_VERx100(xe_walk->gt->xe) >= 1250 && level == 1 &&
> > > > +		if (GRAPHICS_VERx100(gt_to_xe(xe_walk->gt)) >= 1250 && level == 1 &&
> > > > 		    covers && xe_pt_scan_64K(addr, next, xe_walk)) {
> > > > 			walk->shifts = xe_compact_pt_shifts;
> > > > 			flags |= XE_PDE_64K;
> > > > -- 
> > > > 2.40.0
> > > > 
> > 
> > -- 
> > Matt Roper
> > Graphics Software Engineer
> > Linux GPU Platform Enablement
> > Intel Corporation

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

* Re: [Intel-xe] [PATCH 05/26] drm/xe: Move register MMIO into xe_tile
  2023-05-11  3:47 ` [Intel-xe] [PATCH 05/26] drm/xe: Move register MMIO into xe_tile Matt Roper
  2023-05-11 12:20   ` Jani Nikula
@ 2023-05-13  5:53   ` Lucas De Marchi
  1 sibling, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-13  5:53 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:01PM -0700, Matt Roper wrote:
>Each tile has its own register region in the BAR, containing instances
>of all registers for the platform.  In contrast, the multiple GTs within
>a tile share the same MMIO space; there's just a small subset of
>registers (the GSI registers) which have multiple copies at different
>offsets (0x0 for primary GT, 0x380000 for media GT).  Move the register
>MMIO region size/pointers to the tile structure, leaving just the GSI
>offset information in the GT structure.

Side note:

I was not very comfortable with the sentiment "the abstraction is
currently completely broken as gt is used to access registers that are
not per gt".  I was not the one doing the previous abstraction, but I
understand it as a valid one too: the gt, just like uncore in i915, is
the leaf node that knows, depending on the register, what is the offset
to be applied, which essentially boils down to

	tile_base + addr
	tile_base + gsi_base + addr

gt->regs previously pointed to the right tile_base depending on what
they were ("normal" vs media-type). Here we are creating the tile
abstraction so we differentiate a complete tile from a GT. It seems
cleaner and I like the direction.
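
To spell out the two address forms above, here is a rough standalone
sketch.  It is not the driver's code: the mock_* names and the
gsi_offset/gsi_limit fields are made up for illustration, and the
0x40000 size of the GSI range is an assumption (only the 0x380000 media
offset comes from the commit message).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEDIA_GT_GSI_OFFSET	0x380000u	/* from the commit message */
#define GSI_RANGE_LIMIT		0x40000u	/* assumed size of GSI range */

struct mock_tile {
	uint8_t *regs;		/* tile_base: start of the tile's registers */
	size_t size;		/* size of the tile's MMIO space */
};

struct mock_gt {
	struct mock_tile *tile;
	uint32_t gsi_offset;	/* 0x0 for primary GT, 0x380000 for media */
	uint32_t gsi_limit;	/* registers below this are treated as GSI */
};

/* Byte offset from tile_base at which this GT sees register @addr. */
static uint32_t mock_reg_offset(const struct mock_gt *gt, uint32_t addr)
{
	if (addr < gt->gsi_limit)
		addr += gt->gsi_offset;	/* tile_base + gsi_base + addr */

	return addr;			/* tile_base + addr */
}

/* Resolve the offset against the tile's mapping, with a bounds check. */
static uint8_t *mock_reg_ptr(const struct mock_gt *gt, uint32_t addr)
{
	uint32_t off = mock_reg_offset(gt, addr);

	assert(off < gt->tile->size);
	return gt->tile->regs + off;
}
```

With this split the tile owns the mapping and the GT only carries the
adjustment, which is the direction the patch takes.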

>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/xe/display/ext/i915_irq.c |  2 +-
> drivers/gpu/drm/xe/xe_device_types.h      | 16 ++++++++++++++
> drivers/gpu/drm/xe/xe_ggtt.c              |  3 ++-
> drivers/gpu/drm/xe/xe_gt_types.h          |  9 +++-----
> drivers/gpu/drm/xe/xe_mmio.c              | 26 ++++++++++++-----------
> drivers/gpu/drm/xe/xe_mmio.h              | 21 +++++++++++++-----
> 6 files changed, 52 insertions(+), 25 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/display/ext/i915_irq.c b/drivers/gpu/drm/xe/display/ext/i915_irq.c
>index afde97b6faa6..a9cbd7b59360 100644
>--- a/drivers/gpu/drm/xe/display/ext/i915_irq.c
>+++ b/drivers/gpu/drm/xe/display/ext/i915_irq.c
>@@ -920,7 +920,7 @@ gen8_de_irq_handler(struct drm_i915_private *dev_priv, u32 master_ctl)
>
> void gen11_display_irq_handler(struct drm_i915_private *i915)
> {
>-	void __iomem * const regs = to_gt(i915)->mmio.regs;
>+	void __iomem * const regs = xe_device_get_root_tile(i915)->mmio.regs;
> 	const u32 disp_ctl = raw_reg_read(regs, GEN11_DISPLAY_INT_CTL);
>
> 	/*
>diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>index 5dcf1695925f..2481b2045284 100644
>--- a/drivers/gpu/drm/xe/xe_device_types.h
>+++ b/drivers/gpu/drm/xe/xe_device_types.h
>@@ -80,6 +80,22 @@ struct xe_tile {
> 	struct xe_gt primary_gt;
>
> 	/* TODO: Add media GT here */
>+
>+	/**
>+	 * @mmio: MMIO info for a tile.
>+	 *
>+	 * Each tile has its own 16MB space in BAR0, laid out as:
>+	 * * 0-4MB: registers
>+	 * * 4MB-8MB: reserved
>+	 * * 8MB-16MB: global GTT
>+	 */
>+	struct {
>+		/** @size: size of tile's MMIO space */
>+		size_t size;
>+
>+		/** @regs: pointer to tile's MMIO space (starting with registers) */
>+		void *regs;
>+	} mmio;
> };
>
> /**
>diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
>index 546240261e0a..200976da3dc1 100644
>--- a/drivers/gpu/drm/xe/xe_ggtt.c
>+++ b/drivers/gpu/drm/xe/xe_ggtt.c
>@@ -93,6 +93,7 @@ static void ggtt_fini_noalloc(struct drm_device *drm, void *arg)
> int xe_ggtt_init_noalloc(struct xe_gt *gt, struct xe_ggtt *ggtt)
> {
> 	struct xe_device *xe = gt_to_xe(gt);
>+	struct xe_tile *tile = gt_to_tile(gt);
> 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> 	unsigned int gsm_size;
>
>@@ -106,7 +107,7 @@ int xe_ggtt_init_noalloc(struct xe_gt *gt, struct xe_ggtt *ggtt)
> 		return -ENOMEM;
> 	}
>
>-	ggtt->gsm = gt->mmio.regs + SZ_8M;
>+	ggtt->gsm = tile->mmio.regs + SZ_8M;
> 	ggtt->size = (gsm_size / 8) * (u64) XE_PAGE_SIZE;
>
> 	if (IS_DGFX(xe) && xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
>diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
>index c4376d50786b..03dd625b2781 100644
>--- a/drivers/gpu/drm/xe/xe_gt_types.h
>+++ b/drivers/gpu/drm/xe/xe_gt_types.h
>@@ -124,14 +124,11 @@ struct xe_gt {
> 	} info;
>
> 	/**
>-	 * @mmio: mmio info for GT, can be subset of the global device mmio
>-	 * space
>+	 * @mmio: mmio info for GT.  All GTs within a tile share the same
>+	 * register space, but have their own copy of GSI registers at a
>+	 * specific offset, as well as their own forcewake handling.
> 	 */
> 	struct {
>-		/** @size: size of MMIO space on GT */
>-		size_t size;
>-		/** @regs: pointer to MMIO space on GT */
>-		void *regs;
> 		/** @fw: force wake for GT */
> 		struct xe_force_wake fw;
> 		/**
>diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
>index 254b4a63d901..54fa1212fcd9 100644
>--- a/drivers/gpu/drm/xe/xe_mmio.c
>+++ b/drivers/gpu/drm/xe/xe_mmio.c
>@@ -307,6 +307,7 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
>
> 	if (xe->info.tile_count > 1) {
> 		const int mmio_bar = 0;
>+		struct xe_tile *tile;
> 		size_t size;
> 		void *regs;
>
>@@ -320,11 +321,11 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
> 		size = xe->mmio.size / adj_tile_count;
> 		regs = xe->mmio.regs;
>
>-		for_each_gt(gt, xe, id) {
>-			if (id && !xe_gt_is_media_type(gt))
>-				regs += size;
>-			gt->mmio.size = size;
>-			gt->mmio.regs = regs;
>+		for_each_tile(tile, xe, id) {
>+			tile->mmio.size = size;
>+			tile->mmio.regs = regs;

it looks like the adj_tile_count meaning is now lost and we could have a
single tile_count variable and info printed?

also, I'm not sure there is much value in doing

hardcoded -------->>----.
	xe->mmio.size = SZ_16M * tile_count

and then do

	size = xe->mmio.size / tile_count

size is always SZ_16M, as documented in the mmio struct.
AFAICS, this whole thing could now be simplified to:

	mtcfg = xe_mmio_read64(gt, XEHP_MTCFG_ADDR);
	xe->info.tile_count = REG_FIELD_GET(TILE_COUNT, mtcfg) + 1;

	/* re-map IO to cover all tiles */
	if (xe->info.tile_count > 1) {
		pci_iounmap(to_pci_dev(xe->drm.dev), xe->mmio.regs);
		xe->mmio.size = SZ_16M * xe->info.tile_count;
		xe->mmio.regs = pci_iomap(to_pci_dev(xe->drm.dev),
					  mmio_bar, xe->mmio.size);
	}

	for_each_tile(tile, xe, id) {
		tile->mmio.size = SZ_16M;
		tile->mmio.regs = xe->mmio.regs + SZ_16M * id;
	}


anyway, this is a small simplification. Patch looks correct to me.
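
As a standalone illustration of that last loop (not driver code; the
sketch_* names are made up and plain integers stand in for the __iomem
pointers), the fixed 16MB-per-tile carving could look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SZ_8M		((uintptr_t)8 << 20)
#define SKETCH_SZ_16M		((uintptr_t)16 << 20)
#define SKETCH_MAX_TILES	4	/* arbitrary bound for the sketch */

/* Hypothetical stand-in for the tile->mmio struct added by the patch. */
struct sketch_tile_mmio {
	size_t size;
	uintptr_t regs;
};

/* Carve one mapping that covers all tiles into 16MB per-tile windows. */
static void sketch_assign_tile_mmio(uintptr_t bar_base, unsigned int tile_count,
				    struct sketch_tile_mmio *tiles)
{
	assert(tile_count >= 1 && tile_count <= SKETCH_MAX_TILES);

	for (unsigned int id = 0; id < tile_count; id++) {
		tiles[id].size = SKETCH_SZ_16M;
		tiles[id].regs = bar_base + SKETCH_SZ_16M * id;
	}
}

/* Per the documented layout, the global GTT starts 8MB into each window. */
static uintptr_t sketch_tile_gsm(const struct sketch_tile_mmio *tile)
{
	return tile->regs + SKETCH_SZ_8M;
}
```

Each window keeps the layout from the mmio struct's kernel-doc, so the
per-tile GGTT pointer stays a fixed offset from tile->mmio.regs.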


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

Lucas De Marchi

>+
>+			regs += size;
> 		}
> 	}
> }
>@@ -340,15 +341,16 @@ static void mmio_fini(struct drm_device *drm, void *arg)
>
> int xe_mmio_init(struct xe_device *xe)
> {
>+	struct xe_tile *root_tile = xe_device_get_root_tile(xe);
> 	struct xe_gt *gt = xe_device_get_gt(xe, 0);
> 	const int mmio_bar = 0;
> 	int err;
>
> 	/*
>-	 * Map the entire BAR, which includes registers (0-4MB), reserved space
>-	 * (4MB-8MB), and GGTT (8MB-16MB). Other parts of the driver (GTs,
>-	 * GGTTs) will derive the pointers they need from the mapping in the
>-	 * device structure.
>+	 * Map the first 16MB of th BAR, which includes the registers (0-4MB),
>+	 * reserved space (4MB-8MB), and GGTT (8MB-16MB) for a single tile.
>+	 * This will get remapped later if we determine that we're running
>+	 * on a multi-tile system.
> 	 */
> 	xe->mmio.size = SZ_16M;
> 	xe->mmio.regs = pci_iomap(to_pci_dev(xe->drm.dev), mmio_bar,
>@@ -362,9 +364,9 @@ int xe_mmio_init(struct xe_device *xe)
> 	if (err)
> 		return err;
>
>-	/* 1 GT for now, 1 to 1 mapping, may change on multi-GT devices */
>-	gt->mmio.size = xe->mmio.size;
>-	gt->mmio.regs = xe->mmio.regs;
>+	/* Setup first tile; other tiles (if present) will be setup later. */
>+	root_tile->mmio.size = xe->mmio.size;
>+	root_tile->mmio.regs = xe->mmio.regs;
>
> 	/*
> 	 * The boot firmware initializes local memory and assesses its health.
>diff --git a/drivers/gpu/drm/xe/xe_mmio.h b/drivers/gpu/drm/xe/xe_mmio.h
>index 1407f1189b0d..acf0b18f3111 100644
>--- a/drivers/gpu/drm/xe/xe_mmio.h
>+++ b/drivers/gpu/drm/xe/xe_mmio.h
>@@ -10,6 +10,7 @@
> #include <linux/io-64-nonatomic-lo-hi.h>
>
> #include "regs/xe_reg_defs.h"
>+#include "xe_device_types.h"
> #include "xe_gt_types.h"
>
> struct drm_device;
>@@ -20,27 +21,33 @@ int xe_mmio_init(struct xe_device *xe);
>
> static inline u8 xe_mmio_read8(struct xe_gt *gt, struct xe_reg reg)
> {
>+	struct xe_tile *tile = gt_to_tile(gt);
>+
> 	if (reg.addr < gt->mmio.adj_limit)
> 		reg.addr += gt->mmio.adj_offset;
>
>-	return readb(gt->mmio.regs + reg.addr);
>+	return readb(tile->mmio.regs + reg.addr);
> }
>
> static inline void xe_mmio_write32(struct xe_gt *gt,
> 				   struct xe_reg reg, u32 val)
> {
>+	struct xe_tile *tile = gt_to_tile(gt);
>+
> 	if (reg.addr < gt->mmio.adj_limit)
> 		reg.addr += gt->mmio.adj_offset;
>
>-	writel(val, gt->mmio.regs + reg.addr);
>+	writel(val, tile->mmio.regs + reg.addr);
> }
>
> static inline u32 xe_mmio_read32(struct xe_gt *gt, struct xe_reg reg)
> {
>+	struct xe_tile *tile = gt_to_tile(gt);
>+
> 	if (reg.addr < gt->mmio.adj_limit)
> 		reg.addr += gt->mmio.adj_offset;
>
>-	return readl(gt->mmio.regs + reg.addr);
>+	return readl(tile->mmio.regs + reg.addr);
> }
>
> static inline u32 xe_mmio_rmw32(struct xe_gt *gt, struct xe_reg reg, u32 clr,
>@@ -58,18 +65,22 @@ static inline u32 xe_mmio_rmw32(struct xe_gt *gt, struct xe_reg reg, u32 clr,
> static inline void xe_mmio_write64(struct xe_gt *gt,
> 				   struct xe_reg reg, u64 val)
> {
>+	struct xe_tile *tile = gt_to_tile(gt);
>+
> 	if (reg.addr < gt->mmio.adj_limit)
> 		reg.addr += gt->mmio.adj_offset;
>
>-	writeq(val, gt->mmio.regs + reg.addr);
>+	writeq(val, tile->mmio.regs + reg.addr);
> }
>
> static inline u64 xe_mmio_read64(struct xe_gt *gt, struct xe_reg reg)
> {
>+	struct xe_tile *tile = gt_to_tile(gt);
>+
> 	if (reg.addr < gt->mmio.adj_limit)
> 		reg.addr += gt->mmio.adj_offset;
>
>-	return readq(gt->mmio.regs + reg.addr);
>+	return readq(tile->mmio.regs + reg.addr);
> }
>
> static inline int xe_mmio_write32_and_verify(struct xe_gt *gt,
>-- 
>2.40.0
>


* Re: [Intel-xe] [PATCH 00/26] Separate GT and tile
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (31 preceding siblings ...)
  2023-05-12  7:23 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork
@ 2023-05-15 13:08 ` Thomas Hellström
  2023-05-15 18:11   ` Matt Roper
  2023-05-16 14:18 ` Das, Nirmoy
  2023-05-18 17:47 ` Rodrigo Vivi
  34 siblings, 1 reply; 75+ messages in thread
From: Thomas Hellström @ 2023-05-15 13:08 UTC (permalink / raw)
  To: Matt Roper, intel-xe; +Cc: Nirmoy Das, Lucas De Marchi, Rodrigo Vivi

Hi, Matt,

On Wed, 2023-05-10 at 20:46 -0700, Matt Roper wrote:
> A 'tile' is not the same thing as a 'GT.'  For historical reasons,
> i915
> attempted to use a single 'struct intel_gt' to represent both
> concepts,
> although this design hasn't worked out terribly well.  For Xe we have
> the opportunity to design the driver in a way that more accurately
> reflects the real hardware behavior.
> 
> Different vendors use the term "tile" a bit differently, but in the
> Intel world, a 'tile' is pretty close to what most people would think
> of
> as being a complete GPU.  When multiple GPUs are placed behind a
> single
> PCI device, that's what we refer to as a "multi-tile device."  In
> such
> cases, pretty much all hardware is replicated per-tile, although
> certain
> responsibilities like PCI communication, reporting of interrupts to
> the
> OS, etc. are handled solely by the "root tile."  A multi-tile
> platform
> takes care of tying the tiles together in a way such that interrupt
> notifications from remote tiles are forwarded to the root tile, the
> per-tile vram is combined into a single address space, etc.
> 
> In contrast, a "GT" (which officially stands for "Graphics
> Technology")
> is the subset of a GPU/tile that is responsible for implementing
> graphics and/or media operations.  The GT is where a lot of the
> driver
> implementation happens since it's where the hardware engines, the
> execution units, and the GuC all reside.
> 
> Historically most Intel devices were single-tile devices that
> contained
> a single GT.  PVC is currently the only released Intel platform built
> on
> a multi-tile design (i.e., multiple GPUs behind a single PCI device);
> each PVC tile only has a single GT.  In contrast, platforms like MTL
> that have separate chips for render and media IP are still only a
> single
> logical GPU, but the graphics and media IP blocks are exposed each
> exposed as a separate GT within that single GPU.  This is important
> from
> a software perspective because multi-GT platforms like MTL only
> replicate a subset of the GPU hardware and behave differently than
> multi-tile platforms like PVC where nearly everything is replicated.
> 
> This series separates tiles from GTs in a manner that more closely
> matches the hardware behavior.  We now consider a PCI device
> (xe_device)
> to contain one or more tiles (struct xe_tile).  Each tile will
> contain
> one or two GTs (struct xe_gt).  Although we don't have any platforms
> yet
> that are multi-tile *and* contain more than one GT per tile, that may
> change in the future.  This driver redesign splits functionality as
> follows:
> 
> Per-tile functionality (shared by all GTs within the tile):
>  - Complete 4MB MMIO space (containing SGunit/SoC registers, GT
>    registers, display registers, etc.)
>  - Global GTT
>  - VRAM (if discrete)
>  - Interrupt flows
>  - Migration context
>  - kernel batchbuffer pool
>  - Primary GT
>  - Media GT (if media version >= 13)
> 
> Per-GT functionality:
>  - GuC
>  - Hardware engines
>  - Programmable hardware units (subslices, EUs)
>  - GSI subset of registers (multiple copies of these registers reside
>    within the complete MMIO space provided by the tile, but at
> different
>    offsets --- 0 for render, 0x380000 for media)
>  - Multicast register steering
>  - TLBs to cache page table translations
>  - Reset capability
>  - Low-level power management (e.g., C6)
>  - Clock frequency
>  - MOCS and PAT programming
> 

With that detailed cover-letter description, I think this makes sense.

I figure pagetables will need to be per tile with this splitup? What
about per-tile resources, like VRAM, that is accessible from all tiles
but with separate throughput / latencies depending on from which tile
they are accessed? Should those perhaps be per device with a per-tile
pointer to "preferred VRAM" and a map [tile][memory_type] of access
cost?
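
Very roughly, and only to make the idea concrete, I'm thinking of
something along these lines. All names are hypothetical and the cost
values would have to come from somewhere platform-specific:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_TILES 2

/* Hypothetical memory types: system memory plus one VRAM region per tile. */
enum sketch_mem_type {
	SKETCH_MEM_SYSTEM,
	SKETCH_MEM_VRAM0,
	SKETCH_MEM_VRAM1,
	SKETCH_MEM_COUNT,
};

/* Device-level bookkeeping instead of per-tile VRAM ownership. */
struct sketch_device {
	/* map [tile][memory_type] of relative access cost; lower is cheaper */
	unsigned int access_cost[SKETCH_MAX_TILES][SKETCH_MEM_COUNT];
	/* per-tile "preferred VRAM" */
	enum sketch_mem_type preferred_vram[SKETCH_MAX_TILES];
};

static unsigned int sketch_access_cost(const struct sketch_device *dev,
				       int tile, int mem)
{
	assert(tile >= 0 && tile < SKETCH_MAX_TILES);
	assert(mem >= 0 && mem < SKETCH_MEM_COUNT);

	return dev->access_cost[tile][mem];
}

/* For each tile, pick the VRAM region that is cheapest to access from it. */
static void sketch_pick_preferred_vram(struct sketch_device *dev)
{
	for (int tile = 0; tile < SKETCH_MAX_TILES; tile++) {
		int best = SKETCH_MEM_VRAM0;

		for (int mem = SKETCH_MEM_VRAM0; mem < SKETCH_MEM_COUNT; mem++)
			if (sketch_access_cost(dev, tile, mem) <
			    sketch_access_cost(dev, tile, best))
				best = mem;

		dev->preferred_vram[tile] = (enum sketch_mem_type)best;
	}
}
```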

/Thomas




* Re: [Intel-xe] [PATCH 00/26] Separate GT and tile
  2023-05-15 13:08 ` [Intel-xe] [PATCH 00/26] Separate GT and tile Thomas Hellström
@ 2023-05-15 18:11   ` Matt Roper
  0 siblings, 0 replies; 75+ messages in thread
From: Matt Roper @ 2023-05-15 18:11 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: Nirmoy Das, Lucas De Marchi, intel-xe, Rodrigo Vivi

On Mon, May 15, 2023 at 03:08:24PM +0200, Thomas Hellström wrote:
> Hi, Matt,
> 
> On Wed, 2023-05-10 at 20:46 -0700, Matt Roper wrote:
> > A 'tile' is not the same thing as a 'GT.'  For historical reasons,
> > i915
> > attempted to use a single 'struct intel_gt' to represent both
> > concepts,
> > although this design hasn't worked out terribly well.  For Xe we have
> > the opportunity to design the driver in a way that more accurately
> > reflects the real hardware behavior.
> > 
> > Different vendors use the term "tile" a bit differently, but in the
> > Intel world, a 'tile' is pretty close to what most people would think
> > of
> > as being a complete GPU.  When multiple GPUs are placed behind a
> > single
> > PCI device, that's what we refer to as a "multi-tile device."  In
> > such
> > cases, pretty much all hardware is replicated per-tile, although
> > certain
> > responsibilities like PCI communication, reporting of interrupts to
> > the
> > OS, etc. are handled solely by the "root tile."  A multi-tile
> > platform
> > takes care of tying the tiles together in a way such that interrupt
> > notifications from remote tiles are forwarded to the root tile, the
> > per-tile vram is combined into a single address space, etc.
> > 
> > In contrast, a "GT" (which officially stands for "Graphics
> > Technology")
> > is the subset of a GPU/tile that is responsible for implementing
> > graphics and/or media operations.  The GT is where a lot of the
> > driver
> > implementation happens since it's where the hardware engines, the
> > execution units, and the GuC all reside.
> > 
> > Historically most Intel devices were single-tile devices that
> > contained
> > a single GT.  PVC is currently the only released Intel platform built
> > on
> > a multi-tile design (i.e., multiple GPUs behind a single PCI device);
> > each PVC tile only has a single GT.  In contrast, platforms like MTL
> > that have separate chips for render and media IP are still only a
> > single
> > logical GPU, but the graphics and media IP blocks are exposed each
> > exposed as a separate GT within that single GPU.  This is important
> > from
> > a software perspective because multi-GT platforms like MTL only
> > replicate a subset of the GPU hardware and behave differently than
> > multi-tile platforms like PVC where nearly everything is replicated.
> > 
> > This series separates tiles from GTs in a manner that more closely
> > matches the hardware behavior.  We now consider a PCI device
> > (xe_device)
> > to contain one or more tiles (struct xe_tile).  Each tile will
> > contain
> > one or two GTs (struct xe_gt).  Although we don't have any platforms
> > yet
> > that are multi-tile *and* contain more than one GT per tile, that may
> > change in the future.  This driver redesign splits functionality as
> > follows:
> > 
> > Per-tile functionality (shared by all GTs within the tile):
> >  - Complete 4MB MMIO space (containing SGunit/SoC registers, GT
> >    registers, display registers, etc.)
> >  - Global GTT
> >  - VRAM (if discrete)
> >  - Interrupt flows
> >  - Migration context
> >  - kernel batchbuffer pool
> >  - Primary GT
> >  - Media GT (if media version >= 13)
> > 
> > Per-GT functionality:
> >  - GuC
> >  - Hardware engines
> >  - Programmable hardware units (subslices, EUs)
> >  - GSI subset of registers (multiple copies of these registers reside
> >    within the complete MMIO space provided by the tile, but at
> > different
> >    offsets --- 0 for render, 0x380000 for media)
> >  - Multicast register steering
> >  - TLBs to cache page table translations
> >  - Reset capability
> >  - Low-level power management (e.g., C6)
> >  - Clock frequency
> >  - MOCS and PAT programming
> > 
> 
> With that detailed cover-letter description, I think this makes sense.
> 
> I figure pagetables will need to be per tile with this splitup? What

Yeah, the GGTT moves into the xe_tile in this series.  I thought I had a
specific patch for that, but it looks like I might have accidentally
squashed it into the VRAM patch; I should separate those back out into
two separate patches in the next series revision.

> about per-tile resources, like VRAM, that is accessible from all tiles
> but with separate throughput / latencies depending on from which tile
> they are accessed? Should those perhaps be per device with a per-tile
> pointer to "preferred VRAM" and a map [tile][memory_type] of access
> cost?

I kept the VRAM inside the tile in this series, but we could definitely
promote it up to the device level if we think that makes sense (e.g., if
we suspect that future platforms might not have a 1:1 relationship
between GPU/tile and VRAM).  That would probably be worth doing as a
follow-up series though; since vram was already inside the xe_gt, moving
it to the xe_tile (which matches the reality of how PVC works) seemed
like the natural first step.


Matt

> 
> /Thomas
> 
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


* Re: [Intel-xe] [PATCH 06/26] drm/xe: Move VRAM from GT to tile
  2023-05-11  3:47 ` [Intel-xe] [PATCH 06/26] drm/xe: Move VRAM from GT to tile Matt Roper
@ 2023-05-15 22:40   ` Lucas De Marchi
  2023-05-18 17:29     ` Rodrigo Vivi
  0 siblings, 1 reply; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-15 22:40 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:02PM -0700, Matt Roper wrote:
>On platforms with VRAM, the VRAM is associated with the tile, not the
>GT.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/xe/Makefile                   |  1 +
> drivers/gpu/drm/xe/display/xe_fb_pin.c        |  6 +-
> drivers/gpu/drm/xe/display/xe_plane_initial.c |  8 +-

I'm not sure about the best way to handle the display. On my refactors
I've been leaving those changes in a separate patch, even if it breaks
the build. The reason is that when the rebase happens and display is
moved up, we don't risk losing these hunks.

>diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>index 2481b2045284..6b9e7847161c 100644
>--- a/drivers/gpu/drm/xe/xe_device_types.h
>+++ b/drivers/gpu/drm/xe/xe_device_types.h
>@@ -53,6 +53,8 @@
> 		 const struct xe_tile *: (const struct xe_device *)((tile__)->xe),	\
> 		 struct xe_tile *: (tile__)->xe)
>
>+struct xe_ggtt;
>+
> /**
>  * struct xe_tile - hardware tile structure
>  *
>@@ -96,6 +98,40 @@ struct xe_tile {
> 		/** @regs: pointer to tile's MMIO space (starting with registers) */
> 		void *regs;
> 	} mmio;
>+
>+	/** @mem: memory management info for tile */
>+	struct {
>+		/**
>+		 * @vram: VRAM info for tile.
>+		 *
>+		 * Although VRAM is associated with a specific tile, it can
>+		 * still be accessed by all tiles' GTs.
>+		 */
>+		struct {
>+			/** @io_start: IO start address of this VRAM instance */
>+			resource_size_t io_start;
>+			/**
>+			 * @io_size: IO size of this VRAM instance
>+			 *
>+			 * This represents how much of this VRAM we can access
>+			 * via the CPU through the VRAM BAR. This can be smaller
>+			 * than @size, in which case only part of VRAM is CPU
>+			 * accessible (typically the first 256M). This
>+			 * configuration is known as small-bar.
>+			 */
>+			resource_size_t io_size;
>+			/** @size: size of VRAM. */
>+			resource_size_t size;
>+			/** @mapping: pointer to VRAM mappable space */
>+			void *__iomem mapping;
>+		} vram;
>+
>+		/** @vram_mgr: VRAM TTM manager */
>+		struct xe_ttm_vram_mgr *vram_mgr;
>+
>+		/** @ggtt: Global graphics translation table */
>+		struct xe_ggtt *ggtt;

I guess the ggtt should be moved to a separate patch?

other than that it seems good to me, but since it has a lot of
mechanical changes and no CI results, it's hard to judge correctness.
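
As an aside, the small-bar case described in the @io_size comment above
can be illustrated with a short standalone sketch. The struct and helper
names are hypothetical, and the assert encodes my assumption that
io_size never exceeds size:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the tile->mem.vram fields quoted above. */
struct sketch_vram {
	uint64_t io_size;	/* portion reachable by the CPU via the VRAM BAR */
	uint64_t size;		/* total VRAM size */
};

/* Small-bar: only part of VRAM is CPU accessible. */
static bool sketch_vram_is_small_bar(const struct sketch_vram *vram)
{
	assert(vram->io_size <= vram->size);

	return vram->io_size < vram->size;
}

/* Is the range [offset, offset + len) fully within the CPU-visible part? */
static bool sketch_vram_cpu_accessible(const struct sketch_vram *vram,
				       uint64_t offset, uint64_t len)
{
	/* Written this way so that offset + len cannot overflow. */
	return len <= vram->io_size && offset <= vram->io_size - len;
}
```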


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>


Lucas De Marchi


* Re: [Intel-xe] [PATCH 00/26] Separate GT and tile
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (32 preceding siblings ...)
  2023-05-15 13:08 ` [Intel-xe] [PATCH 00/26] Separate GT and tile Thomas Hellström
@ 2023-05-16 14:18 ` Das, Nirmoy
  2023-05-18 17:47 ` Rodrigo Vivi
  34 siblings, 0 replies; 75+ messages in thread
From: Das, Nirmoy @ 2023-05-16 14:18 UTC (permalink / raw)
  To: Matt Roper, intel-xe; +Cc: Nirmoy Das, Lucas De Marchi, Rodrigo Vivi


On 5/11/2023 5:46 AM, Matt Roper wrote:
> A 'tile' is not the same thing as a 'GT.'  For historical reasons, i915
> attempted to use a single 'struct intel_gt' to represent both concepts,
> although this design hasn't worked out terribly well.  For Xe we have
> the opportunity to design the driver in a way that more accurately
> reflects the real hardware behavior.
>
> Different vendors use the term "tile" a bit differently, but in the
> Intel world, a 'tile' is pretty close to what most people would think of
> as being a complete GPU.  When multiple GPUs are placed behind a single
> PCI device, that's what we refer to as a "multi-tile device."  In such
> cases, pretty much all hardware is replicated per-tile, although certain
> responsibilities like PCI communication, reporting of interrupts to the
> OS, etc. are handled solely by the "root tile."  A multi-tile platform
> takes care of tying the tiles together in a way such that interrupt
> notifications from remote tiles are forwarded to the root tile, the
> per-tile vram is combined into a single address space, etc.
>
> In contrast, a "GT" (which officially stands for "Graphics Technology")
> is the subset of a GPU/tile that is responsible for implementing
> graphics and/or media operations.  The GT is where a lot of the driver
> implementation happens since it's where the hardware engines, the
> execution units, and the GuC all reside.
>
> Historically most Intel devices were single-tile devices that contained
> a single GT.  PVC is currently the only released Intel platform built on
> a multi-tile design (i.e., multiple GPUs behind a single PCI device);
> each PVC tile only has a single GT.  In contrast, platforms like MTL
> that have separate chips for render and media IP are still only a single
> logical GPU, but the graphics and media IP blocks are exposed each
> exposed as a separate GT within that single GPU.  This is important from
> a software perspective because multi-GT platforms like MTL only
> replicate a subset of the GPU hardware and behave differently than
> multi-tile platforms like PVC where nearly everything is replicated.
>
> This series separates tiles from GTs in a manner that more closely
> matches the hardware behavior.  We now consider a PCI device (xe_device)
> to contain one or more tiles (struct xe_tile).  Each tile will contain
> one or two GTs (struct xe_gt).  Although we don't have any platforms yet
> that are multi-tile *and* contain more than one GT per tile, that may
> change in the future.  This driver redesign splits functionality as
> follows:
>
> Per-tile functionality (shared by all GTs within the tile):
>   - Complete 4MB MMIO space (containing SGunit/SoC registers, GT
>     registers, display registers, etc.)
>   - Global GTT
>   - VRAM (if discrete)
>   - Interrupt flows
>   - Migration context
>   - kernel batchbuffer pool
>   - Primary GT
>   - Media GT (if media version >= 13)
>
> Per-GT functionality:
>   - GuC
>   - Hardware engines
>   - Programmable hardware units (subslices, EUs)
>   - GSI subset of registers (multiple copies of these registers reside
>     within the complete MMIO space provided by the tile, but at different
>     offsets --- 0 for render, 0x380000 for media)
>   - Multicast register steering
>   - TLBs to cache page table translations
>   - Reset capability
>   - Low-level power management (e.g., C6)
>   - Clock frequency
>   - MOCS and PAT programming
>
> At the moment I've left USM / pagefault handling at the GT level,
> although I'm not familiar enough with that specific feature to know
> whether it's truly correct or not.
>
> The first patch in this series temporarily drops MTL media GT support.
> The driver doesn't load properly on MTL today, largely due to the
> mishandling of GT vs tile; dropping support completely allows us to more
> easily make the necessary driver redesign required.  The media GT is
> re-enabled (properly this time) near the end of the series and this
> allows the driver to load successfully without error on MTL for the
> first time.  There are still issues when submitting workloads to MTL
> after driver load (i.e., CAT errors), but those seem to be a separate
> platform-specific issues unrelated to the GT/tile work in this series
> that will need to be debugged and fixed separately.
>
>
> This series leaves a few open questions and FIXME's:
>   - Unlike i915, the Xe driver has chosen to expose GTs to userspace
>     rather than keeping them a hidden implementation detail.  With the
>     separation of xe_tile and xe_gt, we need to decide whether we also
>     want to expose tiles (in addition to GTs), whether we want to _only_
>     expose tiles (and keep the primary vs media GT separation a hidden
>     internal detail), or something else.
>   - How should GTs be numbered?  Today it's straightforward --- PVC
>     assigns GT IDs 0 and 1 to the primary GT of each tile.  MTL assigns
>     GT IDs 0 and 1 to the primary and media GTs of its sole tile.  But if
>     we have a platform in the future that has multiple tiles _and_
>     multiple GTs per tile, how should we handle the numbering in that
>     case?
>   - Xe (mis)design used xe_gt as the target of all MMIO operations (i.e.,
>     xe_mmio_*()).  This really doesn't make sense, especially since
>     there's a lot of MMIO accesses that are completely unrelated to GT
>     (i.e., sgunit registers, display registers, etc.).  i915 used
>     'intel_uncore' as the MMIO target, although that wasn't really an
>     accurate reflection of the hardware either.  What we really want is
>     something that combines the MMIO register space (stored in the tile)
>     with the GSI offset (stored in the GT).  My current plan is to
>     introduce an "xe_mmio_view" (name may change) in a future series that
>     will serve as a target for register operations.  There will be
>     sensible APIs to obtain an xe_mmio_view appropriate to the type of
>     register access being performed (and that will also be able to do
>     some range sanity checking in debug drivers to help catch misuse).
>     That's a somewhat large/invasive change, so I'm saving that for a
>     follow-up series after this one is completed.
>
>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
> Cc: Nirmoy Das <nirmoy.das@intel.com>


Tested this on my MTL C0. You might be working on the next revision, but
feel free to add Tested-by: Nirmoy Das <nirmoy.das@intel.com> for the series.

>
>
> Matt Roper (26):
>    drm/xe/mtl: Disable media GT
>    drm/xe: Introduce xe_tile
>    drm/xe: Add backpointer from gt to tile
>    drm/xe: Add for_each_tile iterator
>    drm/xe: Move register MMIO into xe_tile
>    drm/xe: Move VRAM from GT to tile
>    drm/xe: Memory allocations are tile-based, not GT-based
>    drm/xe: Move migration from GT to tile
>    drm/xe: Clarify 'gt' retrieval for primary tile
>    drm/xe: Drop vram_id
>    drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE
>    drm/xe: Allocate GT dynamically
>    drm/xe: Add media GT to tile
>    drm/xe: Move display IRQ postinstall out of GT function
>    drm/xe: Interrupts are delivered per-tile, not per-GT
>    drm/xe/irq: Handle ASLE backlight interrupts at same time as display
>    drm/xe/irq: Actually call xe_irq_postinstall()
>    drm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt
>      mask
>    drm/xe/irq: Untangle postinstall functions
>    drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe
>    drm/xe: Invalidate TLB on all affected GTs during GGTT updates
>    drm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations
>    drm/xe: Allow GT looping and lookup on standalone media
>    drm/xe: Update query uapi to support standalone media
>    drm/xe: Reinstate media GT support
>    drm/xe: Clarify source of GT log messages
>
>   drivers/gpu/drm/i915/display/intel_dsb.c      |   5 +-
>   drivers/gpu/drm/i915/display/intel_fbc.c      |   3 +-
>   drivers/gpu/drm/i915/display/intel_fbdev.c    |   7 +-
>   drivers/gpu/drm/xe/Makefile                   |   1 +
>   .../drm/xe/compat-i915-headers/intel_uncore.h |   2 +-
>   drivers/gpu/drm/xe/display/ext/i915_irq.c     |   2 +-
>   drivers/gpu/drm/xe/display/xe_fb_pin.c        |  13 +-
>   drivers/gpu/drm/xe/display/xe_plane_initial.c |   8 +-
>   drivers/gpu/drm/xe/regs/xe_gt_regs.h          |   8 +
>   drivers/gpu/drm/xe/tests/xe_bo.c              |   8 +-
>   drivers/gpu/drm/xe/tests/xe_migrate.c         |  15 +-
>   drivers/gpu/drm/xe/xe_bb.c                    |   5 +-
>   drivers/gpu/drm/xe/xe_bo.c                    | 104 ++---
>   drivers/gpu/drm/xe/xe_bo.h                    |  20 +-
>   drivers/gpu/drm/xe/xe_bo_evict.c              |  22 +-
>   drivers/gpu/drm/xe/xe_bo_types.h              |   4 +-
>   drivers/gpu/drm/xe/xe_device.c                |  12 +-
>   drivers/gpu/drm/xe/xe_device.h                |  49 ++-
>   drivers/gpu/drm/xe/xe_device_types.h          | 107 ++++-
>   drivers/gpu/drm/xe/xe_engine.c                |   2 +-
>   drivers/gpu/drm/xe/xe_ggtt.c                  |  45 +-
>   drivers/gpu/drm/xe/xe_ggtt.h                  |   6 +-
>   drivers/gpu/drm/xe/xe_ggtt_types.h            |   2 +-
>   drivers/gpu/drm/xe/xe_gt.c                    | 191 ++-------
>   drivers/gpu/drm/xe/xe_gt.h                    |   8 +-
>   drivers/gpu/drm/xe/xe_gt_debugfs.c            |   8 +-
>   drivers/gpu/drm/xe/xe_gt_mcr.c                |   2 +-
>   drivers/gpu/drm/xe/xe_gt_pagefault.c          |  16 +-
>   drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c   |   4 +-
>   drivers/gpu/drm/xe/xe_gt_types.h              |  87 ++--
>   drivers/gpu/drm/xe/xe_guc.c                   |  11 +-
>   drivers/gpu/drm/xe/xe_guc_ads.c               |   5 +-
>   drivers/gpu/drm/xe/xe_guc_ct.c                |   5 +-
>   drivers/gpu/drm/xe/xe_guc_hwconfig.c          |   5 +-
>   drivers/gpu/drm/xe/xe_guc_log.c               |   6 +-
>   drivers/gpu/drm/xe/xe_guc_pc.c                |   5 +-
>   drivers/gpu/drm/xe/xe_hw_engine.c             |   6 +-
>   drivers/gpu/drm/xe/xe_irq.c                   | 393 +++++++++---------
>   drivers/gpu/drm/xe/xe_irq.h                   |   3 +-
>   drivers/gpu/drm/xe/xe_lrc.c                   |  13 +-
>   drivers/gpu/drm/xe/xe_lrc_types.h             |   4 +-
>   drivers/gpu/drm/xe/xe_migrate.c               |  76 ++--
>   drivers/gpu/drm/xe/xe_migrate.h               |   9 +-
>   drivers/gpu/drm/xe/xe_mmio.c                  |  92 ++--
>   drivers/gpu/drm/xe/xe_mmio.h                  |  21 +-
>   drivers/gpu/drm/xe/xe_mocs.c                  |  14 +-
>   drivers/gpu/drm/xe/xe_pci.c                   |  92 ++--
>   drivers/gpu/drm/xe/xe_pt.c                    | 150 ++++---
>   drivers/gpu/drm/xe/xe_pt.h                    |  14 +-
>   drivers/gpu/drm/xe/xe_query.c                 |  32 +-
>   drivers/gpu/drm/xe/xe_res_cursor.h            |   2 +-
>   drivers/gpu/drm/xe/xe_sa.c                    |  13 +-
>   drivers/gpu/drm/xe/xe_sa.h                    |   4 +-
>   drivers/gpu/drm/xe/xe_tile.c                  |  89 ++++
>   drivers/gpu/drm/xe/xe_tile.h                  |  16 +
>   drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c        |   4 +-
>   drivers/gpu/drm/xe/xe_ttm_vram_mgr.c          |  16 +-
>   drivers/gpu/drm/xe/xe_ttm_vram_mgr.h          |   4 +-
>   drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h    |   6 +-
>   drivers/gpu/drm/xe/xe_uc_fw.c                 |   5 +-
>   drivers/gpu/drm/xe/xe_vm.c                    | 156 ++++---
>   drivers/gpu/drm/xe/xe_vm.h                    |   2 +-
>   drivers/gpu/drm/xe/xe_vm_types.h              |  22 +-
>   include/uapi/drm/xe_drm.h                     |   4 +-
>   64 files changed, 1108 insertions(+), 957 deletions(-)
>   create mode 100644 drivers/gpu/drm/xe/xe_tile.c
>   create mode 100644 drivers/gpu/drm/xe/xe_tile.h
>


* Re: [Intel-xe] [PATCH 07/26] drm/xe: Memory allocations are tile-based, not GT-based
  2023-05-11  3:47 ` [Intel-xe] [PATCH 07/26] drm/xe: Memory allocations are tile-based, not GT-based Matt Roper
@ 2023-05-17  4:56   ` Lucas De Marchi
  0 siblings, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-17  4:56 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:03PM -0700, Matt Roper wrote:
>Since memory and address spaces are a tile concept rather than a GT
>concept, we need to plumb tile-based handling through lots of
>memory-related code.
>
>Note that one remaining shortcoming here that will need to be addressed
>before media GT support can be re-enabled is that although the address
>space is shared between a tile's GTs, each GT caches the PTEs
>independently in their own TLB and thus TLB invalidation should be
>handled at the GT level.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/i915/display/intel_dsb.c      |   5 +-
> drivers/gpu/drm/i915/display/intel_fbc.c      |   3 +-
> drivers/gpu/drm/i915/display/intel_fbdev.c    |   7 +-
> drivers/gpu/drm/xe/display/xe_fb_pin.c        |   7 +-
> drivers/gpu/drm/xe/display/xe_plane_initial.c |   2 +-
> drivers/gpu/drm/xe/tests/xe_bo.c              |   2 +-
> drivers/gpu/drm/xe/tests/xe_migrate.c         |  15 +-
> drivers/gpu/drm/xe/xe_bb.c                    |   3 +-
> drivers/gpu/drm/xe/xe_bo.c                    |  66 ++++----
> drivers/gpu/drm/xe/xe_bo.h                    |  18 +--
> drivers/gpu/drm/xe/xe_bo_evict.c              |   2 +-
> drivers/gpu/drm/xe/xe_bo_types.h              |   4 +-
> drivers/gpu/drm/xe/xe_device_types.h          |   7 +
> drivers/gpu/drm/xe/xe_ggtt.c                  |   5 +-
> drivers/gpu/drm/xe/xe_gt.c                    |  21 +--
> drivers/gpu/drm/xe/xe_gt_debugfs.c            |   6 +-
> drivers/gpu/drm/xe/xe_gt_pagefault.c          |  10 +-
> drivers/gpu/drm/xe/xe_gt_types.h              |   7 -
> drivers/gpu/drm/xe/xe_guc_ads.c               |   5 +-
> drivers/gpu/drm/xe/xe_guc_ct.c                |   5 +-
> drivers/gpu/drm/xe/xe_guc_hwconfig.c          |   5 +-
> drivers/gpu/drm/xe/xe_guc_log.c               |   6 +-
> drivers/gpu/drm/xe/xe_guc_pc.c                |   5 +-
> drivers/gpu/drm/xe/xe_hw_engine.c             |   5 +-
> drivers/gpu/drm/xe/xe_lrc.c                   |  13 +-
> drivers/gpu/drm/xe/xe_lrc_types.h             |   4 +-
> drivers/gpu/drm/xe/xe_migrate.c               |  23 +--
> drivers/gpu/drm/xe/xe_migrate.h               |   5 +-
> drivers/gpu/drm/xe/xe_pt.c                    | 146 ++++++++---------
> drivers/gpu/drm/xe/xe_pt.h                    |  14 +-
> drivers/gpu/drm/xe/xe_sa.c                    |  13 +-
> drivers/gpu/drm/xe/xe_sa.h                    |   4 +-
> drivers/gpu/drm/xe/xe_tile.c                  |   7 +
> drivers/gpu/drm/xe/xe_uc_fw.c                 |   5 +-
> drivers/gpu/drm/xe/xe_vm.c                    | 152 +++++++++---------
> drivers/gpu/drm/xe/xe_vm.h                    |   2 +-
> drivers/gpu/drm/xe/xe_vm_types.h              |  12 +-
> include/uapi/drm/xe_drm.h                     |   4 +-
> 38 files changed, 307 insertions(+), 318 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c
>index 7c93580282b4..3830309aacf4 100644
>--- a/drivers/gpu/drm/i915/display/intel_dsb.c
>+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
>@@ -379,9 +379,10 @@ struct intel_dsb *intel_dsb_prepare(struct intel_crtc *crtc,
> #else
> 	/* ~1 qword per instruction, full cachelines */
> 	size = ALIGN(max_cmds * 8, 64);
>-	obj = xe_bo_create_pin_map(i915, to_gt(i915), NULL, PAGE_ALIGN(size),
>+	obj = xe_bo_create_pin_map(i915, xe_device_get_root_tile(i915),
>+				   NULL, PAGE_ALIGN(size),
> 				   ttm_bo_type_kernel,
>-				   XE_BO_CREATE_VRAM_IF_DGFX(to_gt(i915)) |
>+				   XE_BO_CREATE_VRAM_IF_DGFX(xe_device_get_root_tile(i915)) |
> 				   XE_BO_CREATE_GGTT_BIT);
> 	if (IS_ERR(obj)) {
> 		kfree(dsb);
>diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c b/drivers/gpu/drm/i915/display/intel_fbc.c
>index 9dc7083fe974..0e8e899f596b 100644
>--- a/drivers/gpu/drm/i915/display/intel_fbc.c
>+++ b/drivers/gpu/drm/i915/display/intel_fbc.c
>@@ -71,7 +71,8 @@ static int i915_gem_stolen_insert_node_in_range(struct xe_device *xe, struct xe_
> 	int err;
> 	u32 flags = XE_BO_CREATE_PINNED_BIT | XE_BO_CREATE_STOLEN_BIT;
>
>-	*bo = xe_bo_create_locked_range(xe, to_gt(xe), NULL, size, start, end,
>+	*bo = xe_bo_create_locked_range(xe, xe_device_get_root_tile(xe),
>+					NULL, size, start, end,
> 					ttm_bo_type_kernel, flags);
> 	if (IS_ERR(*bo)) {
> 		err = PTR_ERR(*bo);
>diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
>index 6362c4ce15b6..814b89b99718 100644
>--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
>+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
>@@ -205,7 +205,8 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
> 	}
> #else
> 	if (!IS_DGFX(dev_priv)) {
>-		obj = xe_bo_create_pin_map(dev_priv, to_gt(dev_priv), NULL, size,
>+		obj = xe_bo_create_pin_map(dev_priv, xe_device_get_root_tile(dev_priv),
>+					   NULL, size,
> 					   ttm_bo_type_kernel, XE_BO_SCANOUT_BIT |
> 					   XE_BO_CREATE_STOLEN_BIT |
> 					   XE_BO_CREATE_PINNED_BIT);
>@@ -215,9 +216,9 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
> 			drm_info(&dev_priv->drm, "Allocated fbdev into stolen failed: %li\n", PTR_ERR(obj));
> 	}
> 	if (IS_ERR(obj)) {
>-		obj = xe_bo_create_pin_map(dev_priv, to_gt(dev_priv), NULL, size,
>+		obj = xe_bo_create_pin_map(dev_priv, xe_device_get_root_tile(dev_priv), NULL, size,
> 					  ttm_bo_type_kernel, XE_BO_SCANOUT_BIT |
>-					  XE_BO_CREATE_VRAM_IF_DGFX(to_gt(dev_priv)) |
>+					  XE_BO_CREATE_VRAM_IF_DGFX(xe_device_get_root_tile(dev_priv)) |
> 					  XE_BO_CREATE_PINNED_BIT);
> 	}
> #endif
>diff --git a/drivers/gpu/drm/xe/display/xe_fb_pin.c b/drivers/gpu/drm/xe/display/xe_fb_pin.c
>index 78ac58244f24..e5999a01daa1 100644
>--- a/drivers/gpu/drm/xe/display/xe_fb_pin.c
>+++ b/drivers/gpu/drm/xe/display/xe_fb_pin.c
>@@ -45,6 +45,7 @@ static int __xe_pin_fb_vma_dpt(struct intel_framebuffer *fb,
> 			       struct i915_vma *vma)
> {
> 	struct xe_device *xe = to_xe_device(fb->base.dev);
>+	struct xe_tile *tile0 = xe_device_get_root_tile(xe);
> 	struct xe_bo *bo = intel_fb_obj(&fb->base), *dpt;
> 	u32 dpt_size, size = bo->ttm.base.size;
>
>@@ -55,17 +56,17 @@ static int __xe_pin_fb_vma_dpt(struct intel_framebuffer *fb,
> 		dpt_size = ALIGN(intel_rotation_info_size(&view->rotated) * 8,
> 				 XE_PAGE_SIZE);
>
>-	dpt = xe_bo_create_pin_map(xe, to_gt(xe), NULL, dpt_size,
>+	dpt = xe_bo_create_pin_map(xe, tile0, NULL, dpt_size,
> 				  ttm_bo_type_kernel,
> 				  XE_BO_CREATE_VRAM0_BIT |
> 				  XE_BO_CREATE_GGTT_BIT);
> 	if (IS_ERR(dpt))
>-		dpt = xe_bo_create_pin_map(xe, to_gt(xe), NULL, dpt_size,
>+		dpt = xe_bo_create_pin_map(xe, tile0, NULL, dpt_size,
> 					   ttm_bo_type_kernel,
> 					   XE_BO_CREATE_STOLEN_BIT |
> 					   XE_BO_CREATE_GGTT_BIT);
> 	if (IS_ERR(dpt))
>-		dpt = xe_bo_create_pin_map(xe, to_gt(xe), NULL, dpt_size,
>+		dpt = xe_bo_create_pin_map(xe, tile0, NULL, dpt_size,
> 					   ttm_bo_type_kernel,
> 					   XE_BO_CREATE_SYSTEM_BIT |
> 					   XE_BO_CREATE_GGTT_BIT);
>diff --git a/drivers/gpu/drm/xe/display/xe_plane_initial.c b/drivers/gpu/drm/xe/display/xe_plane_initial.c
>index 556ede2e459e..5e43ae9f9c4b 100644
>--- a/drivers/gpu/drm/xe/display/xe_plane_initial.c
>+++ b/drivers/gpu/drm/xe/display/xe_plane_initial.c
>@@ -115,7 +115,7 @@ initial_plane_bo(struct xe_device *xe,
> 			page_size);
> 	size -= base;
>
>-	bo = xe_bo_create_pin_map_at(xe, &tile0->primary_gt, NULL, size, phys_base,
>+	bo = xe_bo_create_pin_map_at(xe, tile0, NULL, size, phys_base,
> 				     ttm_bo_type_kernel, flags);
> 	if (IS_ERR(bo)) {
> 		drm_dbg(&xe->drm,
>diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c
>index 9bd381e5b7a6..bee5a2031153 100644
>--- a/drivers/gpu/drm/xe/tests/xe_bo.c
>+++ b/drivers/gpu/drm/xe/tests/xe_bo.c
>@@ -173,7 +173,7 @@ static int evict_test_run_gt(struct xe_device *xe, struct xe_gt *gt, struct kuni
> {
> 	struct xe_bo *bo, *external;
> 	unsigned int bo_flags = XE_BO_CREATE_USER_BIT |
>-		XE_BO_CREATE_VRAM_IF_DGFX(gt);
>+		XE_BO_CREATE_VRAM_IF_DGFX(gt_to_tile(gt));
> 	struct xe_vm *vm = xe_migrate_get_vm(xe->gt[0].migrate);
> 	struct ww_acquire_ctx ww;
> 	int err, i;
>diff --git a/drivers/gpu/drm/xe/tests/xe_migrate.c b/drivers/gpu/drm/xe/tests/xe_migrate.c
>index 0f4371ad1fd9..fe8331f116c2 100644
>--- a/drivers/gpu/drm/xe/tests/xe_migrate.c
>+++ b/drivers/gpu/drm/xe/tests/xe_migrate.c
>@@ -240,6 +240,7 @@ static void test_pt_update(struct xe_migrate *m, struct xe_bo *pt,
> static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test)
> {
> 	struct xe_gt *gt = m->gt;
>+	struct xe_tile *tile = gt_to_tile(m->gt);
> 	struct xe_device *xe = gt_to_xe(gt);
> 	struct xe_bo *pt, *bo = m->pt_bo, *big, *tiny;
> 	struct xe_res_cursor src_it;
>@@ -256,18 +257,18 @@ static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test)
> 		return;
> 	}
>
>-	big = xe_bo_create_pin_map(xe, m->gt, m->eng->vm, SZ_4M,
>+	big = xe_bo_create_pin_map(xe, tile, m->eng->vm, SZ_4M,
> 				   ttm_bo_type_kernel,
>-				   XE_BO_CREATE_VRAM_IF_DGFX(m->gt) |
>+				   XE_BO_CREATE_VRAM_IF_DGFX(tile) |
> 				   XE_BO_CREATE_PINNED_BIT);
> 	if (IS_ERR(big)) {
> 		KUNIT_FAIL(test, "Failed to allocate bo: %li\n", PTR_ERR(big));
> 		goto vunmap;
> 	}
>
>-	pt = xe_bo_create_pin_map(xe, m->gt, m->eng->vm, XE_PAGE_SIZE,
>+	pt = xe_bo_create_pin_map(xe, tile, m->eng->vm, XE_PAGE_SIZE,
> 				  ttm_bo_type_kernel,
>-				  XE_BO_CREATE_VRAM_IF_DGFX(m->gt) |
>+				  XE_BO_CREATE_VRAM_IF_DGFX(tile) |
> 				  XE_BO_CREATE_PINNED_BIT);
> 	if (IS_ERR(pt)) {
> 		KUNIT_FAIL(test, "Failed to allocate fake pt: %li\n",
>@@ -275,10 +276,10 @@ static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test)
> 		goto free_big;
> 	}
>
>-	tiny = xe_bo_create_pin_map(xe, m->gt, m->eng->vm,
>+	tiny = xe_bo_create_pin_map(xe, tile, m->eng->vm,
> 				    2 * SZ_4K,
> 				    ttm_bo_type_kernel,
>-				    XE_BO_CREATE_VRAM_IF_DGFX(m->gt) |
>+				    XE_BO_CREATE_VRAM_IF_DGFX(tile) |
> 				    XE_BO_CREATE_PINNED_BIT);
> 	if (IS_ERR(tiny)) {
> 		KUNIT_FAIL(test, "Failed to allocate fake pt: %li\n",
>@@ -286,7 +287,7 @@ static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test)
> 		goto free_pt;
> 	}
>
>-	bb = xe_bb_new(m->gt, 32, xe->info.supports_usm);
>+	bb = xe_bb_new(gt, 32, xe->info.supports_usm);
> 	if (IS_ERR(bb)) {
> 		KUNIT_FAIL(test, "Failed to create batchbuffer: %li\n",
> 			   PTR_ERR(bb));
>diff --git a/drivers/gpu/drm/xe/xe_bb.c b/drivers/gpu/drm/xe/xe_bb.c
>index bf7c94b769d7..f9b6b7adf99f 100644
>--- a/drivers/gpu/drm/xe/xe_bb.c
>+++ b/drivers/gpu/drm/xe/xe_bb.c
>@@ -30,6 +30,7 @@ static int bb_prefetch(struct xe_gt *gt)
>
> struct xe_bb *xe_bb_new(struct xe_gt *gt, u32 dwords, bool usm)
> {
>+	struct xe_tile *tile = gt_to_tile(gt);
> 	struct xe_bb *bb = kmalloc(sizeof(*bb), GFP_KERNEL);
> 	int err;
>
>@@ -42,7 +43,7 @@ struct xe_bb *xe_bb_new(struct xe_gt *gt, u32 dwords, bool usm)
> 	 * space to accomodate the platform-specific hardware prefetch
> 	 * requirements.
> 	 */
>-	bb->bo = xe_sa_bo_new(!usm ? gt->kernel_bb_pool : gt->usm.bb_pool,
>+	bb->bo = xe_sa_bo_new(!usm ? tile->mem.kernel_bb_pool : gt->usm.bb_pool,
> 			      4 * (dwords + 1) + bb_prefetch(gt));
> 	if (IS_ERR(bb->bo)) {
> 		err = PTR_ERR(bb->bo);
>diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
>index 5dbca5bbca8f..9d613fc5d309 100644
>--- a/drivers/gpu/drm/xe/xe_bo.c
>+++ b/drivers/gpu/drm/xe/xe_bo.c
>@@ -452,7 +452,7 @@ static int xe_bo_trigger_rebind(struct xe_device *xe, struct xe_bo *bo,
> 			}
>
> 			xe_vm_assert_held(vm);
>-			if (list_empty(&vma->rebind_link) && vma->gt_present)
>+			if (list_empty(&vma->rebind_link) && vma->tile_present)
> 				list_add_tail(&vma->rebind_link, &vm->rebind_list);
>
> 			if (vm_resv_locked)
>@@ -559,7 +559,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
> 	struct xe_bo *bo = ttm_to_xe_bo(ttm_bo);
> 	struct ttm_resource *old_mem = ttm_bo->resource;
> 	struct ttm_tt *ttm = ttm_bo->ttm;
>-	struct xe_gt *gt = NULL;
>+	struct xe_tile *tile = NULL;
> 	struct dma_fence *fence;
> 	bool move_lacks_source;
> 	bool needs_clear;
>@@ -629,15 +629,15 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
> 		goto out;
> 	}
>
>-	if (bo->gt)
>-		gt = bo->gt;
>+	if (bo->tile)
>+		tile = bo->tile;
> 	else if (resource_is_vram(new_mem))
>-		gt = &mem_type_to_tile(xe, new_mem->mem_type)->primary_gt;
>+		tile = mem_type_to_tile(xe, new_mem->mem_type);
> 	else if (resource_is_vram(old_mem))
>-		gt = &mem_type_to_tile(xe, old_mem->mem_type)->primary_gt;
>+		tile = mem_type_to_tile(xe, old_mem->mem_type);
>
>-	XE_BUG_ON(!gt);
>-	XE_BUG_ON(!gt->migrate);
>+	XE_BUG_ON(!tile);
>+	XE_BUG_ON(!tile->primary_gt.migrate);
>
> 	trace_xe_bo_move(bo);
> 	xe_device_mem_access_get(xe);
>@@ -658,7 +658,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
>
> 			/* Create a new VMAP once kernel BO back in VRAM */
> 			if (!ret && resource_is_vram(new_mem)) {
>-				void *new_addr = gt_to_tile(gt)->mem.vram.mapping +
>+				void *new_addr = tile->mem.vram.mapping +
> 					(new_mem->start << PAGE_SHIFT);
>
> 				if (XE_WARN_ON(new_mem->start == XE_BO_INVALID_OFFSET)) {
>@@ -675,9 +675,9 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
> 		}
> 	} else {
> 		if (move_lacks_source)
>-			fence = xe_migrate_clear(gt->migrate, bo, new_mem);
>+			fence = xe_migrate_clear(tile->primary_gt.migrate, bo, new_mem);
> 		else
>-			fence = xe_migrate_copy(gt->migrate, bo, old_mem, new_mem);
>+			fence = xe_migrate_copy(tile->primary_gt.migrate, bo, old_mem, new_mem);
> 		if (IS_ERR(fence)) {
> 			ret = PTR_ERR(fence);
> 			xe_device_mem_access_put(xe);
>@@ -958,7 +958,7 @@ static void xe_ttm_bo_destroy(struct ttm_buffer_object *ttm_bo)
> 	WARN_ON(!list_empty(&bo->vmas));
>
> 	if (bo->ggtt_node.size)
>-		xe_ggtt_remove_bo(gt_to_tile(bo->gt)->mem.ggtt, bo);
>+		xe_ggtt_remove_bo(bo->tile->mem.ggtt, bo);
>
> 	if (bo->vm && xe_bo_is_user(bo))
> 		xe_vm_put(bo->vm);
>@@ -1080,7 +1080,7 @@ void xe_bo_free(struct xe_bo *bo)
> }
>
> struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
>-				    struct xe_gt *gt, struct dma_resv *resv,
>+				    struct xe_tile *tile, struct dma_resv *resv,
> 				    size_t size, enum ttm_bo_type type,
> 				    u32 flags)
> {
>@@ -1093,7 +1093,7 @@ struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
> 	int err;
>
> 	/* Only kernel objects should set GT */
>-	XE_BUG_ON(gt && type != ttm_bo_type_kernel);
>+	XE_BUG_ON(tile && type != ttm_bo_type_kernel);
>
> 	if (XE_WARN_ON(!size))
> 		return ERR_PTR(-EINVAL);
>@@ -1114,7 +1114,7 @@ struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
> 		alignment = SZ_4K >> PAGE_SHIFT;
> 	}
>
>-	bo->gt = gt;
>+	bo->tile = tile;
> 	bo->size = size;
> 	bo->flags = flags;
> 	bo->ttm.base.funcs = &xe_gem_object_funcs;
>@@ -1196,7 +1196,7 @@ static int __xe_bo_fixed_placement(struct xe_device *xe,
>
> struct xe_bo *
> xe_bo_create_locked_range(struct xe_device *xe,
>-			  struct xe_gt *gt, struct xe_vm *vm,
>+			  struct xe_tile *tile, struct xe_vm *vm,
> 			  size_t size, u64 start, u64 end,
> 			  enum ttm_bo_type type, u32 flags)
> {
>@@ -1219,7 +1219,7 @@ xe_bo_create_locked_range(struct xe_device *xe,
> 		}
> 	}
>
>-	bo = __xe_bo_create_locked(xe, bo, gt, vm ? &vm->resv : NULL, size,
>+	bo = __xe_bo_create_locked(xe, bo, tile, vm ? &vm->resv : NULL, size,
> 				   type, flags);
> 	if (IS_ERR(bo))
> 		return bo;
>@@ -1229,16 +1229,16 @@ xe_bo_create_locked_range(struct xe_device *xe,
> 	bo->vm = vm;
>
> 	if (bo->flags & XE_BO_CREATE_GGTT_BIT) {
>-		if (!gt && flags & XE_BO_CREATE_STOLEN_BIT)
>-			gt = xe_device_get_gt(xe, 0);
>+		if (!tile && flags & XE_BO_CREATE_STOLEN_BIT)
>+			tile = xe_device_get_root_tile(xe);
>
>-		XE_BUG_ON(!gt);
>+		XE_BUG_ON(!tile);
>
> 		if (flags & XE_BO_FIXED_PLACEMENT_BIT) {
>-			err = xe_ggtt_insert_bo_at(gt_to_tile(gt)->mem.ggtt, bo,
>+			err = xe_ggtt_insert_bo_at(tile->mem.ggtt, bo,
> 						   start + bo->size, U64_MAX);
> 		} else {
>-			err = xe_ggtt_insert_bo(gt_to_tile(gt)->mem.ggtt, bo);
>+			err = xe_ggtt_insert_bo(tile->mem.ggtt, bo);
> 		}
> 		if (err)
> 			goto err_unlock_put_bo;
>@@ -1252,18 +1252,18 @@ xe_bo_create_locked_range(struct xe_device *xe,
> 	return ERR_PTR(err);
> }
>
>-struct xe_bo *xe_bo_create_locked(struct xe_device *xe, struct xe_gt *gt,
>+struct xe_bo *xe_bo_create_locked(struct xe_device *xe, struct xe_tile *tile,
> 				  struct xe_vm *vm, size_t size,
> 				  enum ttm_bo_type type, u32 flags)
> {
>-	return xe_bo_create_locked_range(xe, gt, vm, size, 0, ~0ULL, type, flags);
>+	return xe_bo_create_locked_range(xe, tile, vm, size, 0, ~0ULL, type, flags);
> }
>
>-struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_gt *gt,
>+struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_tile *tile,
> 			   struct xe_vm *vm, size_t size,
> 			   enum ttm_bo_type type, u32 flags)
> {
>-	struct xe_bo *bo = xe_bo_create_locked(xe, gt, vm, size, type, flags);
>+	struct xe_bo *bo = xe_bo_create_locked(xe, tile, vm, size, type, flags);
>
> 	if (!IS_ERR(bo))
> 		xe_bo_unlock_vm_held(bo);
>@@ -1271,7 +1271,7 @@ struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_gt *gt,
> 	return bo;
> }
>
>-struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_gt *gt,
>+struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_tile *tile,
> 				      struct xe_vm *vm,
> 				      size_t size, u64 offset,
> 				      enum ttm_bo_type type, u32 flags)
>@@ -1285,7 +1285,7 @@ struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_gt *gt,
> 	    xe_ttm_stolen_cpu_access_needs_ggtt(xe))
> 		flags |= XE_BO_CREATE_GGTT_BIT;
>
>-	bo = xe_bo_create_locked_range(xe, gt, vm, size, start, end, type, flags);
>+	bo = xe_bo_create_locked_range(xe, tile, vm, size, start, end, type, flags);
> 	if (IS_ERR(bo))
> 		return bo;
>
>@@ -1309,18 +1309,18 @@ struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_gt *gt,
> 	return ERR_PTR(err);
> }
>
>-struct xe_bo *xe_bo_create_pin_map(struct xe_device *xe, struct xe_gt *gt,
>+struct xe_bo *xe_bo_create_pin_map(struct xe_device *xe, struct xe_tile *tile,
> 				   struct xe_vm *vm, size_t size,
> 				   enum ttm_bo_type type, u32 flags)
> {
>-	return xe_bo_create_pin_map_at(xe, gt, vm, size, ~0ull, type, flags);
>+	return xe_bo_create_pin_map_at(xe, tile, vm, size, ~0ull, type, flags);
> }
>
>-struct xe_bo *xe_bo_create_from_data(struct xe_device *xe, struct xe_gt *gt,
>+struct xe_bo *xe_bo_create_from_data(struct xe_device *xe, struct xe_tile *tile,
> 				     const void *data, size_t size,
> 				     enum ttm_bo_type type, u32 flags)
> {
>-	struct xe_bo *bo = xe_bo_create_pin_map(xe, gt, NULL,
>+	struct xe_bo *bo = xe_bo_create_pin_map(xe, tile, NULL,
> 						ALIGN(size, PAGE_SIZE),
> 						type, flags);
> 	if (IS_ERR(bo))
>@@ -1949,7 +1949,7 @@ int xe_bo_dumb_create(struct drm_file *file_priv,
> 			   page_size);
>
> 	bo = xe_bo_create(xe, NULL, NULL, args->size, ttm_bo_type_device,
>-			  XE_BO_CREATE_VRAM_IF_DGFX(to_gt(xe)) |
>+			  XE_BO_CREATE_VRAM_IF_DGFX(xe_device_get_root_tile(xe)) |
> 			  XE_BO_CREATE_USER_BIT | XE_BO_SCANOUT_BIT);
> 	if (IS_ERR(bo))
> 		return PTR_ERR(bo);
>diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
>index 7a79f3893260..ccb0fae2966e 100644
>--- a/drivers/gpu/drm/xe/xe_bo.h
>+++ b/drivers/gpu/drm/xe/xe_bo.h
>@@ -21,8 +21,8 @@
> 					 XE_BO_CREATE_VRAM1_BIT)
> /* -- */
> #define XE_BO_CREATE_STOLEN_BIT		BIT(4)
>-#define XE_BO_CREATE_VRAM_IF_DGFX(gt) \
>-	(IS_DGFX(gt_to_xe(gt)) ? XE_BO_CREATE_VRAM0_BIT << gt_to_tile(gt)->id : \
>+#define XE_BO_CREATE_VRAM_IF_DGFX(tile) \
>+	(IS_DGFX(tile_to_xe(tile)) ? XE_BO_CREATE_VRAM0_BIT << (tile)->id : \
> 	 XE_BO_CREATE_SYSTEM_BIT)
> #define XE_BO_CREATE_GGTT_BIT		BIT(5)
> #define XE_BO_CREATE_IGNORE_MIN_PAGE_SIZE_BIT BIT(6)
>@@ -80,27 +80,27 @@ struct xe_bo *xe_bo_alloc(void);
> void xe_bo_free(struct xe_bo *bo);
>
> struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
>-				    struct xe_gt *gt, struct dma_resv *resv,
>+				    struct xe_tile *tile, struct dma_resv *resv,
> 				    size_t size, enum ttm_bo_type type,
> 				    u32 flags);
> struct xe_bo *
> xe_bo_create_locked_range(struct xe_device *xe,
>-			  struct xe_gt *gt, struct xe_vm *vm,
>+			  struct xe_tile *tile, struct xe_vm *vm,
> 			  size_t size, u64 start, u64 end,
> 			  enum ttm_bo_type type, u32 flags);
>-struct xe_bo *xe_bo_create_locked(struct xe_device *xe, struct xe_gt *gt,
>+struct xe_bo *xe_bo_create_locked(struct xe_device *xe, struct xe_tile *tile,
> 				  struct xe_vm *vm, size_t size,
> 				  enum ttm_bo_type type, u32 flags);
>-struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_gt *gt,
>+struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_tile *tile,
> 			   struct xe_vm *vm, size_t size,
> 			   enum ttm_bo_type type, u32 flags);
>-struct xe_bo *xe_bo_create_pin_map(struct xe_device *xe, struct xe_gt *gt,
>+struct xe_bo *xe_bo_create_pin_map(struct xe_device *xe, struct xe_tile *tile,
> 				   struct xe_vm *vm, size_t size,
> 				   enum ttm_bo_type type, u32 flags);
>-struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_gt *gt,
>+struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_tile *tile,
> 				      struct xe_vm *vm, size_t size, u64 offset,
> 				      enum ttm_bo_type type, u32 flags);
>-struct xe_bo *xe_bo_create_from_data(struct xe_device *xe, struct xe_gt *gt,
>+struct xe_bo *xe_bo_create_from_data(struct xe_device *xe, struct xe_tile *tile,
> 				     const void *data, size_t size,
> 				     enum ttm_bo_type type, u32 flags);
>
>diff --git a/drivers/gpu/drm/xe/xe_bo_evict.c b/drivers/gpu/drm/xe/xe_bo_evict.c
>index a72963c54bf3..9226195bd560 100644
>--- a/drivers/gpu/drm/xe/xe_bo_evict.c
>+++ b/drivers/gpu/drm/xe/xe_bo_evict.c
>@@ -149,7 +149,7 @@ int xe_bo_restore_kernel(struct xe_device *xe)
> 		}
>
> 		if (bo->flags & XE_BO_CREATE_GGTT_BIT) {
>-			struct xe_tile *tile = gt_to_tile(bo->gt);
>+			struct xe_tile *tile = bo->tile;
>
> 			mutex_lock(&tile->mem.ggtt->lock);
> 			xe_ggtt_map_bo(tile->mem.ggtt, bo);
>diff --git a/drivers/gpu/drm/xe/xe_bo_types.h b/drivers/gpu/drm/xe/xe_bo_types.h
>index 06de3330211d..f6ee920303af 100644
>--- a/drivers/gpu/drm/xe/xe_bo_types.h
>+++ b/drivers/gpu/drm/xe/xe_bo_types.h
>@@ -29,8 +29,8 @@ struct xe_bo {
> 	u32 flags;
> 	/** @vm: VM this BO is attached to, for extobj this will be NULL */
> 	struct xe_vm *vm;
>-	/** @gt: GT this BO is attached to (kernel BO only) */
>-	struct xe_gt *gt;
>+	/** @tile: Tile this BO is attached to (kernel BO only) */
>+	struct xe_tile *tile;
> 	/** @vmas: List of VMAs for this BO */
> 	struct list_head vmas;
> 	/** @placements: valid placements for this BO */
>diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>index 6b9e7847161c..c6365b6f14ba 100644
>--- a/drivers/gpu/drm/xe/xe_device_types.h
>+++ b/drivers/gpu/drm/xe/xe_device_types.h
>@@ -131,6 +131,13 @@ struct xe_tile {
>
> 		/** @ggtt: Global graphics translation table */
> 		struct xe_ggtt *ggtt;
>+
>+		/**
>+		 * @kernel_bb_pool: Pool from which batchbuffers are allocated.
>+		 *
>+		 * Media GT shares a pool with its primary GT.
>+		 */
>+		struct xe_sa_manager *kernel_bb_pool;
> 	} mem;
> };
>
>diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
>index 52d293d61cc0..b11f22b68bb8 100644
>--- a/drivers/gpu/drm/xe/xe_ggtt.c
>+++ b/drivers/gpu/drm/xe/xe_ggtt.c
>@@ -149,7 +149,6 @@ static void xe_ggtt_initial_clear(struct xe_ggtt *ggtt)
> int xe_ggtt_init(struct xe_ggtt *ggtt)
> {
> 	struct xe_device *xe = tile_to_xe(ggtt->tile);
>-	struct xe_gt *gt = &ggtt->tile->primary_gt;
> 	unsigned int flags;
> 	int err;
>
>@@ -162,9 +161,9 @@ int xe_ggtt_init(struct xe_ggtt *ggtt)
> 	if (ggtt->flags & XE_GGTT_FLAGS_64K)
> 		flags |= XE_BO_CREATE_SYSTEM_BIT;
> 	else
>-		flags |= XE_BO_CREATE_VRAM_IF_DGFX(gt);
>+		flags |= XE_BO_CREATE_VRAM_IF_DGFX(ggtt->tile);
>
>-	ggtt->scratch = xe_bo_create_pin_map(xe, gt, NULL, XE_PAGE_SIZE,
>+	ggtt->scratch = xe_bo_create_pin_map(xe, ggtt->tile, NULL, XE_PAGE_SIZE,
> 					     ttm_bo_type_kernel,
> 					     flags);
>
>diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
>index 1e424ce8ef3e..d769bc93d15c 100644
>--- a/drivers/gpu/drm/xe/xe_gt.c
>+++ b/drivers/gpu/drm/xe/xe_gt.c
>@@ -95,7 +95,7 @@ static int emit_nop_job(struct xe_gt *gt, struct xe_engine *e)
> 	if (IS_ERR(bb))
> 		return PTR_ERR(bb);
>
>-	batch_ofs = xe_bo_ggtt_addr(gt->kernel_bb_pool->bo);
>+	batch_ofs = xe_bo_ggtt_addr(gt_to_tile(gt)->mem.kernel_bb_pool->bo);
> 	job = xe_bb_create_wa_job(e, bb, batch_ofs);
> 	if (IS_ERR(job)) {
> 		xe_bb_free(bb, NULL);
>@@ -144,7 +144,7 @@ static int emit_wa_job(struct xe_gt *gt, struct xe_engine *e)
> 		}
> 	}
>
>-	batch_ofs = xe_bo_ggtt_addr(gt->kernel_bb_pool->bo);
>+	batch_ofs = xe_bo_ggtt_addr(gt_to_tile(gt)->mem.kernel_bb_pool->bo);
> 	job = xe_bb_create_wa_job(e, bb, batch_ofs);
> 	if (IS_ERR(job)) {
> 		xe_bb_free(bb, NULL);
>@@ -364,31 +364,16 @@ static int all_fw_domain_init(struct xe_gt *gt)
> 		goto err_force_wake;
>
> 	if (!xe_gt_is_media_type(gt)) {

I believe this check no longer makes sense at this point in the series, as
there is no media GT. Reviewed with --color-words; looks OK.


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
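
For readers following the xe_bo_move() hunk quoted above, the order in
which the tile is chosen can be modeled in isolation. This is only a
minimal sketch with simplified stand-in types: 'struct resource',
'enum mem_type' and pick_move_tile() are hypothetical names used for
illustration, and the real driver works on TTM resources and trips
XE_BUG_ON() instead of returning NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the driver structures; not the real definitions. */
struct xe_tile {
	unsigned int id;
};

struct xe_device {
	struct xe_tile tiles[2];
};

/* Hypothetical memory-type encoding, used only by this sketch. */
enum mem_type { MEM_SYSTEM, MEM_VRAM0, MEM_VRAM1 };

struct resource {
	enum mem_type mem_type;
};

struct xe_bo {
	struct xe_tile *tile;	/* set for kernel BOs only */
};

static bool resource_is_vram(const struct resource *res)
{
	return res->mem_type >= MEM_VRAM0;
}

static struct xe_tile *mem_type_to_tile(struct xe_device *xe, enum mem_type t)
{
	assert(t >= MEM_VRAM0);
	return &xe->tiles[t - MEM_VRAM0];
}

/*
 * Mirrors the selection order in the quoted hunk: the BO's own tile
 * first, then the tile owning the destination VRAM, then the tile
 * owning the source VRAM.  Returns NULL when none applies.
 */
struct xe_tile *pick_move_tile(struct xe_device *xe, struct xe_bo *bo,
			       const struct resource *old_mem,
			       const struct resource *new_mem)
{
	struct xe_tile *tile = NULL;

	if (bo->tile)
		tile = bo->tile;
	else if (resource_is_vram(new_mem))
		tile = mem_type_to_tile(xe, new_mem->mem_type);
	else if (resource_is_vram(old_mem))
		tile = mem_type_to_tile(xe, old_mem->mem_type);

	return tile;
}
```

A kernel BO keeps its own tile regardless of placement; for other BOs
the destination placement is checked before the source.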


* Re: [Intel-xe] [PATCH 08/26] drm/xe: Move migration from GT to tile
  2023-05-11  3:47 ` [Intel-xe] [PATCH 08/26] drm/xe: Move migration from GT to tile Matt Roper
@ 2023-05-17  5:00   ` Lucas De Marchi
  0 siblings, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-17  5:00 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:04PM -0700, Matt Roper wrote:
>Migration primarily focuses on the memory associated with a tile, so it
>makes more sense to track this at the tile level (especially since the
>driver was already skipping migration operations on media GTs).
>
>Note that the blitter engine used to perform the migration always lives
>in the tile's primary GT today.  In theory that could change if media
>GTs ever start including blitter engines in the future, but we can
>extend the design if/when that happens in the future.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

Lucas De Marchi


* Re: [Intel-xe] [PATCH 09/26] drm/xe: Clarify 'gt' retrieval for primary tile
  2023-05-11  3:47 ` [Intel-xe] [PATCH 09/26] drm/xe: Clarify 'gt' retrieval for primary tile Matt Roper
@ 2023-05-17  5:07   ` Lucas De Marchi
  0 siblings, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-17  5:07 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:05PM -0700, Matt Roper wrote:
>There are a bunch of places in the driver where we need to perform
>non-GT MMIO against the platform's primary tile (display code, top-level
>interrupt enable/disable, driver initialization, etc.).  Rename
>'to_gt()' to 'xe_primary_mmio_gt()' to clarify that we're trying to get
>a primary MMIO handle for these top-level operations.
>
>In the future we need to move away from xe_gt as the target for MMIO
>operations (most of which are completely unrelated to GT).
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/xe/compat-i915-headers/intel_uncore.h | 2 +-
> drivers/gpu/drm/xe/xe_device.c                        | 2 +-
> drivers/gpu/drm/xe/xe_device.h                        | 9 +++++++--
> drivers/gpu/drm/xe/xe_irq.c                           | 6 +++---
> drivers/gpu/drm/xe/xe_mmio.c                          | 8 ++++----
> drivers/gpu/drm/xe/xe_query.c                         | 2 +-
> drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c                | 4 ++--
> 7 files changed, 19 insertions(+), 14 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/compat-i915-headers/intel_uncore.h b/drivers/gpu/drm/xe/compat-i915-headers/intel_uncore.h
>index 14f195fe275d..6eff72311773 100644
>--- a/drivers/gpu/drm/xe/compat-i915-headers/intel_uncore.h
>+++ b/drivers/gpu/drm/xe/compat-i915-headers/intel_uncore.h
>@@ -14,7 +14,7 @@ static inline struct xe_gt *__fake_uncore_to_gt(struct fake_uncore *uncore)
> {
> 	struct xe_device *xe = container_of(uncore, struct xe_device, uncore);
>
>-	return to_gt(xe);
>+	return xe_primary_mmio_gt(xe);

nit: this contrasts with the name chosen in the first patch:
xe_device_get_root_tile(). I think it would be good to be consistent.

xe_primary_mmio_gt(xe)
xe_root_tile(xe)
?

xe_device_get_primary_mmio_gt(xe)
xe_device_get_root_tile(xe)
?

The latter seems a bit too long for me.

Maybe also consolidate on primary vs root?

> }
>
> static inline u32 intel_uncore_read(struct fake_uncore *uncore,
>diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>index 038074a90584..c93c8895862f 100644
>--- a/drivers/gpu/drm/xe/xe_device.c
>+++ b/drivers/gpu/drm/xe/xe_device.c
>@@ -397,7 +397,7 @@ static void device_kill_persistent_engines(struct xe_device *xe,
>
> void xe_device_wmb(struct xe_device *xe)
> {
>-	struct xe_gt *gt = xe_device_get_gt(xe, 0);
>+	struct xe_gt *gt = xe_primary_mmio_gt(xe);
>
> 	wmb();
> 	if (IS_DGFX(xe))
>diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
>index 745dbb16d417..fc2655484dfd 100644
>--- a/drivers/gpu/drm/xe/xe_device.h
>+++ b/drivers/gpu/drm/xe/xe_device.h
>@@ -66,9 +66,14 @@ static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
> }
>
> /*
>- * FIXME: Placeholder until multi-gt lands. Once that lands, kill this function.
>+ * Provide a GT structure suitable for performing non-GT MMIO operations against
>+ * the primary tile.  Primarily intended for early tile initialization, display
>+ * handling, top-most interrupt enable/disable, etc.
>+ *
>+ * FIXME: Fix the driver design so that 'gt' isn't the target of all MMIO
>+ * operations.

Again, I'm not sure I agree with the statement here: we may end up in a
situation where it's harder to figure out what should be used for which
MMIO access. I may need to get used to this after looking at the final result.

Lucas De Marchi


* Re: [Intel-xe] [PATCH 10/26] drm/xe: Drop vram_id
  2023-05-11  3:47 ` [Intel-xe] [PATCH 10/26] drm/xe: Drop vram_id Matt Roper
@ 2023-05-17  5:09   ` Lucas De Marchi
  0 siblings, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-17  5:09 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:06PM -0700, Matt Roper wrote:
>The VRAM ID is always the tile ID; there's no need to track it
>separately within a GT.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

With the move to tile-based handling, and in the hope that it stays like
this for future platforms:

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

Lucas De Marchi

>---
> drivers/gpu/drm/xe/tests/xe_bo.c | 6 +++---
> drivers/gpu/drm/xe/xe_pci.c      | 2 --
> 2 files changed, 3 insertions(+), 5 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c
>index bee5a2031153..e4d1d17b1d3c 100644
>--- a/drivers/gpu/drm/xe/tests/xe_bo.c
>+++ b/drivers/gpu/drm/xe/tests/xe_bo.c
>@@ -115,9 +115,9 @@ static void ccs_test_run_gt(struct xe_device *xe, struct xe_gt *gt,
> 	int ret;
>
> 	/* TODO: Sanity check */
>-	vram_bit = XE_BO_CREATE_VRAM0_BIT << gt->info.vram_id;
>+	vram_bit = XE_BO_CREATE_VRAM0_BIT << gt_to_tile(gt)->id;
> 	kunit_info(test, "Testing gt id %u vram id %u\n", gt->info.id,
>-		   gt->info.vram_id);
>+		   gt_to_tile(gt)->id);
>
> 	bo = xe_bo_create_locked(xe, NULL, NULL, SZ_1M, ttm_bo_type_device,
> 				 vram_bit);
>@@ -179,7 +179,7 @@ static int evict_test_run_gt(struct xe_device *xe, struct xe_gt *gt, struct kuni
> 	int err, i;
>
> 	kunit_info(test, "Testing device %s gt id %u vram id %u\n",
>-		   dev_name(xe->drm.dev), gt->info.id, gt->info.vram_id);
>+		   dev_name(xe->drm.dev), gt->info.id, gt_to_tile(gt)->id);
>
> 	for (i = 0; i < 2; ++i) {
> 		xe_vm_lock(vm, &ww, 0, false);
>diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>index be7c41024838..0f3508c72c79 100644
>--- a/drivers/gpu/drm/xe/xe_pci.c
>+++ b/drivers/gpu/drm/xe/xe_pci.c
>@@ -34,7 +34,6 @@ struct xe_subplatform_desc {
>
> struct xe_gt_desc {
> 	enum xe_gt_type type;
>-	u8 vram_id;
> 	u32 mmio_adj_limit;
> 	u32 mmio_adj_offset;
> };
>@@ -258,7 +257,6 @@ static const struct xe_device_desc dg2_desc = {
> static const struct xe_gt_desc pvc_gts[] = {
> 	{
> 		.type = XE_GT_TYPE_REMOTE,
>-		.vram_id = 1,
> 		.mmio_adj_limit = 0,
> 		.mmio_adj_offset = 0,
> 	},
>-- 
>2.40.0
>


* Re: [Intel-xe] [PATCH 11/26] drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE
  2023-05-11  3:47 ` [Intel-xe] [PATCH 11/26] drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE Matt Roper
@ 2023-05-17  5:14   ` Lucas De Marchi
  0 siblings, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-17  5:14 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:07PM -0700, Matt Roper wrote:
>Now that tiles and GTs are handled separately, extra_gts[] doesn't
>really provide any useful information that we can't just infer directly.
>The primary GT of the root tile and the remote tiles behave the same way

				     ^ missing an "of"?

>and don't need independent handling.
>
>When we re-add support for media GTs in a future patch, the presence of
>media can be determined from MEDIA_VER() (i.e., >= 13) and media's GSI
>offset handling is expected to remain constant for all foreseeable future
>platforms, so it won't need to be provided in a definition structure
>either.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>


Lucas De Marchi

>---
> drivers/gpu/drm/xe/xe_gt_types.h |  1 -
> drivers/gpu/drm/xe/xe_pci.c      | 37 ++++++--------------------------
> 2 files changed, 7 insertions(+), 31 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
>index 8a5f9122ba80..5e0bfb21ae1c 100644
>--- a/drivers/gpu/drm/xe/xe_gt_types.h
>+++ b/drivers/gpu/drm/xe/xe_gt_types.h
>@@ -20,7 +20,6 @@ struct xe_ring_ops;
> enum xe_gt_type {
> 	XE_GT_TYPE_UNINITIALIZED,
> 	XE_GT_TYPE_MAIN,
>-	XE_GT_TYPE_REMOTE,
> 	XE_GT_TYPE_MEDIA,
> };
>
>diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>index 0f3508c72c79..bfdc9563e54f 100644
>--- a/drivers/gpu/drm/xe/xe_pci.c
>+++ b/drivers/gpu/drm/xe/xe_pci.c
>@@ -46,7 +46,6 @@ struct xe_device_desc {
>
> 	const char *platform_name;
> 	const struct xe_subplatform_desc *subplatforms;
>-	const struct xe_gt_desc *extra_gts;
>
> 	enum xe_platform platform;
>
>@@ -254,20 +253,11 @@ static const struct xe_device_desc dg2_desc = {
> 	DG2_FEATURES,
> };
>
>-static const struct xe_gt_desc pvc_gts[] = {
>-	{
>-		.type = XE_GT_TYPE_REMOTE,
>-		.mmio_adj_limit = 0,
>-		.mmio_adj_offset = 0,
>-	},
>-};
>-
> static const struct xe_device_desc pvc_desc = {
> 	.graphics = &graphics_xehpc,
> 	DGFX_FEATURES,
> 	PLATFORM(XE_PVC),
> 	.require_force_probe = true,
>-	.extra_gts = pvc_gts,
> };
>
> static const struct xe_device_desc mtl_desc = {
>@@ -531,26 +521,13 @@ static int xe_info_init(struct xe_device *xe,
> 		gt->info.id = id;
> 		gt->tile = tile;
>
>-		gt->info.id = id;
>-		if (id == 0) {
>-			gt->info.type = XE_GT_TYPE_MAIN;
>-
>-			gt->info.__engine_mask = graphics_desc->hw_engine_mask;
>-			if (MEDIA_VER(xe) < 13 && media_desc)
>-				gt->info.__engine_mask |= media_desc->hw_engine_mask;
>-
>-			gt->mmio.adj_limit = 0;
>-			gt->mmio.adj_offset = 0;
>-		} else {
>-			gt->info.type = desc->extra_gts[id - 1].type;
>-			gt->info.__engine_mask = xe_gt_is_media_type(gt) ?
>-				media_desc->hw_engine_mask :
>-				graphics_desc->hw_engine_mask;
>-			gt->mmio.adj_limit =
>-				desc->extra_gts[id - 1].mmio_adj_limit;
>-			gt->mmio.adj_offset =
>-				desc->extra_gts[id - 1].mmio_adj_offset;
>-		}
>+		gt->info.id = id;	/* FIXME: Determine sensible numbering */
>+		gt->info.type = XE_GT_TYPE_MAIN;
>+		gt->info.__engine_mask = graphics_desc->hw_engine_mask;
>+		if (MEDIA_VER(xe) < 13 && media_desc)
>+			gt->info.__engine_mask |= media_desc->hw_engine_mask;
>+
>+		/* TODO: Init media GT, if present */
> 	}
>
> 	return 0;
>-- 
>2.40.0
>


* Re: [Intel-xe] [PATCH 12/26] drm/xe: Allocate GT dynamically
  2023-05-11  3:47 ` [Intel-xe] [PATCH 12/26] drm/xe: Allocate GT dynamically Matt Roper
@ 2023-05-17  5:23   ` Lucas De Marchi
  0 siblings, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-17  5:23 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:08PM -0700, Matt Roper wrote:
>In preparation for re-adding media GT support, switch the primary GT
>within the tile to a dynamic allocation.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/xe/xe_device.c       |  4 ----
> drivers/gpu/drm/xe/xe_device.h       |  8 ++++++--
> drivers/gpu/drm/xe/xe_device_types.h |  2 +-
> drivers/gpu/drm/xe/xe_ggtt.c         |  2 +-
> drivers/gpu/drm/xe/xe_gt.c           | 11 ++++++++---
> drivers/gpu/drm/xe/xe_gt.h           |  2 +-
> drivers/gpu/drm/xe/xe_migrate.c      | 12 ++++++------
> drivers/gpu/drm/xe/xe_pci.c          |  7 +++++--
> drivers/gpu/drm/xe/xe_pt.c           |  4 ++--
> drivers/gpu/drm/xe/xe_vm.c           |  6 +++---
> 10 files changed, 33 insertions(+), 25 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>index c93c8895862f..b6fecee68cc6 100644
>--- a/drivers/gpu/drm/xe/xe_device.c
>+++ b/drivers/gpu/drm/xe/xe_device.c
>@@ -254,10 +254,6 @@ int xe_device_probe(struct xe_device *xe)
> 		err = xe_tile_alloc(tile);
> 		if (err)
> 			return err;
>-
>-		err = xe_gt_alloc(xe, &tile->primary_gt);
>-		if (err)
>-			return err;
> 	}
>
> 	err = xe_mmio_init(xe);
>diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
>index fc2655484dfd..370b9ccb875b 100644
>--- a/drivers/gpu/drm/xe/xe_device.h
>+++ b/drivers/gpu/drm/xe/xe_device.h
>@@ -58,7 +58,11 @@ static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
> 	struct xe_gt *gt;
>
> 	XE_BUG_ON(gt_id > XE_MAX_TILES_PER_DEVICE);
>-	gt = &xe->tiles[gt_id].primary_gt;
>+
>+	gt = xe->tiles[gt_id].primary_gt;
>+	if (drm_WARN_ON(&xe->drm, !gt))
>+		return NULL;
>+
> 	XE_BUG_ON(gt->info.id != gt_id);
> 	XE_BUG_ON(gt->info.type == XE_GT_TYPE_UNINITIALIZED);
>
>@@ -75,7 +79,7 @@ static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
>  */
> static inline struct xe_gt *xe_primary_mmio_gt(struct xe_device *xe)
> {
>-	return &xe_device_get_root_tile(xe)->primary_gt;
>+	return xe_device_get_root_tile(xe)->primary_gt;
> }
>
> static inline bool xe_device_guc_submission_enabled(struct xe_device *xe)
>diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>index fa76750a9a5f..1033f233f6ab 100644
>--- a/drivers/gpu/drm/xe/xe_device_types.h
>+++ b/drivers/gpu/drm/xe/xe_device_types.h
>@@ -79,7 +79,7 @@ struct xe_tile {
> 	/**
> 	 * @primary_gt: Primary GT
> 	 */
>-	struct xe_gt primary_gt;
>+	struct xe_gt *primary_gt;
>
> 	/* TODO: Add media GT here */
>
>diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
>index b11f22b68bb8..7c87623ef5c5 100644
>--- a/drivers/gpu/drm/xe/xe_ggtt.c
>+++ b/drivers/gpu/drm/xe/xe_ggtt.c
>@@ -194,7 +194,7 @@ void xe_ggtt_invalidate(struct xe_ggtt *ggtt)
> 	 * TODO: Loop over each GT in tile once media GT support is
> 	 * re-added
> 	 */
>-	struct xe_gt *gt = &ggtt->tile->primary_gt;
>+	struct xe_gt *gt = ggtt->tile->primary_gt;
>
> 	/* TODO: vfunc for GuC vs. non-GuC */
>
>diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
>index 297ee32ad928..20663cd0ddaf 100644
>--- a/drivers/gpu/drm/xe/xe_gt.c
>+++ b/drivers/gpu/drm/xe/xe_gt.c
>@@ -42,13 +42,18 @@
> #include "xe_wa.h"
> #include "xe_wopcm.h"
>
>-int xe_gt_alloc(struct xe_device *xe, struct xe_gt *gt)
>+struct xe_gt *xe_gt_alloc(struct xe_tile *tile)
> {
>-	XE_BUG_ON(gt->info.type == XE_GT_TYPE_UNINITIALIZED);
>+	struct xe_gt *gt;
>
>+	gt = drmm_kzalloc(&tile_to_xe(tile)->drm, sizeof(*gt), GFP_KERNEL);
>+	if (IS_ERR(gt))

On error drmm_kzalloc() returns NULL, not an error code encoded into the
pointer, so this IS_ERR() check never triggers.

other than that,

	Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

Lucas De Marchi
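
A minimal userspace sketch of the pattern, under stated assumptions: the
ERR_PTR()/PTR_ERR()/IS_ERR() helpers below are stand-ins for the kernel's
<linux/err.h>, calloc() plays the role of drmm_kzalloc(), and
gt_alloc_sketch() with its simulate_failure parameter is a hypothetical name
used only for illustration. It is one possible shape for the fix, not
necessarily what the next revision will do.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Only the top MAX_ERRNO addresses encode an error; NULL does not. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct xe_gt {
	int id;
};

/*
 * Hypothetical allocator wrapper. calloc() stands in for drmm_kzalloc(),
 * which likewise reports failure by returning NULL.
 */
static struct xe_gt *gt_alloc_sketch(int simulate_failure)
{
	struct xe_gt *gt = simulate_failure ? NULL : calloc(1, sizeof(*gt));

	if (!gt)	/* NULL check, not IS_ERR() */
		return ERR_PTR(-ENOMEM);

	return gt;
}
```

Returning ERR_PTR(-ENOMEM) from the allocator would let the
IS_ERR()/PTR_ERR() check in xe_info_init() stay as it is in this patch.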

>+		return ERR_CAST(gt);
>+
>+	gt->tile = tile;
> 	gt->ordered_wq = alloc_ordered_workqueue("gt-ordered-wq", 0);
>
>-	return 0;
>+	return gt;
> }
>
> void xe_gt_sanitize(struct xe_gt *gt)
>diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
>index c8abbeb0fb96..abcefd8cde78 100644
>--- a/drivers/gpu/drm/xe/xe_gt.h
>+++ b/drivers/gpu/drm/xe/xe_gt.h
>@@ -16,7 +16,7 @@
> 	     for_each_if (((hwe__) = (gt__)->hw_engines + (id__)) && \
> 			  xe_hw_engine_is_valid((hwe__)))
>
>-int xe_gt_alloc(struct xe_device *xe, struct xe_gt *gt);
>+struct xe_gt *xe_gt_alloc(struct xe_tile *tile);
> int xe_gt_init_early(struct xe_gt *gt);
> int xe_gt_init_noalloc(struct xe_gt *gt);
> int xe_gt_init(struct xe_gt *gt);
>diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
>index c3a37109bc64..8a4fd80a7fde 100644
>--- a/drivers/gpu/drm/xe/xe_migrate.c
>+++ b/drivers/gpu/drm/xe/xe_migrate.c
>@@ -229,7 +229,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
> 		m->batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
>
> 		if (xe->info.supports_usm) {
>-			batch = tile->primary_gt.usm.bb_pool->bo;
>+			batch = tile->primary_gt->usm.bb_pool->bo;
> 			batch_addr = xe_bo_addr(batch, 0, XE_PAGE_SIZE,
> 						&is_vram);
> 			m->usm_batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
>@@ -313,7 +313,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
> struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
> {
> 	struct xe_device *xe = tile_to_xe(tile);
>-	struct xe_gt *primary_gt = &tile->primary_gt;
>+	struct xe_gt *primary_gt = tile->primary_gt;
> 	struct xe_migrate *m;
> 	struct xe_vm *vm;
> 	struct ww_acquire_ctx ww;
>@@ -546,7 +546,7 @@ static u32 xe_migrate_ccs_copy(struct xe_migrate *m,
> 			       u64 dst_ofs, bool dst_is_vram, u32 dst_size,
> 			       u64 ccs_ofs, bool copy_ccs)
> {
>-	struct xe_gt *gt = &m->tile->primary_gt;
>+	struct xe_gt *gt = m->tile->primary_gt;
> 	u32 flush_flags = 0;
>
> 	if (xe_device_has_flat_ccs(gt_to_xe(gt)) && !copy_ccs && dst_is_vram) {
>@@ -601,7 +601,7 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m,
> 				  struct ttm_resource *src,
> 				  struct ttm_resource *dst)
> {
>-	struct xe_gt *gt = &m->tile->primary_gt;
>+	struct xe_gt *gt = m->tile->primary_gt;
> 	struct xe_device *xe = gt_to_xe(gt);
> 	struct dma_fence *fence = NULL;
> 	u64 size = bo->size;
>@@ -853,7 +853,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
> 				   struct ttm_resource *dst)
> {
> 	bool clear_vram = mem_type_is_vram(dst->mem_type);
>-	struct xe_gt *gt = &m->tile->primary_gt;
>+	struct xe_gt *gt = m->tile->primary_gt;
> 	struct xe_device *xe = gt_to_xe(gt);
> 	struct dma_fence *fence = NULL;
> 	u64 size = bo->size;
>@@ -1128,7 +1128,7 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
> {
> 	const struct xe_migrate_pt_update_ops *ops = pt_update->ops;
> 	struct xe_tile *tile = m->tile;
>-	struct xe_gt *gt = &tile->primary_gt;
>+	struct xe_gt *gt = tile->primary_gt;
> 	struct xe_device *xe = tile_to_xe(tile);
> 	struct xe_sched_job *job;
> 	struct dma_fence *fence;
>diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>index bfdc9563e54f..7d5e65d34f39 100644
>--- a/drivers/gpu/drm/xe/xe_pci.c
>+++ b/drivers/gpu/drm/xe/xe_pci.c
>@@ -517,9 +517,12 @@ static int xe_info_init(struct xe_device *xe,
> 		tile->xe = xe;
> 		tile->id = id;
>
>-		gt = &tile->primary_gt;
>+		tile->primary_gt = xe_gt_alloc(tile);
>+		if (IS_ERR(tile->primary_gt))
>+			return PTR_ERR(tile->primary_gt);
>+
>+		gt = tile->primary_gt;
> 		gt->info.id = id;
>-		gt->tile = tile;
>
> 		gt->info.id = id;	/* FIXME: Determine sensible numbering */
> 		gt->info.type = XE_GT_TYPE_MAIN;
>diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
>index a606cd1a7e3a..60e4a97c78fb 100644
>--- a/drivers/gpu/drm/xe/xe_pt.c
>+++ b/drivers/gpu/drm/xe/xe_pt.c
>@@ -1316,7 +1316,7 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_engine *e,
>
> 		/* TLB invalidation must be done before signaling rebind */
> 		if (rebind && !xe_vm_no_dma_fences(vma->vm)) {
>-			int err = invalidation_fence_init(&tile->primary_gt, ifence, fence,
>+			int err = invalidation_fence_init(tile->primary_gt, ifence, fence,
> 							  vma);
> 			if (err) {
> 				dma_fence_put(fence);
>@@ -1636,7 +1636,7 @@ __xe_pt_unbind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_engine *e
> 		int err;
>
> 		/* TLB invalidation must be done before signaling unbind */
>-		err = invalidation_fence_init(&tile->primary_gt, ifence, fence, vma);
>+		err = invalidation_fence_init(tile->primary_gt, ifence, fence, vma);
> 		if (err) {
> 			dma_fence_put(fence);
> 			kfree(ifence);
>diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>index 6beb73b40dca..cbbc809ae4e4 100644
>--- a/drivers/gpu/drm/xe/xe_vm.c
>+++ b/drivers/gpu/drm/xe/xe_vm.c
>@@ -1200,7 +1200,7 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
> 	/* Kernel migration VM shouldn't have a circular loop.. */
> 	if (!(flags & XE_VM_FLAG_MIGRATION)) {
> 		for_each_tile(tile, xe, id) {
>-			struct xe_gt *gt = &tile->primary_gt;
>+			struct xe_gt *gt = tile->primary_gt;
> 			struct xe_vm *migrate_vm;
> 			struct xe_engine *eng;
>
>@@ -3368,7 +3368,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
> 			 * FIXME: We potentially need to invalidate multiple
> 			 * GTs within the tile
> 			 */
>-			seqno[id] = xe_gt_tlb_invalidation_vma(&tile->primary_gt, NULL, vma);
>+			seqno[id] = xe_gt_tlb_invalidation_vma(tile->primary_gt, NULL, vma);
> 			if (seqno[id] < 0)
> 				return seqno[id];
> 		}
>@@ -3376,7 +3376,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
>
> 	for_each_tile(tile, xe, id) {
> 		if (tile_needs_invalidate & BIT(id)) {
>-			ret = xe_gt_tlb_invalidation_wait(&tile->primary_gt, seqno[id]);
>+			ret = xe_gt_tlb_invalidation_wait(tile->primary_gt, seqno[id]);
> 			if (ret < 0)
> 				return ret;
> 		}
>-- 
>2.40.0
>


* Re: [Intel-xe] [PATCH 26/26] drm/xe: Clarify source of GT log messages
  2023-05-11  3:47 ` [Intel-xe] [PATCH 26/26] drm/xe: Clarify source of GT log messages Matt Roper
@ 2023-05-17  9:33   ` Michal Wajdeczko
  0 siblings, 0 replies; 75+ messages in thread
From: Michal Wajdeczko @ 2023-05-17  9:33 UTC (permalink / raw)
  To: Matt Roper, intel-xe; +Cc: Rodrigo Vivi



On 11.05.2023 05:47, Matt Roper wrote:
> The various functions in xe_gt.h can print a lot of important error and
> information messages; ensure that we always include the GT ID in those
> prints for clarity.
> 
> In the future we may want to place the new macros in a dedicated header
> like we've done in i915.  For now we're just using them within this one
> file, so including them at the top of the .c is fine.

A series with this functionality [1] is already waiting for review/merge.

[1] https://patchwork.freedesktop.org/series/117642/

> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_gt.c | 52 ++++++++++++++++++--------------------
>  1 file changed, 25 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> index 2a3457fb97fa..edcb8ccde346 100644
> --- a/drivers/gpu/drm/xe/xe_gt.c
> +++ b/drivers/gpu/drm/xe/xe_gt.c
> @@ -42,6 +42,11 @@
>  #include "xe_wa.h"
>  #include "xe_wopcm.h"
>  
> +#define gt_info(_gt, _fmt, ...) \
> +	drm_info(&gt_to_xe(_gt)->drm, "GT%u (Tile%u): " _fmt, (_gt)->info.id, gt_to_tile(_gt)->id, ##__VA_ARGS__)
> +#define gt_err(_gt, _fmt, ...) \
> +	drm_err(&gt_to_xe(_gt)->drm, "GT%u (Tile%u): " _fmt, (_gt)->info.id, gt_to_tile(_gt)->id, ##__VA_ARGS__)
> +
>  struct xe_gt *xe_gt_alloc(struct xe_tile *tile)
>  {
>  	struct xe_gt *gt;
> @@ -193,16 +198,16 @@ int xe_gt_record_default_lrcs(struct xe_gt *gt)
>  				     hwe, ENGINE_FLAG_WA);
>  		if (IS_ERR(e)) {
>  			err = PTR_ERR(e);
> -			drm_err(&xe->drm, "gt%d, hwe %s, xe_engine_create,e failed=%d",
> -				gt->info.id, hwe->name, err);
> +			gt_err(gt, "hwe %s, xe_engine_create,e failed=%d",
> +			       hwe->name, err);
>  			goto put_vm;
>  		}
>  
>  		/* Prime golden LRC with known good state */
>  		err = emit_wa_job(gt, e);
>  		if (err) {
> -			drm_err(&xe->drm, "gt%d, hwe %s, guc_id=%d, emit_wa_job,e failed=%d",
> -				gt->info.id, hwe->name, e->guc->id, err);
> +			gt_err(gt, "hwe %s, guc_id=%d, emit_wa_job,e failed=%d",
> +				hwe->name, e->guc->id, err);
>  			goto put_engine;
>  		}
>  
> @@ -210,24 +215,24 @@ int xe_gt_record_default_lrcs(struct xe_gt *gt)
>  					 1, hwe, ENGINE_FLAG_WA);
>  		if (IS_ERR(nop_e)) {
>  			err = PTR_ERR(nop_e);
> -			drm_err(&xe->drm, "gt%d, hwe %s, xe_engine_create,nop_e failed=%d",
> -				gt->info.id, hwe->name, err);
> +			gt_err(gt, "hwe %s, xe_engine_create,nop_e failed=%d",
> +				hwe->name, err);
>  			goto put_engine;
>  		}
>  
>  		/* Switch to different LRC */
>  		err = emit_nop_job(gt, nop_e);
>  		if (err) {
> -			drm_err(&xe->drm, "gt%d, hwe %s, guc_id=%d, emit_nop_job,nop_e failed=%d",
> -				gt->info.id, hwe->name, nop_e->guc->id, err);
> +			gt_err(gt, "hwe %s, guc_id=%d, emit_nop_job,nop_e failed=%d",
> +				hwe->name, nop_e->guc->id, err);
>  			goto put_nop_e;
>  		}
>  
>  		/* Reload golden LRC to record the effect of any indirect W/A */
>  		err = emit_nop_job(gt, e);
>  		if (err) {
> -			drm_err(&xe->drm, "gt%d, hwe %s, guc_id=%d, emit_nop_job,e failed=%d",
> -				gt->info.id, hwe->name, e->guc->id, err);
> +			gt_err(gt, "hwe %s, guc_id=%d, emit_nop_job,e failed=%d",
> +				hwe->name, e->guc->id, err);
>  			goto put_nop_e;
>  		}
>  
> @@ -443,15 +448,13 @@ int xe_gt_init(struct xe_gt *gt)
>  
>  static int do_gt_reset(struct xe_gt *gt)
>  {
> -	struct xe_device *xe = gt_to_xe(gt);
>  	int err;
>  
>  	xe_mmio_write32(gt, GDRST, GRDOM_FULL);
>  	err = xe_mmio_wait32(gt, GDRST, 0, GRDOM_FULL, 5000,
>  			     NULL, false);
>  	if (err)
> -		drm_err(&xe->drm,
> -			"GT reset failed to clear GEN11_GRDOM_FULL\n");
> +		gt_err(gt, "reset failed to clear GRDOM_FULL\n");
>  
>  	return err;
>  }
> @@ -494,14 +497,13 @@ static int do_gt_restart(struct xe_gt *gt)
>  
>  static int gt_reset(struct xe_gt *gt)
>  {
> -	struct xe_device *xe = gt_to_xe(gt);
>  	int err;
>  
>  	/* We only support GT resets with GuC submission */
>  	if (!xe_device_guc_submission_enabled(gt_to_xe(gt)))
>  		return -ENODEV;
>  
> -	drm_info(&xe->drm, "GT reset started\n");
> +	gt_info(gt, "reset started\n");
>  
>  	xe_gt_sanitize(gt);
>  
> @@ -530,7 +532,7 @@ static int gt_reset(struct xe_gt *gt)
>  	err = xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>  	XE_WARN_ON(err);
>  
> -	drm_info(&xe->drm, "GT reset done\n");
> +	gt_info(gt, "reset done\n");
>  
>  	return 0;
>  
> @@ -539,7 +541,7 @@ static int gt_reset(struct xe_gt *gt)
>  err_msg:
>  	XE_WARN_ON(xe_uc_start(&gt->uc));
>  	xe_device_mem_access_put(gt_to_xe(gt));
> -	drm_err(&xe->drm, "GT reset failed, err=%d\n", err);
> +	gt_err(gt, "reset failed, err=%d\n", err);
>  
>  	return err;
>  }
> @@ -553,15 +555,13 @@ static void gt_reset_worker(struct work_struct *w)
>  
>  void xe_gt_reset_async(struct xe_gt *gt)
>  {
> -	struct xe_device *xe = gt_to_xe(gt);
> -
> -	drm_info(&xe->drm, "Try GT reset\n");
> +	gt_info(gt, "Try GT reset\n");
>  
>  	/* Don't do a reset while one is already in flight */
>  	if (xe_uc_reset_prepare(&gt->uc))
>  		return;
>  
> -	drm_info(&xe->drm, "Doing GT reset\n");
> +	gt_info(gt, "Doing GT reset\n");
>  	queue_work(gt->ordered_wq, &gt->reset.worker);
>  }
>  
> @@ -578,7 +578,6 @@ void xe_gt_suspend_prepare(struct xe_gt *gt)
>  
>  int xe_gt_suspend(struct xe_gt *gt)
>  {
> -	struct xe_device *xe = gt_to_xe(gt);
>  	int err;
>  
>  	/* For now suspend/resume is only allowed with GuC */
> @@ -598,7 +597,7 @@ int xe_gt_suspend(struct xe_gt *gt)
>  
>  	xe_device_mem_access_put(gt_to_xe(gt));
>  	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
> -	drm_info(&xe->drm, "GT suspended\n");
> +	gt_info(gt, "suspended\n");
>  
>  	return 0;
>  
> @@ -606,14 +605,13 @@ int xe_gt_suspend(struct xe_gt *gt)
>  	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
>  err_msg:
>  	xe_device_mem_access_put(gt_to_xe(gt));
> -	drm_err(&xe->drm, "GT suspend failed: %d\n", err);
> +	gt_err(gt, "suspend failed: %d\n", err);
>  
>  	return err;
>  }
>  
>  int xe_gt_resume(struct xe_gt *gt)
>  {
> -	struct xe_device *xe = gt_to_xe(gt);
>  	int err;
>  
>  	xe_device_mem_access_get(gt_to_xe(gt));
> @@ -627,7 +625,7 @@ int xe_gt_resume(struct xe_gt *gt)
>  
>  	xe_device_mem_access_put(gt_to_xe(gt));
>  	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
> -	drm_info(&xe->drm, "GT resumed\n");
> +	gt_info(gt, "resumed\n");
>  
>  	return 0;
>  
> @@ -635,7 +633,7 @@ int xe_gt_resume(struct xe_gt *gt)
>  	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
>  err_msg:
>  	xe_device_mem_access_put(gt_to_xe(gt));
> -	drm_err(&xe->drm, "GT resume failed: %d\n", err);
> +	gt_err(gt, "resume failed: %d\n", err);
>  
>  	return err;
>  }
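
To illustrate how the prefixing macros in the patch work, here is a userspace
sketch under stated assumptions: gt_fmt() is a hypothetical analogue that
formats into a caller-supplied buffer with snprintf() instead of calling
drm_err()/drm_info(), and the structures are trimmed-down stand-ins.

```c
#include <stdio.h>

/* Trimmed-down stand-ins for the driver structures. */
struct xe_tile {
	unsigned int id;
};

struct xe_gt {
	unsigned int id;
	struct xe_tile *tile;
};

/*
 * Hypothetical analogue of gt_err()/gt_info(): the prefix is glued onto the
 * caller's format by string-literal concatenation, and the GT and tile IDs
 * are passed ahead of the caller's own arguments. ##__VA_ARGS__ is the GNU
 * extension that drops the trailing comma when no extra arguments are given.
 */
#define gt_fmt(_buf, _len, _gt, _fmt, ...) \
	snprintf(_buf, _len, "GT%u (Tile%u): " _fmt, \
		 (_gt)->id, (_gt)->tile->id, ##__VA_ARGS__)
```

Because the prefix relies on literal concatenation, the format argument has to
be a string literal, which is already the case for every call site converted
in this patch.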


* Re: [Intel-xe] [PATCH 06/26] drm/xe: Move VRAM from GT to tile
  2023-05-15 22:40   ` Lucas De Marchi
@ 2023-05-18 17:29     ` Rodrigo Vivi
  0 siblings, 0 replies; 75+ messages in thread
From: Rodrigo Vivi @ 2023-05-18 17:29 UTC (permalink / raw)
  To: Lucas De Marchi; +Cc: Matt Roper, intel-xe

On Mon, May 15, 2023 at 03:40:16PM -0700, Lucas De Marchi wrote:
> On Wed, May 10, 2023 at 08:47:02PM -0700, Matt Roper wrote:
> > On platforms with VRAM, the VRAM is associated with the tile, not the
> > GT.
> > 
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> > drivers/gpu/drm/xe/Makefile                   |  1 +
> > drivers/gpu/drm/xe/display/xe_fb_pin.c        |  6 +-
> > drivers/gpu/drm/xe/display/xe_plane_initial.c |  8 +-
> 
> I'm not sure of the best way to handle the display. In my refactors I've
> been leaving those changes in a separate patch, even if it breaks the
> build. The reason is that when the rebase happens and display is moved up,
> we don't risk losing these hunks.

Yes, please. Display in a separate patch would be awesome!

> 
> > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > index 2481b2045284..6b9e7847161c 100644
> > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > @@ -53,6 +53,8 @@
> > 		 const struct xe_tile *: (const struct xe_device *)((tile__)->xe),	\
> > 		 struct xe_tile *: (tile__)->xe)
> > 
> > +struct xe_ggtt;
> > +
> > /**
> >  * struct xe_tile - hardware tile structure
> >  *
> > @@ -96,6 +98,40 @@ struct xe_tile {
> > 		/** @regs: pointer to tile's MMIO space (starting with registers) */
> > 		void *regs;
> > 	} mmio;
> > +
> > +	/** @mem: memory management info for tile */
> > +	struct {
> > +		/**
> > +		 * @vram: VRAM info for tile.
> > +		 *
> > +		 * Although VRAM is associated with a specific tile, it can
> > +		 * still be accessed by all tiles' GTs.
> > +		 */
> > +		struct {
> > +			/** @io_start: IO start address of this VRAM instance */
> > +			resource_size_t io_start;
> > +			/**
> > +			 * @io_size: IO size of this VRAM instance
> > +			 *
> > +			 * This represents how much of this VRAM we can access
> > +			 * via the CPU through the VRAM BAR. This can be smaller
> > +			 * than @size, in which case only part of VRAM is CPU
> > +			 * accessible (typically the first 256M). This
> > +			 * configuration is known as small-bar.
> > +			 */
> > +			resource_size_t io_size;
> > +			/** @size: size of VRAM. */
> > +			resource_size_t size;
> > +			/** @mapping: pointer to VRAM mappable space */
> > +			void *__iomem mapping;
> > +		} vram;
> > +
> > +		/** @vram_mgr: VRAM TTM manager */
> > +		struct xe_ttm_vram_mgr *vram_mgr;
> > +
> > +		/** @ggtt: Global graphics translation table */
> > +		struct xe_ggtt *ggtt;
> 
> I guess the ggtt should be moved in a separate patch?
> 
> Other than that it seems good to me, but since it has a lot of
> mechanical changes and no CI, it's hard to judge correctness.
> 
> 
> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
> 
> 
> Lucas De Marchi


* Re: [Intel-xe] [PATCH 02/26] drm/xe: Introduce xe_tile
  2023-05-11  3:46 ` [Intel-xe] [PATCH 02/26] drm/xe: Introduce xe_tile Matt Roper
                     ` (2 preceding siblings ...)
  2023-05-12  5:45   ` Iddamsetty, Aravind
@ 2023-05-18 17:35   ` Rodrigo Vivi
  3 siblings, 0 replies; 75+ messages in thread
From: Rodrigo Vivi @ 2023-05-18 17:35 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:46:58PM -0700, Matt Roper wrote:
> Create a new xe_tile structure to begin separating the concept of "tile"
> from "GT."  A tile is effectively a complete GPU, and a GT is just one
> part of that.  On platforms like MTL, there's only a single full GPU
> (tile) which has its IP blocks provided by two GTs.  In contrast, a
> "multi-tile" platform like PVC is basically multiple complete GPUs
> packed behind a single PCI device.
> 
> For now, just create xe_tile as a simple wrapper around xe_gt.  The
> items in xe_gt that are truly tied to the tile rather than the GT will
> be moved in future patches.  Support for multiple GTs per tile (i.e.,
> the MTL standalone media case) will also be re-introduced in a future
> patch.
> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_device.h       | 11 +++++---
>  drivers/gpu/drm/xe/xe_device_types.h | 40 +++++++++++++++++++++++++---
>  drivers/gpu/drm/xe/xe_gt_types.h     | 15 +++++++----
>  drivers/gpu/drm/xe/xe_mmio.c         | 13 ++++-----
>  drivers/gpu/drm/xe/xe_pci.c          |  5 +++-
>  drivers/gpu/drm/xe/xe_vm.c           |  2 +-
>  drivers/gpu/drm/xe/xe_vm_types.h     |  8 +++---
>  7 files changed, 71 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index cbae480a2092..f7acaf51a1fc 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -48,12 +48,17 @@ static inline struct xe_file *to_xe_file(const struct drm_file *file)
>  	return file->driver_priv;
>  }
>  
> +static inline struct xe_tile *xe_device_get_root_tile(struct xe_device *xe)
> +{
> +	return &xe->tiles[0];
> +}
> +
>  static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
>  {
>  	struct xe_gt *gt;
>  
> -	XE_BUG_ON(gt_id > XE_MAX_GT);
> -	gt = xe->gt + gt_id;
> +	XE_BUG_ON(gt_id > XE_MAX_TILES_PER_DEVICE);
> +	gt = &xe->tiles[gt_id].primary_gt;
>  	XE_BUG_ON(gt->info.id != gt_id);
>  	XE_BUG_ON(gt->info.type == XE_GT_TYPE_UNINITIALIZED);
>  
> @@ -65,7 +70,7 @@ static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
>   */
>  static inline struct xe_gt *to_gt(struct xe_device *xe)
>  {
> -	return xe->gt;
> +	return &xe_device_get_root_tile(xe)->primary_gt;
>  }
>  
>  static inline bool xe_device_guc_submission_enabled(struct xe_device *xe)
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 6490a04614ce..5dcf1695925f 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -34,7 +34,7 @@
>  
>  #define XE_GT0		0
>  #define XE_GT1		1
> -#define XE_MAX_GT	(XE_GT1 + 1)
> +#define XE_MAX_TILES_PER_DEVICE	(XE_GT1 + 1)
>  
>  #define XE_MAX_ASID	(BIT(20))
>  
> @@ -48,6 +48,40 @@
>  	 (_xe)->info.step.graphics >= (min_step) &&			\
>  	 (_xe)->info.step.graphics < (max_step))
>  
> +#define tile_to_xe(tile__)								\
> +	_Generic(tile__,								\
> +		 const struct xe_tile *: (const struct xe_device *)((tile__)->xe),	\
> +		 struct xe_tile *: (tile__)->xe)
> +
> +/**
> + * struct xe_tile - hardware tile structure
> + *
> + * From a driver perspective, a "tile" is effectively a complete GPU, containing
> + * an SGunit, 1-2 GTs, and (for discrete platforms) VRAM.
> + *
> + * Multi-tile platforms effectively bundle multiple GPUs behind a single PCI
> + * device and designate one "root" tile as being responsible for external PCI
> + * communication.  PCI BAR0 exposes the GGTT and MMIO register space for each
> + * tile in a stacked layout, and PCI BAR2 exposes the local memory associated
> + * with each tile similarly.  Device-wide interrupts can be enabled/disabled
> + * at the root tile, and the MSTR_TILE_INTR register will report which tiles
> + * have interrupts that need servicing.
> + */
> +struct xe_tile {
> +	/** @xe: Backpointer to tile's PCI device */
> +	struct xe_device *xe;
> +
> +	/** @id: ID of the tile */
> +	u8 id;
> +
> +	/**
> +	 * @primary_gt: Primary GT
> +	 */
> +	struct xe_gt primary_gt;
> +
> +	/* TODO: Add media GT here */
> +};
> +
>  /**
>   * struct xe_device - Top level struct of XE device
>   */
> @@ -248,8 +282,8 @@ struct xe_device {
>  	/** @ordered_wq: used to serialize compute mode resume */
>  	struct workqueue_struct *ordered_wq;
>  
> -	/** @gt: graphics tile */
> -	struct xe_gt gt[XE_MAX_GT];
> +	/** @tiles: device tiles */
> +	struct xe_tile tiles[XE_MAX_TILES_PER_DEVICE];
>  
>  	/**
>  	 * @mem_access: keep track of memory access in the device, possibly
> diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
> index 7c47d67aa8be..e0ed4508269b 100644
> --- a/drivers/gpu/drm/xe/xe_gt_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_types.h
> @@ -77,12 +77,17 @@ enum xe_steering_type {
>  };
>  
>  /**
> - * struct xe_gt - Top level struct of a graphics tile
> + * struct xe_gt - A "Graphics Technology" unit of the GPU
>   *
> - * A graphics tile may be a physical split (duplicate pieces of silicon,
> - * different GGTT + VRAM) or a virtual split (shared GGTT + VRAM). Either way
> - * this structure encapsulates of everything a GT is (MMIO, VRAM, memory
> - * management, microcontrols, and a hardware set of engines).
> + * A GT ("Graphics Technology") is the subset of a GPU primarily responsible
> + * for implementing the graphics and/or media IP.  It encapsulates the hardware

"what about compute?" we will hear this a lot

I know, the graphics portion is the one for 3D + compute, but we probably
need to be very clear in the documentation here.

> + * engines, programmable execution units, and GuC.   Each GT has its own
> + * handling of power management (RC6+forcewake) and multicast register
> + * steering.
> + *
> + * A GPU/tile may have a single GT that supplies all graphics and media
> + * functionality, or the graphics and media may be split into separate GTs
> + * within a tile.
>   */
>  struct xe_gt {
>  	/** @xe: backpointer to XE device */
> diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
> index 4804616a3c44..254b4a63d901 100644
> --- a/drivers/gpu/drm/xe/xe_mmio.c
> +++ b/drivers/gpu/drm/xe/xe_mmio.c
> @@ -399,6 +399,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
>  		  struct drm_file *file)
>  {
>  	struct xe_device *xe = to_xe_device(dev);
> +	struct xe_gt *gt = xe_device_get_gt(xe, 0);
>  	struct drm_xe_mmio *args = data;
>  	unsigned int bits_flag, bytes;
>  	struct xe_reg reg;
> @@ -440,7 +441,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
>  	 */
>  	reg = XE_REG(args->addr);
>  
> -	xe_force_wake_get(gt_to_fw(&xe->gt[0]), XE_FORCEWAKE_ALL);
> +	xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>  
>  	if (args->flags & DRM_XE_MMIO_WRITE) {
>  		switch (bits_flag) {
> @@ -449,10 +450,10 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
>  				ret = -EINVAL;
>  				goto exit;
>  			}
> -			xe_mmio_write32(to_gt(xe), reg, args->value);
> +			xe_mmio_write32(gt, reg, args->value);
>  			break;
>  		case DRM_XE_MMIO_64BIT:
> -			xe_mmio_write64(to_gt(xe), reg, args->value);
> +			xe_mmio_write64(gt, reg, args->value);
>  			break;
>  		default:
>  			drm_dbg(&xe->drm, "Invalid MMIO bit size");
> @@ -467,10 +468,10 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
>  	if (args->flags & DRM_XE_MMIO_READ) {
>  		switch (bits_flag) {
>  		case DRM_XE_MMIO_32BIT:
> -			args->value = xe_mmio_read32(to_gt(xe), reg);
> +			args->value = xe_mmio_read32(gt, reg);
>  			break;
>  		case DRM_XE_MMIO_64BIT:
> -			args->value = xe_mmio_read64(to_gt(xe), reg);
> +			args->value = xe_mmio_read64(gt, reg);
>  			break;
>  		default:
>  			drm_dbg(&xe->drm, "Invalid MMIO bit size");
> @@ -482,7 +483,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
>  	}
>  
>  exit:
> -	xe_force_wake_put(gt_to_fw(&xe->gt[0]), XE_FORCEWAKE_ALL);
> +	xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>  
>  	return ret;
>  }
> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> index bf2c234c4f6e..e79b16d8bf7f 100644
> --- a/drivers/gpu/drm/xe/xe_pci.c
> +++ b/drivers/gpu/drm/xe/xe_pci.c
> @@ -525,7 +525,10 @@ static int xe_info_init(struct xe_device *xe,
>  	xe->info.step = xe_step_get(xe);
>  
>  	for (id = 0; id < xe->info.tile_count; ++id) {
> -		gt = xe->gt + id;
> +		xe->tiles[id].xe = xe;
> +		xe->tiles[id].id = id;
> +
> +		gt = &xe->tiles[id].primary_gt;
>  		gt->info.id = id;
>  		gt->xe = xe;
>  
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 0a4becdf4675..fe6abb6ed6ca 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -3347,7 +3347,7 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
>  	struct xe_device *xe = vma->vm->xe;
>  	struct xe_gt *gt;
>  	u32 gt_needs_invalidate = 0;
> -	int seqno[XE_MAX_GT];
> +	int seqno[XE_MAX_TILES_PER_DEVICE];
>  	u8 id;
>  	int ret;
>  
> diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
> index fada7896867f..203ba9d946b8 100644
> --- a/drivers/gpu/drm/xe/xe_vm_types.h
> +++ b/drivers/gpu/drm/xe/xe_vm_types.h
> @@ -159,7 +159,7 @@ struct xe_vm {
>  	struct kref refcount;
>  
>  	/* engine used for (un)binding vma's */
> -	struct xe_engine *eng[XE_MAX_GT];
> +	struct xe_engine *eng[XE_MAX_TILES_PER_DEVICE];
>  
>  	/** Protects @rebind_list and the page-table structures */
>  	struct dma_resv resv;
> @@ -167,9 +167,9 @@ struct xe_vm {
>  	u64 size;
>  	struct rb_root vmas;
>  
> -	struct xe_pt *pt_root[XE_MAX_GT];
> -	struct xe_bo *scratch_bo[XE_MAX_GT];
> -	struct xe_pt *scratch_pt[XE_MAX_GT][XE_VM_MAX_LEVEL];
> +	struct xe_pt *pt_root[XE_MAX_TILES_PER_DEVICE];
> +	struct xe_bo *scratch_bo[XE_MAX_TILES_PER_DEVICE];
> +	struct xe_pt *scratch_pt[XE_MAX_TILES_PER_DEVICE][XE_VM_MAX_LEVEL];
>  
>  	/** @flags: flags for this VM, statically setup a creation time */
>  #define XE_VM_FLAGS_64K			BIT(0)
> -- 
> 2.40.0
> 


* Re: [Intel-xe] [PATCH 00/26] Separate GT and tile
  2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
                   ` (33 preceding siblings ...)
  2023-05-16 14:18 ` Das, Nirmoy
@ 2023-05-18 17:47 ` Rodrigo Vivi
  34 siblings, 0 replies; 75+ messages in thread
From: Rodrigo Vivi @ 2023-05-18 17:47 UTC (permalink / raw)
  To: Matt Roper
  Cc: Lucas De Marchi, Rodrigo Vivi, intel-xe, ville.syrjala,
	Nirmoy Das

On Wed, May 10, 2023 at 08:46:56PM -0700, Matt Roper wrote:
> A 'tile' is not the same thing as a 'GT.'  For historical reasons, i915
> attempted to use a single 'struct intel_gt' to represent both concepts,
> although this design hasn't worked out terribly well.  For Xe we have
> the opportunity to design the driver in a way that more accurately
> reflects the real hardware behavior.
> 
> Different vendors use the term "tile" a bit differently, but in the
> Intel world, a 'tile' is pretty close to what most people would think of

Even in the graphics world we have many different meanings for 'tile'
like the way to organize your pixels in memory, etc...

Other options could be sub_device, subdev, ...?  But anyway, as long
as it is well documented, any name should be good.

sub_device would align well with the current Level Zero API, but I heard
that some folks don't like that and were asking for Level Zero to change
and expose all sub_devices as devices, so I'm not sure it is worth
aligning.

> as being a complete GPU.  When multiple GPUs are placed behind a single
> PCI device, that's what we refer to as a "multi-tile device."  In such
> cases, pretty much all hardware is replicated per-tile, although certain
> responsibilities like PCI communication, reporting of interrupts to the
> OS, etc. are handled solely by the "root tile."  A multi-tile platform
> takes care of tying the tiles together in a way such that interrupt
> notifications from remote tiles are forwarded to the root tile, the
> per-tile vram is combined into a single address space, etc.
> 
> In contrast, a "GT" (which officially stands for "Graphics Technology")
> is the subset of a GPU/tile that is responsible for implementing
> graphics and/or media operations.  The GT is where a lot of the driver
> implementation happens since it's where the hardware engines, the
> execution units, and the GuC all reside.
> 
> Historically most Intel devices were single-tile devices that contained
> a single GT.  PVC is currently the only released Intel platform built on
> a multi-tile design (i.e., multiple GPUs behind a single PCI device);
> each PVC tile only has a single GT.  In contrast, platforms like MTL
> that have separate chips for render and media IP are still only a single
> logical GPU, but the graphics and media IP blocks are each
> exposed as a separate GT within that single GPU.  This is important from
> a software perspective because multi-GT platforms like MTL only
> replicate a subset of the GPU hardware and behave differently than
> multi-tile platforms like PVC where nearly everything is replicated.
> 
> This series separates tiles from GTs in a manner that more closely
> matches the hardware behavior.  We now consider a PCI device (xe_device)
> to contain one or more tiles (struct xe_tile).  Each tile will contain
> one or two GTs (struct xe_gt).  Although we don't have any platforms yet
> that are multi-tile *and* contain more than one GT per tile, that may
> change in the future.  This driver redesign splits functionality as
> follows:
> 
> Per-tile functionality (shared by all GTs within the tile):
>  - Complete 4MB MMIO space (containing SGunit/SoC registers, GT
>    registers, display registers, etc.)
>  - Global GTT
>  - VRAM (if discrete)
>  - Interrupt flows
>  - Migration context
>  - kernel batchbuffer pool
>  - Primary GT
>  - Media GT (if media version >= 13)
> 
> Per-GT functionality:
>  - GuC
>  - Hardware engines
>  - Programmable hardware units (subslices, EUs)
>  - GSI subset of registers (multiple copies of these registers reside
>    within the complete MMIO space provided by the tile, but at different
>    offsets --- 0 for render, 0x380000 for media)
>  - Multicast register steering
>  - TLBs to cache page table translations
>  - Reset capability
>  - Low-level power management (e.g., C6)
>  - Clock frequency
>  - MOCS and PAT programming

Everything above is very good text for a /** DOC: **/ page;
could you please add it to patch 2?
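
For illustration only, the per-tile vs. per-GT split quoted above could be
modeled in standalone C roughly as follows.  This is a sketch, not the
driver's code: every sketch_* name is hypothetical, and the only values
taken from the cover letter are the 4MB MMIO space, the GSI offsets
(0 and 0x380000), and the "media version >= 13" condition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TILE_MMIO_SIZE   0x400000u /* complete 4MB MMIO space per tile */
#define SKETCH_MEDIA_GSI_OFFSET 0x380000u /* GSI copy used by the media GT */

struct sketch_tile;

/* Per-GT state: only what each GT has for itself (hypothetical layout). */
struct sketch_gt {
	struct sketch_tile *tile; /* owning tile, NULL if the GT is absent */
	uint32_t gsi_offset;      /* 0 for the primary GT, 0x380000 for media */
};

/* Per-tile state: shared by all GTs within the tile (hypothetical layout). */
struct sketch_tile {
	uint8_t id;
	uint32_t mmio_size;
	struct sketch_gt primary_gt;
	struct sketch_gt media_gt;
};

/* Set up one tile; the media GT only exists for media version >= 13. */
static void sketch_tile_init(struct sketch_tile *tile, uint8_t id,
			     unsigned int media_ver)
{
	assert(tile != NULL);

	tile->id = id;
	tile->mmio_size = SKETCH_TILE_MMIO_SIZE;

	tile->primary_gt.tile = tile;
	tile->primary_gt.gsi_offset = 0;

	if (media_ver >= 13) {
		tile->media_gt.tile = tile;
		tile->media_gt.gsi_offset = SKETCH_MEDIA_GSI_OFFSET;
	} else {
		tile->media_gt.tile = NULL;
		tile->media_gt.gsi_offset = 0;
	}
}

/* A tile contains one or two GTs. */
static unsigned int sketch_tile_gt_count(const struct sketch_tile *tile)
{
	return tile->media_gt.tile ? 2 : 1;
}
```

Keeping the GSI offset in the GT and the MMIO size in the tile mirrors the
ownership described in the two lists above.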

> 
> At the moment I've left USM / pagefault handling at the GT level,
> although I'm not familiar enough with that specific feature to know
> whether it's truly correct or not.
> 
> The first patch in this series temporarily drops MTL media GT support.
> The driver doesn't load properly on MTL today, largely due to the
> mishandling of GT vs tile; dropping support completely allows us to more
> easily make the necessary driver redesign.  The media GT is
> re-enabled (properly this time) near the end of the series and this
> allows the driver to load successfully without error on MTL for the
> first time.  There are still issues when submitting workloads to MTL
> after driver load (i.e., CAT errors), but those seem to be separate
> platform-specific issues unrelated to the GT/tile work in this series
> that will need to be debugged and fixed separately.
> 
> 
> This series leaves a few open questions and FIXME's:
>  - Unlike i915, the Xe driver has chosen to expose GTs to userspace
>    rather than keeping them a hidden implementation detail.  With the
>    separation of xe_tile and xe_gt, we need to decide whether we also
>    want to expose tiles (in addition to GTs), whether we want to _only_
>    expose tiles (and keep the primary vs media GT separation a hidden
>    internal detail), or something else.

The same Level Zero alignment dilemma applies here...

>  - How should GTs be numbered?  Today it's straightforward --- PVC
>    assigns GT IDs 0 and 1 to the primary GT of each tile.  MTL assigns
>    GT IDs 0 and 1 to the primary and media GTs of its sole tile.  But if
>    we have a platform in the future that has multiple tiles _and_
>    multiple GTs per tile, how should we handle the numbering in that
>    case?

Exposing the sub_device/tile would likely make this numbering easier,
but if future hw changes the split again, we are misaligned again...

So, no strong opinion here...

One thing I had in mind before seeing your series was to keep things
as simple as a gt<n>/ directory with a name (type?) file:

$ cat gt0/name
Graphics-Root

$ cat gt1/name
Media

or

$ cat gt1/name
Graphics-Secondary
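
A minimal sketch of how such a name could be derived, assuming the name
depends only on the GT type and on whether the GT sits on the root tile.
The function and enum below are hypothetical and do not exist in the
driver; only the three strings come from the examples above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical GT classification for this sketch. */
enum sketch_gt_type {
	SKETCH_GT_GRAPHICS,
	SKETCH_GT_MEDIA,
};

/*
 * Format a human-readable GT name into buf.  Returns the length of the
 * name, as snprintf() does.
 */
static int sketch_gt_name(enum sketch_gt_type type, unsigned int tile_id,
			  char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	if (type == SKETCH_GT_MEDIA)
		return snprintf(buf, len, "Media");

	/* Graphics GT: tile 0 is the root tile. */
	return snprintf(buf, len, "Graphics-%s",
			tile_id == 0 ? "Root" : "Secondary");
}
```

With this scheme, gt1 would read "Graphics-Secondary" on a two-tile
platform like PVC and "Media" on MTL.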


>  - Xe (mis)design used xe_gt as the target of all MMIO operations (i.e.,
>    xe_mmio_*()).  This really doesn't make sense, especially since
>    there's a lot of MMIO accesses that are completely unrelated to GT
>    (i.e., sgunit registers, display registers, etc.).  i915 used
>    'intel_uncore' as the MMIO target, although that wasn't really an
>    accurate reflection of the hardware either.  What we really want is
>    something that combines the MMIO register space (stored in the tile)
>    with the GSI offset (stored in the GT).  My current plan is to
>    introduce an "xe_mmio_view" (name may change) in a future series that
>    will serve as a target for register operations.  There will be
>    sensible APIs to obtain an xe_mmio_view appropriate to the type of
>    register access being performed (and that will also be able to do
>    some range sanity checking in debug drivers to help catch misuse).
>    That's a somewhat large/invasive change, so I'm saving that for a
>    follow-up series after this one is completed.

\o/

Ville was indeed complaining about this mmio misdesign.  Since I don't
like the i915 uncore either, I wasn't sure what to do about it, but I
like your idea here very much.
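
To make the idea concrete, here is a rough standalone sketch of a "view"
that pairs the tile's register space with a GT's GSI offset and does the
range sanity check mentioned above.  The struct and function names are
hypothetical (the series only proposes "xe_mmio_view" as a possible name),
and applying the offset to every register is a simplification assumed for
this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view: tile register space plus a per-GT offset. */
struct sketch_mmio_view {
	uint32_t regs_size;  /* size of the tile's MMIO space, in bytes */
	uint32_t gsi_offset; /* 0 for tile-level access, 0x380000 for media */
};

/*
 * Translate a register offset into an offset within the tile's MMIO
 * space.  Returns false if a 32-bit access would fall outside the space.
 */
static bool sketch_mmio_resolve(const struct sketch_mmio_view *view,
				uint32_t reg, uint32_t *addr)
{
	uint64_t final;

	assert(view != NULL && addr != NULL);

	/* Use 64-bit math so the addition itself cannot wrap. */
	final = (uint64_t)reg + view->gsi_offset;
	if (final + sizeof(uint32_t) > view->regs_size)
		return false;

	*addr = (uint32_t)final;
	return true;
}
```

A caller would obtain a view matching the kind of register being accessed
and could turn a false return into a debug warning.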

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>

> 
> 
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
> Cc: Nirmoy Das <nirmoy.das@intel.com>
> 
> 
> Matt Roper (26):
>   drm/xe/mtl: Disable media GT
>   drm/xe: Introduce xe_tile
>   drm/xe: Add backpointer from gt to tile
>   drm/xe: Add for_each_tile iterator
>   drm/xe: Move register MMIO into xe_tile
>   drm/xe: Move VRAM from GT to tile
>   drm/xe: Memory allocations are tile-based, not GT-based
>   drm/xe: Move migration from GT to tile
>   drm/xe: Clarify 'gt' retrieval for primary tile
>   drm/xe: Drop vram_id
>   drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE
>   drm/xe: Allocate GT dynamically
>   drm/xe: Add media GT to tile
>   drm/xe: Move display IRQ postinstall out of GT function
>   drm/xe: Interrupts are delivered per-tile, not per-GT
>   drm/xe/irq: Handle ASLE backlight interrupts at same time as display
>   drm/xe/irq: Actually call xe_irq_postinstall()
>   drm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt
>     mask
>   drm/xe/irq: Untangle postinstall functions
>   drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe
>   drm/xe: Invalidate TLB on all affected GTs during GGTT updates
>   drm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations
>   drm/xe: Allow GT looping and lookup on standalone media
>   drm/xe: Update query uapi to support standalone media
>   drm/xe: Reinstate media GT support
>   drm/xe: Clarify source of GT log messages
> 
>  drivers/gpu/drm/i915/display/intel_dsb.c      |   5 +-
>  drivers/gpu/drm/i915/display/intel_fbc.c      |   3 +-
>  drivers/gpu/drm/i915/display/intel_fbdev.c    |   7 +-
>  drivers/gpu/drm/xe/Makefile                   |   1 +
>  .../drm/xe/compat-i915-headers/intel_uncore.h |   2 +-
>  drivers/gpu/drm/xe/display/ext/i915_irq.c     |   2 +-
>  drivers/gpu/drm/xe/display/xe_fb_pin.c        |  13 +-
>  drivers/gpu/drm/xe/display/xe_plane_initial.c |   8 +-
>  drivers/gpu/drm/xe/regs/xe_gt_regs.h          |   8 +
>  drivers/gpu/drm/xe/tests/xe_bo.c              |   8 +-
>  drivers/gpu/drm/xe/tests/xe_migrate.c         |  15 +-
>  drivers/gpu/drm/xe/xe_bb.c                    |   5 +-
>  drivers/gpu/drm/xe/xe_bo.c                    | 104 ++---
>  drivers/gpu/drm/xe/xe_bo.h                    |  20 +-
>  drivers/gpu/drm/xe/xe_bo_evict.c              |  22 +-
>  drivers/gpu/drm/xe/xe_bo_types.h              |   4 +-
>  drivers/gpu/drm/xe/xe_device.c                |  12 +-
>  drivers/gpu/drm/xe/xe_device.h                |  49 ++-
>  drivers/gpu/drm/xe/xe_device_types.h          | 107 ++++-
>  drivers/gpu/drm/xe/xe_engine.c                |   2 +-
>  drivers/gpu/drm/xe/xe_ggtt.c                  |  45 +-
>  drivers/gpu/drm/xe/xe_ggtt.h                  |   6 +-
>  drivers/gpu/drm/xe/xe_ggtt_types.h            |   2 +-
>  drivers/gpu/drm/xe/xe_gt.c                    | 191 ++-------
>  drivers/gpu/drm/xe/xe_gt.h                    |   8 +-
>  drivers/gpu/drm/xe/xe_gt_debugfs.c            |   8 +-
>  drivers/gpu/drm/xe/xe_gt_mcr.c                |   2 +-
>  drivers/gpu/drm/xe/xe_gt_pagefault.c          |  16 +-
>  drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c   |   4 +-
>  drivers/gpu/drm/xe/xe_gt_types.h              |  87 ++--
>  drivers/gpu/drm/xe/xe_guc.c                   |  11 +-
>  drivers/gpu/drm/xe/xe_guc_ads.c               |   5 +-
>  drivers/gpu/drm/xe/xe_guc_ct.c                |   5 +-
>  drivers/gpu/drm/xe/xe_guc_hwconfig.c          |   5 +-
>  drivers/gpu/drm/xe/xe_guc_log.c               |   6 +-
>  drivers/gpu/drm/xe/xe_guc_pc.c                |   5 +-
>  drivers/gpu/drm/xe/xe_hw_engine.c             |   6 +-
>  drivers/gpu/drm/xe/xe_irq.c                   | 393 +++++++++---------
>  drivers/gpu/drm/xe/xe_irq.h                   |   3 +-
>  drivers/gpu/drm/xe/xe_lrc.c                   |  13 +-
>  drivers/gpu/drm/xe/xe_lrc_types.h             |   4 +-
>  drivers/gpu/drm/xe/xe_migrate.c               |  76 ++--
>  drivers/gpu/drm/xe/xe_migrate.h               |   9 +-
>  drivers/gpu/drm/xe/xe_mmio.c                  |  92 ++--
>  drivers/gpu/drm/xe/xe_mmio.h                  |  21 +-
>  drivers/gpu/drm/xe/xe_mocs.c                  |  14 +-
>  drivers/gpu/drm/xe/xe_pci.c                   |  92 ++--
>  drivers/gpu/drm/xe/xe_pt.c                    | 150 ++++---
>  drivers/gpu/drm/xe/xe_pt.h                    |  14 +-
>  drivers/gpu/drm/xe/xe_query.c                 |  32 +-
>  drivers/gpu/drm/xe/xe_res_cursor.h            |   2 +-
>  drivers/gpu/drm/xe/xe_sa.c                    |  13 +-
>  drivers/gpu/drm/xe/xe_sa.h                    |   4 +-
>  drivers/gpu/drm/xe/xe_tile.c                  |  89 ++++
>  drivers/gpu/drm/xe/xe_tile.h                  |  16 +
>  drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c        |   4 +-
>  drivers/gpu/drm/xe/xe_ttm_vram_mgr.c          |  16 +-
>  drivers/gpu/drm/xe/xe_ttm_vram_mgr.h          |   4 +-
>  drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h    |   6 +-
>  drivers/gpu/drm/xe/xe_uc_fw.c                 |   5 +-
>  drivers/gpu/drm/xe/xe_vm.c                    | 156 ++++---
>  drivers/gpu/drm/xe/xe_vm.h                    |   2 +-
>  drivers/gpu/drm/xe/xe_vm_types.h              |  22 +-
>  include/uapi/drm/xe_drm.h                     |   4 +-
>  64 files changed, 1108 insertions(+), 957 deletions(-)
>  create mode 100644 drivers/gpu/drm/xe/xe_tile.c
>  create mode 100644 drivers/gpu/drm/xe/xe_tile.h
> 
> -- 
> 2.40.0
> 


* Re: [Intel-xe] [PATCH 13/26] drm/xe: Add media GT to tile
  2023-05-11  3:47 ` [Intel-xe] [PATCH 13/26] drm/xe: Add media GT to tile Matt Roper
@ 2023-05-18 17:50   ` Rodrigo Vivi
  0 siblings, 0 replies; 75+ messages in thread
From: Rodrigo Vivi @ 2023-05-18 17:50 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:09PM -0700, Matt Roper wrote:
> This media_gt pointer isn't actually allocated yet.  Future patches will
> start hooking it up at appropriate places in the code, and then creation
> of the media GT will be added once those infrastructure changes are in
> place.

I think it could be squashed into some other patch, but anyway

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_device_types.h | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 1033f233f6ab..2cf67ea57aac 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -81,7 +81,12 @@ struct xe_tile {
>  	 */
>  	struct xe_gt *primary_gt;
>  
> -	/* TODO: Add media GT here */
> +	/**
> +	 * @media_gt: Media GT
> +	 *
> +	 * Only present on devices with media version >= 13.
> +	 */
> +	struct xe_gt *media_gt;
>  
>  	/**
>  	 * @mmio: MMIO info for a tile.
> -- 
> 2.40.0
> 


* Re: [Intel-xe] [PATCH 14/26] drm/xe: Move display IRQ postinstall out of GT function
  2023-05-11  3:47 ` [Intel-xe] [PATCH 14/26] drm/xe: Move display IRQ postinstall out of GT function Matt Roper
@ 2023-05-18 17:51   ` Rodrigo Vivi
  2023-05-18 18:20   ` Lucas De Marchi
  1 sibling, 0 replies; 75+ messages in thread
From: Rodrigo Vivi @ 2023-05-18 17:51 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:10PM -0700, Matt Roper wrote:
> Display interrupts are unrelated to the GT (and are also only relevant
> to the root tile).  Move the postinstall call up a level in the
> callstack.

good catch!

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_irq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
> index 806121009102..494ec5567e50 100644
> --- a/drivers/gpu/drm/xe/xe_irq.c
> +++ b/drivers/gpu/drm/xe/xe_irq.c
> @@ -490,8 +490,6 @@ void xe_gt_irq_postinstall(struct xe_gt *gt)
>  		dg1_irq_postinstall(xe, gt);
>  	else
>  		xelp_irq_postinstall(xe, gt);
> -
> -	xe_display_irq_postinstall(xe, gt);
>  }
>  
>  static void xe_irq_postinstall(struct xe_device *xe)
> @@ -501,6 +499,8 @@ static void xe_irq_postinstall(struct xe_device *xe)
>  
>  	for_each_gt(gt, xe, id)
>  		xe_gt_irq_postinstall(gt);
> +
> +	xe_display_irq_postinstall(xe, xe_primary_mmio_gt(xe));
>  }
>  
>  static irq_handler_t xe_irq_handler(struct xe_device *xe)
> -- 
> 2.40.0
> 


* Re: [Intel-xe] [PATCH 14/26] drm/xe: Move display IRQ postinstall out of GT function
  2023-05-11  3:47 ` [Intel-xe] [PATCH 14/26] drm/xe: Move display IRQ postinstall out of GT function Matt Roper
  2023-05-18 17:51   ` Rodrigo Vivi
@ 2023-05-18 18:20   ` Lucas De Marchi
  1 sibling, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-18 18:20 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:10PM -0700, Matt Roper wrote:
>Display interrupts are unrelated to the GT (and are also only relevant
>to the root tile).  Move the postinstall call up a level in the
>callstack.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

Lucas De Marchi

>---
> drivers/gpu/drm/xe/xe_irq.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
>index 806121009102..494ec5567e50 100644
>--- a/drivers/gpu/drm/xe/xe_irq.c
>+++ b/drivers/gpu/drm/xe/xe_irq.c
>@@ -490,8 +490,6 @@ void xe_gt_irq_postinstall(struct xe_gt *gt)
> 		dg1_irq_postinstall(xe, gt);
> 	else
> 		xelp_irq_postinstall(xe, gt);
>-
>-	xe_display_irq_postinstall(xe, gt);
> }
>
> static void xe_irq_postinstall(struct xe_device *xe)
>@@ -501,6 +499,8 @@ static void xe_irq_postinstall(struct xe_device *xe)
>
> 	for_each_gt(gt, xe, id)
> 		xe_gt_irq_postinstall(gt);
>+
>+	xe_display_irq_postinstall(xe, xe_primary_mmio_gt(xe));
> }
>
> static irq_handler_t xe_irq_handler(struct xe_device *xe)
>-- 
>2.40.0
>


* Re: [Intel-xe] [PATCH 15/26] drm/xe: Interrupts are delivered per-tile, not per-GT
  2023-05-11  3:47 ` [Intel-xe] [PATCH 15/26] drm/xe: Interrupts are delivered per-tile, not per-GT Matt Roper
  2023-05-11 12:14   ` Iddamsetty, Aravind
@ 2023-05-18 18:30   ` Lucas De Marchi
  1 sibling, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-18 18:30 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:11PM -0700, Matt Roper wrote:
>IRQ delivery and handling needs to be handled on a per-tile basis.  Note
>that this is true even for the "GT interrupts" relating to engines and
>GuCs --- the interrupts relating to both GTs get raised through a single
>set of registers in the tile's sgunit range.
>
>The (mis)use of struct xe_gt as a target for MMIO operations in the
>driver makes the code somewhat confusing since we wind up needing a GT
>pointer to handle programming that's unrelated to the GT.  To mitigate
>this confusion, all of the xe_gt structures used solely as an MMIO
>target in interrupt code are renamed to 'mmio.'  Reworking the driver's
>MMIO handling to not be dependent on xe_gt is planned as a future
>update.
>
>Note that GT initialization code currently calls xe_gt_irq_postinstall()
>in an attempt to enable the HWE interrupts for the GT being initialized.
>Unfortunately xe_gt_irq_postinstall() doesn't really match its name and
>does a bunch of other stuff unrelated to the GT interrupts (such as
>enabling the top-level device interrupts).  That will be addressed in
>future patches.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/xe/xe_gt.c  |   2 +-
> drivers/gpu/drm/xe/xe_irq.c | 334 ++++++++++++++++++++----------------
> drivers/gpu/drm/xe/xe_irq.h |   4 +-
> 3 files changed, 187 insertions(+), 153 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
>index 20663cd0ddaf..e00d260dff00 100644
>--- a/drivers/gpu/drm/xe/xe_gt.c
>+++ b/drivers/gpu/drm/xe/xe_gt.c
>@@ -303,7 +303,7 @@ static int gt_fw_domain_init(struct xe_gt *gt)
> 	gt->info.engine_mask = gt->info.__engine_mask;
>
> 	/* Enables per hw engine IRQs */
>-	xe_gt_irq_postinstall(gt);
>+	xe_gt_irq_postinstall(gt_to_tile(gt));
>
> 	/* Rerun MCR init as we now have hw engine list */
> 	xe_gt_mcr_init(gt);
>diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
>index 494ec5567e50..fa7d04ba23c0 100644
>--- a/drivers/gpu/drm/xe/xe_irq.c
>+++ b/drivers/gpu/drm/xe/xe_irq.c
>@@ -27,60 +27,66 @@
> #define IIR(offset)				XE_REG(offset + 0x8)
> #define IER(offset)				XE_REG(offset + 0xc)
>
>-static void assert_iir_is_zero(struct xe_gt *gt, struct xe_reg reg)
>+static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
> {
>-	u32 val = xe_mmio_read32(gt, reg);
>+	u32 val = xe_mmio_read32(mmio, reg);
>
> 	if (val == 0)
> 		return;
>
>-	drm_WARN(&gt_to_xe(gt)->drm, 1,
>+	drm_WARN(&gt_to_xe(mmio)->drm, 1,
> 		 "Interrupt register 0x%x is not zero: 0x%08x\n",
> 		 reg.addr, val);
>-	xe_mmio_write32(gt, reg, 0xffffffff);
>-	xe_mmio_read32(gt, reg);
>-	xe_mmio_write32(gt, reg, 0xffffffff);
>-	xe_mmio_read32(gt, reg);
>+	xe_mmio_write32(mmio, reg, 0xffffffff);
>+	xe_mmio_read32(mmio, reg);
>+	xe_mmio_write32(mmio, reg, 0xffffffff);
>+	xe_mmio_read32(mmio, reg);
> }
>
> /*
>  * Unmask and enable the specified interrupts.  Does not check current state,
>  * so any bits not specified here will become masked and disabled.
>  */
>-static void unmask_and_enable(struct xe_gt *gt, u32 irqregs, u32 bits)
>+static void unmask_and_enable(struct xe_tile *tile, u32 irqregs, u32 bits)
> {
>+	struct xe_gt *mmio = tile->primary_gt;


AFAICS mmio is always assigned to primary_gt. Why not call it primary_gt
and avoid the confusion and name mismatch with the type?

If I see a function assert_iir_is_zero(struct xe_gt *primary_gt, struct xe_reg reg)
there's a hint that function is only supposed to be called with the
primary_gt. If it can be any type of gt, then simply calling it gt
rather than mmio would be more straightforward.

Lucas De Marchi


* Re: [Intel-xe] [PATCH 16/26] drm/xe/irq: Handle ASLE backlight interrupts at same time as display
  2023-05-11  3:47 ` [Intel-xe] [PATCH 16/26] drm/xe/irq: Handle ASLE backlight interrupts at same time as display Matt Roper
@ 2023-05-18 18:33   ` Lucas De Marchi
  0 siblings, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-18 18:33 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:12PM -0700, Matt Roper wrote:
>Our only use of GUnit interrupts is to handle ASLE backlight operations
>that are reported as GUnit GSE interrupts.  Move the enable/disable of
>these interrupts adjacent to display interrupts.
>
>In the future we may want to even move these inside the
>xe_display_irq_*() functions.  But since these rely on xe_irq static
>functions like mask_and_disable() it's easier to keep them separate for
>now.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

Lucas De Marchi

>---
> drivers/gpu/drm/xe/xe_irq.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
>index fa7d04ba23c0..02f44b58ce3e 100644
>--- a/drivers/gpu/drm/xe/xe_irq.c
>+++ b/drivers/gpu/drm/xe/xe_irq.c
>@@ -187,8 +187,6 @@ static void xelp_irq_postinstall(struct xe_device *xe, struct xe_tile *tile)
>
> 	gt_irq_postinstall(tile);
>
>-	unmask_and_enable(tile, GU_MISC_IRQ_OFFSET, GU_MISC_GSE);
>-
> 	xelp_intr_enable(xe, true);
> }
>
>@@ -372,8 +370,6 @@ static void dg1_irq_postinstall(struct xe_device *xe, struct xe_tile *tile)
> {
> 	gt_irq_postinstall(tile);
>
>-	unmask_and_enable(tile, GU_MISC_IRQ_OFFSET, GU_MISC_GSE);
>-
> 	if (tile->id == 0)
> 		dg1_intr_enable(xe, true);
> }
>@@ -486,7 +482,6 @@ static void xelp_irq_reset(struct xe_tile *tile)
>
> 	gt_irq_reset(tile);
>
>-	mask_and_disable(tile, GU_MISC_IRQ_OFFSET);
> 	mask_and_disable(tile, PCU_IRQ_OFFSET);
> }
>
>@@ -497,7 +492,6 @@ static void dg1_irq_reset(struct xe_tile *tile)
>
> 	gt_irq_reset(tile);
>
>-	mask_and_disable(tile, GU_MISC_IRQ_OFFSET);
> 	mask_and_disable(tile, PCU_IRQ_OFFSET);
> }
>
>@@ -513,6 +507,8 @@ static void xe_irq_reset(struct xe_device *xe)
> 			xelp_irq_reset(tile);
> 	}
>
>+	tile = xe_device_get_root_tile(xe);
>+	mask_and_disable(tile, GU_MISC_IRQ_OFFSET);
> 	xe_display_irq_reset(xe);
> }
>
>@@ -535,6 +531,13 @@ static void xe_irq_postinstall(struct xe_device *xe)
> 		xe_gt_irq_postinstall(tile);
>
> 	xe_display_irq_postinstall(xe, xe_primary_mmio_gt(xe));
>+
>+	/*
>+	 * ASLE backlight operations are reported via GUnit GSE interrupts
>+	 * on the root tile.
>+	 */
>+	unmask_and_enable(xe_device_get_root_tile(xe),
>+			  GU_MISC_IRQ_OFFSET, GU_MISC_GSE);
> }
>
> static irq_handler_t xe_irq_handler(struct xe_device *xe)
>-- 
>2.40.0
>


* Re: [Intel-xe] [PATCH 17/26] drm/xe/irq: Actually call xe_irq_postinstall()
  2023-05-11  3:47 ` [Intel-xe] [PATCH 17/26] drm/xe/irq: Actually call xe_irq_postinstall() Matt Roper
@ 2023-05-18 18:40   ` Lucas De Marchi
  0 siblings, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-18 18:40 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:13PM -0700, Matt Roper wrote:
>The xe_irq_postinstall() never actually gets called after installing the
>interrupt handler.  This oversight seems to get papered over due to the
>fact that the (misnamed) xe_gt_irq_postinstall does more than it really
>should and gets called in the middle of the GT initialization.

shouldn't we then fix that in the same patch too, moving the call to be
inside xe_irq?

Lucas De Marchi

>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/xe/xe_irq.c | 2 ++
> 1 file changed, 2 insertions(+)
>
>diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
>index 02f44b58ce3e..2549fd9fb5cd 100644
>--- a/drivers/gpu/drm/xe/xe_irq.c
>+++ b/drivers/gpu/drm/xe/xe_irq.c
>@@ -588,6 +588,8 @@ int xe_irq_install(struct xe_device *xe)
> 		return err;
> 	}
>
>+	xe_irq_postinstall(xe);
>+
> 	err = drmm_add_action_or_reset(&xe->drm, irq_uninstall, xe);
> 	if (err)
> 		return err;
>-- 
>2.40.0
>


* Re: [Intel-xe] [PATCH 19/26] drm/xe/irq: Untangle postinstall functions
  2023-05-11  3:47 ` [Intel-xe] [PATCH 19/26] drm/xe/irq: Untangle postinstall functions Matt Roper
@ 2023-05-18 18:45   ` Lucas De Marchi
  0 siblings, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-18 18:45 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:15PM -0700, Matt Roper wrote:
>The callstack for postinstall is a bit muddled with top-level device
>interrupt enablement happening within platform-specific functions called
>from the per-tile xe_gt_irq_postinstall() function.  If we pull
>top-level irq enablement up to xe_irq_postinstall where we'd expect it
>to be, we can eliminate some confusing layers of indirection.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/xe/xe_irq.c | 35 +++++++----------------------------
> 1 file changed, 7 insertions(+), 28 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
>index 2549fd9fb5cd..58745a5add87 100644
>--- a/drivers/gpu/drm/xe/xe_irq.c
>+++ b/drivers/gpu/drm/xe/xe_irq.c
>@@ -122,7 +122,7 @@ static inline void xelp_intr_enable(struct xe_device *xe, bool stall)
> 		xe_mmio_read32(mmio, GFX_MSTR_IRQ);
> }
>
>-static void gt_irq_postinstall(struct xe_tile *tile)
>+void xe_gt_irq_postinstall(struct xe_tile *tile)

Should this probably be squashed with the HEAD~2 patch that adds the
call to xe_irq_postinstall()?

This also seems to be named wrong: an xe_gt_* function that receives an
xe_tile as its first argument, inside a file named xe_irq.c. I suggest
renaming it to xe_irq_postinstall_tile().


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>


Lucas De Marchi



> {
> 	struct xe_device *xe = tile_to_xe(tile);
> 	struct xe_gt *mmio = tile->primary_gt;
>@@ -181,15 +181,6 @@ static void gt_irq_postinstall(struct xe_tile *tile)
> 	xe_mmio_write32(mmio, GUC_SG_INTR_MASK,  ~0);
> }
>
>-static void xelp_irq_postinstall(struct xe_device *xe, struct xe_tile *tile)
>-{
>-	/* TODO: PCH */
>-
>-	gt_irq_postinstall(tile);
>-
>-	xelp_intr_enable(xe, true);
>-}
>-
> static u32
> gt_engine_identity(struct xe_device *xe,
> 		   struct xe_gt *mmio,
>@@ -366,14 +357,6 @@ static void dg1_intr_enable(struct xe_device *xe, bool stall)
> 		xe_mmio_read32(mmio, DG1_MSTR_TILE_INTR);
> }
>
>-static void dg1_irq_postinstall(struct xe_device *xe, struct xe_tile *tile)
>-{
>-	gt_irq_postinstall(tile);
>-
>-	if (tile->id == 0)
>-		dg1_intr_enable(xe, true);
>-}
>-
> /*
>  * Top-level interrupt handler for Xe_LP+ and beyond.  These platforms have
>  * a "master tile" interrupt register which must be consulted before the
>@@ -512,16 +495,6 @@ static void xe_irq_reset(struct xe_device *xe)
> 	xe_display_irq_reset(xe);
> }
>
>-void xe_gt_irq_postinstall(struct xe_tile *tile)
>-{
>-	struct xe_device *xe = tile_to_xe(tile);
>-
>-	if (GRAPHICS_VERx100(xe) >= 1210)
>-		dg1_irq_postinstall(xe, tile);
>-	else
>-		xelp_irq_postinstall(xe, tile);
>-}
>-
> static void xe_irq_postinstall(struct xe_device *xe)
> {
> 	struct xe_tile *tile;
>@@ -538,6 +511,12 @@ static void xe_irq_postinstall(struct xe_device *xe)
> 	 */
> 	unmask_and_enable(xe_device_get_root_tile(xe),
> 			  GU_MISC_IRQ_OFFSET, GU_MISC_GSE);
>+
>+	/* Enable top-level interrupts */
>+	if (GRAPHICS_VERx100(xe) >= 1210)
>+		dg1_intr_enable(xe, true);
>+	else
>+		xelp_intr_enable(xe, true);
> }
>
> static irq_handler_t xe_irq_handler(struct xe_device *xe)
>-- 
>2.40.0
>


* Re: [Intel-xe] [PATCH 20/26] drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe
  2023-05-11  3:47 ` [Intel-xe] [PATCH 20/26] drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe Matt Roper
@ 2023-05-18 19:54   ` Lucas De Marchi
  0 siblings, 0 replies; 75+ messages in thread
From: Lucas De Marchi @ 2023-05-18 19:54 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-xe

On Wed, May 10, 2023 at 08:47:16PM -0700, Matt Roper wrote:
>The majority of xe_gt_irq_postinstall() is really focused on the
>hardware engine interrupts; other GT-related interrupts such as the GuC
>are enabled/disabled independently.  Renaming the function and making it
>truly GT-specific will make it more clear what the intended focus is.
>
>Disabling/masking of other interrupts (such as GuC interrupts) is
>unnecessary since that has already happened during the irq_reset stage,
>and doing so will become harmful once the media GT is re-enabled since
>calls to xe_gt_irq_postinstall during media GT initialization would
>incorrectly disable the primary GT's GuC interrupts.
>
>Also, since this function is called from gt_fw_domain_init(), it's not
>necessary to also call it earlier during xe_irq_postinstall; just
>xe_irq_resume to handle runtime resume should be sufficient.
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/xe/xe_gt.c        |  2 +-
> drivers/gpu/drm/xe/xe_hw_engine.c |  1 +
> drivers/gpu/drm/xe/xe_irq.c       | 91 ++++++++++++++++---------------
> drivers/gpu/drm/xe/xe_irq.h       |  3 +-
> 4 files changed, 50 insertions(+), 47 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
>index e00d260dff00..2a3457fb97fa 100644
>--- a/drivers/gpu/drm/xe/xe_gt.c
>+++ b/drivers/gpu/drm/xe/xe_gt.c
>@@ -303,7 +303,7 @@ static int gt_fw_domain_init(struct xe_gt *gt)
> 	gt->info.engine_mask = gt->info.__engine_mask;
>
> 	/* Enables per hw engine IRQs */
>-	xe_gt_irq_postinstall(gt_to_tile(gt));
>+	xe_irq_enable_hwe(gt);

OK, this clears up my concern on the previous patch, so you can ignore
my comment there. While at it, s/Enables/Enable/ to follow the style.



>
> 	/* Rerun MCR init as we now have hw engine list */
> 	xe_gt_mcr_init(gt);
>diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
>index fe8af54ea8bd..5188ee268b30 100644
>--- a/drivers/gpu/drm/xe/xe_hw_engine.c
>+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
>@@ -17,6 +17,7 @@
> #include "xe_gt.h"
> #include "xe_gt_topology.h"
> #include "xe_hw_fence.h"
>+#include "xe_irq.h"
> #include "xe_lrc.h"
> #include "xe_macros.h"
> #include "xe_mmio.h"
>diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
>index 58745a5add87..12919ef68cff 100644
>--- a/drivers/gpu/drm/xe/xe_irq.c
>+++ b/drivers/gpu/drm/xe/xe_irq.c
>@@ -122,13 +122,14 @@ static inline void xelp_intr_enable(struct xe_device *xe, bool stall)
> 		xe_mmio_read32(mmio, GFX_MSTR_IRQ);
> }
>
>-void xe_gt_irq_postinstall(struct xe_tile *tile)
>+/* Enable/unmask the HWE interrupts for a specific GT's engines. */
>+void xe_irq_enable_hwe(struct xe_gt *gt)
> {
>-	struct xe_device *xe = tile_to_xe(tile);
>-	struct xe_gt *mmio = tile->primary_gt;
>+	struct xe_device *xe = gt_to_xe(gt);
>+	u32 ccs_mask, bcs_mask;
> 	u32 irqs, dmask, smask;
>-	u32 ccs_mask = xe_hw_engine_mask_per_class(tile->primary_gt, XE_ENGINE_CLASS_COMPUTE);
>-	u32 bcs_mask = xe_hw_engine_mask_per_class(tile->primary_gt, XE_ENGINE_CLASS_COPY);
>+	if (!gt)
>+		return;

I don't follow why this is needed. It seems all the callers have either
already dereferenced gt or are looping through the GTs with
for_each_gt(), which already does the check.
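
For illustration only, here is a minimal standalone sketch of the
iterator pattern referred to above. It assumes an iterator that skips
unpopulated slots, so the loop body never sees a NULL pointer. All
demo_* names are hypothetical and are not the driver's definitions.

```c
#include <stddef.h>

#define DEMO_MAX_GT 4

struct demo_gt {
	int id;
};

struct demo_device {
	/* A NULL entry means the slot is not populated. */
	struct demo_gt *gts[DEMO_MAX_GT];
};

/*
 * Hypothetical iterator modeled on the for_each_gt() pattern: the
 * "if (!...) {} else" form runs the loop body only for non-NULL
 * slots and avoids capturing a dangling else from the caller.
 */
#define demo_for_each_gt(gt__, dev__, id__) \
	for ((id__) = 0; (id__) < DEMO_MAX_GT; (id__)++) \
		if (!((gt__) = (dev__)->gts[(id__)])) {} else

/* Count populated slots; no NULL check is needed inside the loop. */
static int demo_count_gts(struct demo_device *dev)
{
	struct demo_gt *gt;
	int id, count = 0;

	demo_for_each_gt(gt, dev, id)
		count++;

	return count;
}
```

With an iterator of this shape, a callee reached only from the loop body
would not need its own NULL guard.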


>
> 	if (xe_device_guc_submission_enabled(xe)) {
> 		irqs = GT_RENDER_USER_INTERRUPT |
>@@ -140,45 +141,44 @@ void xe_gt_irq_postinstall(struct xe_tile *tile)
> 		       GT_WAIT_SEMAPHORE_INTERRUPT;
> 	}
>
>+	ccs_mask = xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_COMPUTE);
>+	bcs_mask = xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_COPY);
>+
> 	dmask = irqs << 16 | irqs;
> 	smask = irqs << 16;
>
>-	/* Enable RCS, BCS, VCS and VECS class interrupts. */
>-	xe_mmio_write32(mmio, RENDER_COPY_INTR_ENABLE, dmask);
>-	xe_mmio_write32(mmio, VCS_VECS_INTR_ENABLE, dmask);
>-	if (ccs_mask)
>-		xe_mmio_write32(mmio, CCS_RSVD_INTR_ENABLE, smask);
>+	if (!xe_gt_is_media_type(gt)) {
>+		/* Enable classes */

classes?

Also, from the look of this function, it seems media should be in a
completely different function?

Lucas De Marchi
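
As a rough standalone sketch of what the hunk computes (not the driver
code itself): the two masks are built from the same 16-bit interrupt
bits, and the two register groups are guarded by the conditions shown in
the patch. The demo_* names and the sample value 0x0101 are assumptions
for illustration; reading dmask/smask as "both halves" and "upper half
only" is inferred from the shifts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same interrupt bits replicated into both 16-bit halves. */
static uint32_t demo_dmask(uint32_t irqs)
{
	return irqs << 16 | irqs;
}

/* Interrupt bits placed in the upper 16-bit half only. */
static uint32_t demo_smask(uint32_t irqs)
{
	return irqs << 16;
}

/* Mirrors the patch's "!xe_gt_is_media_type(gt)" guard. */
static bool demo_programs_render_copy(bool is_media_gt)
{
	return !is_media_gt;
}

/* Mirrors "xe_gt_is_media_type(gt) || MEDIA_VER(xe) < 13". */
static bool demo_programs_media(bool is_media_gt, int media_ver)
{
	return is_media_gt || media_ver < 13;
}
```

The enable registers receive the mask itself, while the *_INTR_MASK
registers receive its complement, so the selected bits end up cleared
there. Splitting the two guarded blocks into separate helpers, as
suggested above, would amount to calling one function per predicate.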

>+		xe_mmio_write32(gt, RENDER_COPY_INTR_ENABLE, dmask);
>+		if (ccs_mask)
>+			xe_mmio_write32(gt, CCS_RSVD_INTR_ENABLE, smask);
>+
>+		/* Unmask instances */
>+		xe_mmio_write32(gt, RCS0_RSVD_INTR_MASK, ~smask);
>+		xe_mmio_write32(gt, BCS_RSVD_INTR_MASK, ~smask);
>+		if (bcs_mask & (BIT(1)|BIT(2)))
>+			xe_mmio_write32(gt, XEHPC_BCS1_BCS2_INTR_MASK, ~dmask);
>+		if (bcs_mask & (BIT(3)|BIT(4)))
>+			xe_mmio_write32(gt, XEHPC_BCS3_BCS4_INTR_MASK, ~dmask);
>+		if (bcs_mask & (BIT(5)|BIT(6)))
>+			xe_mmio_write32(gt, XEHPC_BCS5_BCS6_INTR_MASK, ~dmask);
>+		if (bcs_mask & (BIT(7)|BIT(8)))
>+			xe_mmio_write32(gt, XEHPC_BCS7_BCS8_INTR_MASK, ~dmask);
>+		if (ccs_mask & (BIT(0)|BIT(1)))
>+			xe_mmio_write32(gt, CCS0_CCS1_INTR_MASK, ~dmask);
>+		if (ccs_mask & (BIT(2)|BIT(3)))
>+			xe_mmio_write32(gt,  CCS2_CCS3_INTR_MASK, ~dmask);
>+	}
>
>-	/* Unmask irqs on RCS, BCS, VCS and VECS engines. */
>-	xe_mmio_write32(mmio, RCS0_RSVD_INTR_MASK, ~smask);
>-	xe_mmio_write32(mmio, BCS_RSVD_INTR_MASK, ~smask);
>-	if (bcs_mask & (BIT(1)|BIT(2)))
>-		xe_mmio_write32(mmio, XEHPC_BCS1_BCS2_INTR_MASK, ~dmask);
>-	if (bcs_mask & (BIT(3)|BIT(4)))
>-		xe_mmio_write32(mmio, XEHPC_BCS3_BCS4_INTR_MASK, ~dmask);
>-	if (bcs_mask & (BIT(5)|BIT(6)))
>-		xe_mmio_write32(mmio, XEHPC_BCS5_BCS6_INTR_MASK, ~dmask);
>-	if (bcs_mask & (BIT(7)|BIT(8)))
>-		xe_mmio_write32(mmio, XEHPC_BCS7_BCS8_INTR_MASK, ~dmask);
>-	xe_mmio_write32(mmio, VCS0_VCS1_INTR_MASK, ~dmask);
>-	xe_mmio_write32(mmio, VCS2_VCS3_INTR_MASK, ~dmask);
>-	xe_mmio_write32(mmio, VECS0_VECS1_INTR_MASK, ~dmask);
>-	if (ccs_mask & (BIT(0)|BIT(1)))
>-		xe_mmio_write32(mmio, CCS0_CCS1_INTR_MASK, ~dmask);
>-	if (ccs_mask & (BIT(2)|BIT(3)))
>-		xe_mmio_write32(mmio,  CCS2_CCS3_INTR_MASK, ~dmask);
>+	if (xe_gt_is_media_type(gt) || MEDIA_VER(xe) < 13) {
>+		/* Enable classes */
>+		xe_mmio_write32(gt, VCS_VECS_INTR_ENABLE, dmask);
>
>-	/*
>-	 * RPS interrupts will get enabled/disabled on demand when RPS itself
>-	 * is enabled/disabled.
>-	 */
>-	/* TODO: gt->pm_ier, gt->pm_imr */
>-	xe_mmio_write32(mmio, GPM_WGBOXPERF_INTR_ENABLE, 0);
>-	xe_mmio_write32(mmio, GPM_WGBOXPERF_INTR_MASK,  ~0);
>-
>-	/* Same thing for GuC interrupts */
>-	xe_mmio_write32(mmio, GUC_SG_INTR_ENABLE, 0);
>-	xe_mmio_write32(mmio, GUC_SG_INTR_MASK,  ~0);
>+		/* Unmask instances */
>+		xe_mmio_write32(gt, VCS0_VCS1_INTR_MASK, ~dmask);
>+		xe_mmio_write32(gt, VCS2_VCS3_INTR_MASK, ~dmask);
>+		xe_mmio_write32(gt, VECS0_VECS1_INTR_MASK, ~dmask);
>+	}
> }
>
> static u32
>@@ -497,12 +497,6 @@ static void xe_irq_reset(struct xe_device *xe)
>
> static void xe_irq_postinstall(struct xe_device *xe)
> {
>-	struct xe_tile *tile;
>-	u8 id;
>-
>-	for_each_tile(tile, xe, id)
>-		xe_gt_irq_postinstall(tile);
>-
> 	xe_display_irq_postinstall(xe, xe_primary_mmio_gt(xe));
>
> 	/*
>@@ -591,9 +585,16 @@ void xe_irq_suspend(struct xe_device *xe)
>
> void xe_irq_resume(struct xe_device *xe)
> {
>+	struct xe_gt *gt;
>+	int id;
>+
> 	spin_lock_irq(&xe->irq.lock);
> 	xe->irq.enabled = true;
> 	xe_irq_reset(xe);
> 	xe_irq_postinstall(xe);
>+
>+	for_each_gt(gt, xe, id)
>+		xe_irq_enable_hwe(gt);
>+
> 	spin_unlock_irq(&xe->irq.lock);
> }
>diff --git a/drivers/gpu/drm/xe/xe_irq.h b/drivers/gpu/drm/xe/xe_irq.h
>index 69113c21e1cd..bc42bc90d967 100644
>--- a/drivers/gpu/drm/xe/xe_irq.h
>+++ b/drivers/gpu/drm/xe/xe_irq.h
>@@ -8,11 +8,12 @@
>
> struct xe_device;
> struct xe_tile;
>+struct xe_gt;
>
> int xe_irq_install(struct xe_device *xe);
>-void xe_gt_irq_postinstall(struct xe_tile *tile);
> void xe_irq_shutdown(struct xe_device *xe);
> void xe_irq_suspend(struct xe_device *xe);
> void xe_irq_resume(struct xe_device *xe);
>+void xe_irq_enable_hwe(struct xe_gt *gt);
>
> #endif
>-- 
>2.40.0
>


Thread overview: 75+ messages
2023-05-11  3:46 [Intel-xe] [PATCH 00/26] Separate GT and tile Matt Roper
2023-05-11  3:46 ` [Intel-xe] [PATCH 01/26] drm/xe/mtl: Disable media GT Matt Roper
2023-05-11 20:50   ` Matt Atwood
2023-05-11 23:29   ` Lucas De Marchi
2023-05-12 15:38     ` Matt Roper
2023-05-11  3:46 ` [Intel-xe] [PATCH 02/26] drm/xe: Introduce xe_tile Matt Roper
2023-05-11  5:46   ` Lucas De Marchi
2023-05-12  5:33   ` Iddamsetty, Aravind
2023-05-12 16:27     ` Matt Roper
2023-05-12  5:45   ` Iddamsetty, Aravind
2023-05-18 17:35   ` Rodrigo Vivi
2023-05-11  3:46 ` [Intel-xe] [PATCH 03/26] drm/xe: Add backpointer from gt to tile Matt Roper
2023-05-11 21:10   ` Matt Atwood
2023-05-12  0:07   ` Lucas De Marchi
2023-05-12 16:20     ` Matt Roper
2023-05-12 16:31       ` Matt Atwood
2023-05-12 17:00         ` Matt Roper
2023-05-11  3:47 ` [Intel-xe] [PATCH 04/26] drm/xe: Add for_each_tile iterator Matt Roper
2023-05-11 23:23   ` Lucas De Marchi
2023-05-12  5:45   ` Iddamsetty, Aravind
2023-05-12 16:28     ` Matt Roper
2023-05-11  3:47 ` [Intel-xe] [PATCH 05/26] drm/xe: Move register MMIO into xe_tile Matt Roper
2023-05-11 12:20   ` Jani Nikula
2023-05-11 22:01     ` Lucas De Marchi
2023-05-13  5:53   ` Lucas De Marchi
2023-05-11  3:47 ` [Intel-xe] [PATCH 06/26] drm/xe: Move VRAM from GT to tile Matt Roper
2023-05-15 22:40   ` Lucas De Marchi
2023-05-18 17:29     ` Rodrigo Vivi
2023-05-11  3:47 ` [Intel-xe] [PATCH 07/26] drm/xe: Memory allocations are tile-based, not GT-based Matt Roper
2023-05-17  4:56   ` Lucas De Marchi
2023-05-11  3:47 ` [Intel-xe] [PATCH 08/26] drm/xe: Move migration from GT to tile Matt Roper
2023-05-17  5:00   ` Lucas De Marchi
2023-05-11  3:47 ` [Intel-xe] [PATCH 09/26] drm/xe: Clarify 'gt' retrieval for primary tile Matt Roper
2023-05-17  5:07   ` Lucas De Marchi
2023-05-11  3:47 ` [Intel-xe] [PATCH 10/26] drm/xe: Drop vram_id Matt Roper
2023-05-17  5:09   ` Lucas De Marchi
2023-05-11  3:47 ` [Intel-xe] [PATCH 11/26] drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE Matt Roper
2023-05-17  5:14   ` Lucas De Marchi
2023-05-11  3:47 ` [Intel-xe] [PATCH 12/26] drm/xe: Allocate GT dynamically Matt Roper
2023-05-17  5:23   ` Lucas De Marchi
2023-05-11  3:47 ` [Intel-xe] [PATCH 13/26] drm/xe: Add media GT to tile Matt Roper
2023-05-18 17:50   ` Rodrigo Vivi
2023-05-11  3:47 ` [Intel-xe] [PATCH 14/26] drm/xe: Move display IRQ postinstall out of GT function Matt Roper
2023-05-18 17:51   ` Rodrigo Vivi
2023-05-18 18:20   ` Lucas De Marchi
2023-05-11  3:47 ` [Intel-xe] [PATCH 15/26] drm/xe: Interrupts are delivered per-tile, not per-GT Matt Roper
2023-05-11 12:14   ` Iddamsetty, Aravind
2023-05-11 13:50     ` Matt Roper
2023-05-18 18:30   ` Lucas De Marchi
2023-05-11  3:47 ` [Intel-xe] [PATCH 16/26] drm/xe/irq: Handle ASLE backlight interrupts at same time as display Matt Roper
2023-05-18 18:33   ` Lucas De Marchi
2023-05-11  3:47 ` [Intel-xe] [PATCH 17/26] drm/xe/irq: Actually call xe_irq_postinstall() Matt Roper
2023-05-18 18:40   ` Lucas De Marchi
2023-05-11  3:47 ` [Intel-xe] [PATCH 18/26] drm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt mask Matt Roper
2023-05-11  3:47 ` [Intel-xe] [PATCH 19/26] drm/xe/irq: Untangle postinstall functions Matt Roper
2023-05-18 18:45   ` Lucas De Marchi
2023-05-11  3:47 ` [Intel-xe] [PATCH 20/26] drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe Matt Roper
2023-05-18 19:54   ` Lucas De Marchi
2023-05-11  3:47 ` [Intel-xe] [PATCH 21/26] drm/xe: Invalidate TLB on all affected GTs during GGTT updates Matt Roper
2023-05-11  3:47 ` [Intel-xe] [PATCH 22/26] drm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations Matt Roper
2023-05-11  3:47 ` [Intel-xe] [PATCH 23/26] drm/xe: Allow GT looping and lookup on standalone media Matt Roper
2023-05-11  3:47 ` [Intel-xe] [PATCH 24/26] drm/xe: Update query uapi to support " Matt Roper
2023-05-11  3:47 ` [Intel-xe] [PATCH 25/26] drm/xe: Reinstate media GT support Matt Roper
2023-05-11  3:47 ` [Intel-xe] [PATCH 26/26] drm/xe: Clarify source of GT log messages Matt Roper
2023-05-17  9:33   ` Michal Wajdeczko
2023-05-11  3:50 ` [Intel-xe] ✓ CI.Patch_applied: success for Separate GT and tile Patchwork
2023-05-11  3:51 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork
2023-05-11  7:08 ` [Intel-xe] ✓ CI.Patch_applied: success for Separate GT and tile (rev2) Patchwork
2023-05-11  7:10 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork
2023-05-12  7:21 ` [Intel-xe] ✓ CI.Patch_applied: success " Patchwork
2023-05-12  7:23 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork
2023-05-15 13:08 ` [Intel-xe] [PATCH 00/26] Separate GT and tile Thomas Hellström
2023-05-15 18:11   ` Matt Roper
2023-05-16 14:18 ` Das, Nirmoy
2023-05-18 17:47 ` Rodrigo Vivi
