From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: intel-xe@lists.freedesktop.org
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Pallavi Mishra" <pallavi.mishra@intel.com>,
"Matthew Auld" <matthew.auld@intel.com>,
dri-devel@lists.freedesktop.org,
"Joonas Lahtinen" <joonas.lahtinen@linux.intel.com>,
"Effie Yu" <effie.yu@intel.com>,
"Matthew Brost" <matthew.brost@intel.com>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Jose Souza" <jose.souza@intel.com>,
"Michal Mrozek" <michal.mrozek@intel.com>,
stable@vger.kernel.org
Subject: [PATCH] drm/xe: Use write-back caching mode for system memory on DGFX
Date: Wed, 19 Jun 2024 18:39:04 +0200 [thread overview]
Message-ID: <20240619163904.2935-1-thomas.hellstrom@linux.intel.com> (raw)
The caching mode for buffer objects with VRAM as a possible
placement was forced to write-combined, regardless of placement.
However, write-combined system memory is expensive to allocate and
even though it is pooled, the pool is expensive to shrink, since
it involves global CPU TLB flushes.
Moreover write-combined system memory from TTM is only reliably
available on x86 and DGFX doesn't have an x86 restriction.
So regardless of the cpu caching mode selected for a bo,
internally use write-back caching mode for system memory on DGFX.
Coherency is maintained, but user-space clients may perceive a
difference in cpu access speeds.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode")
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: dri-devel@lists.freedesktop.org
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Jose Souza <jose.souza@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
---
drivers/gpu/drm/xe/xe_bo.c | 47 +++++++++++++++++++-------------
drivers/gpu/drm/xe/xe_bo_types.h | 3 +-
include/uapi/drm/xe_drm.h | 8 +++++-
3 files changed, 37 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 65c696966e96..31192d983d9e 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -343,7 +343,7 @@ static struct ttm_tt *xe_ttm_tt_create(struct ttm_buffer_object *ttm_bo,
struct xe_device *xe = xe_bo_device(bo);
struct xe_ttm_tt *tt;
unsigned long extra_pages;
- enum ttm_caching caching;
+ enum ttm_caching caching = ttm_cached;
int err;
tt = kzalloc(sizeof(*tt), GFP_KERNEL);
@@ -357,26 +357,35 @@ static struct ttm_tt *xe_ttm_tt_create(struct ttm_buffer_object *ttm_bo,
extra_pages = DIV_ROUND_UP(xe_device_ccs_bytes(xe, bo->size),
PAGE_SIZE);
- switch (bo->cpu_caching) {
- case DRM_XE_GEM_CPU_CACHING_WC:
- caching = ttm_write_combined;
- break;
- default:
- caching = ttm_cached;
- break;
- }
-
- WARN_ON((bo->flags & XE_BO_FLAG_USER) && !bo->cpu_caching);
-
/*
- * Display scanout is always non-coherent with the CPU cache.
- *
- * For Xe_LPG and beyond, PPGTT PTE lookups are also non-coherent and
- * require a CPU:WC mapping.
+ * DGFX system memory is always WB / ttm_cached, since
+ * other caching modes are only supported on x86. DGFX
+ * GPU system memory accesses are always coherent with the
+ * CPU.
*/
- if ((!bo->cpu_caching && bo->flags & XE_BO_FLAG_SCANOUT) ||
- (xe->info.graphics_verx100 >= 1270 && bo->flags & XE_BO_FLAG_PAGETABLE))
- caching = ttm_write_combined;
+ if (!IS_DGFX(xe)) {
+ switch (bo->cpu_caching) {
+ case DRM_XE_GEM_CPU_CACHING_WC:
+ caching = ttm_write_combined;
+ break;
+ default:
+ caching = ttm_cached;
+ break;
+ }
+
+ WARN_ON((bo->flags & XE_BO_FLAG_USER) && !bo->cpu_caching);
+
+ /*
+ * Display scanout is always non-coherent with the CPU cache.
+ *
+ * For Xe_LPG and beyond, PPGTT PTE lookups are also
+ * non-coherent and require a CPU:WC mapping.
+ */
+ if ((!bo->cpu_caching && bo->flags & XE_BO_FLAG_SCANOUT) ||
+ (xe->info.graphics_verx100 >= 1270 &&
+ bo->flags & XE_BO_FLAG_PAGETABLE))
+ caching = ttm_write_combined;
+ }
if (bo->flags & XE_BO_FLAG_NEEDS_UC) {
/*
diff --git a/drivers/gpu/drm/xe/xe_bo_types.h b/drivers/gpu/drm/xe/xe_bo_types.h
index 86422e113d39..10450f1fbbde 100644
--- a/drivers/gpu/drm/xe/xe_bo_types.h
+++ b/drivers/gpu/drm/xe/xe_bo_types.h
@@ -66,7 +66,8 @@ struct xe_bo {
/**
* @cpu_caching: CPU caching mode. Currently only used for userspace
- * objects.
+ * objects. Exceptions are system memory on DGFX, which is always
+ * WB.
*/
u16 cpu_caching;
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 93e00be44b2d..1189b3044723 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -783,7 +783,13 @@ struct drm_xe_gem_create {
#define DRM_XE_GEM_CPU_CACHING_WC 2
/**
* @cpu_caching: The CPU caching mode to select for this object. If
- * mmaping the object the mode selected here will also be used.
+ * mmaping the object the mode selected here will also be used. The
+ * exception is when mapping system memory (including evicted
+ * system memory) on discrete GPUs. The caching mode selected will
+ * then be overridden to DRM_XE_GEM_CPU_CACHING_WB, and coherency
+ * between GPU- and CPU is guaranteed. The caching mode of
+ * existing CPU-mappings will be updated transparently to
+ * user-space clients.
*/
__u16 cpu_caching;
/** @pad: MBZ */
--
2.44.0
next reply other threads:[~2024-06-19 16:39 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-19 16:39 Thomas Hellström [this message]
2024-06-19 17:28 ` ✓ CI.Patch_applied: success for drm/xe: Use write-back caching mode for system memory on DGFX Patchwork
2024-06-19 17:28 ` ✓ CI.checkpatch: " Patchwork
2024-06-19 17:29 ` ✓ CI.KUnit: " Patchwork
2024-06-19 17:42 ` ✓ CI.Build: " Patchwork
2024-06-19 17:46 ` ✗ CI.Hooks: failure " Patchwork
2024-06-19 17:48 ` ✓ CI.checksparse: success " Patchwork
2024-06-19 18:07 ` ✗ CI.BAT: failure " Patchwork
2024-06-19 19:30 ` ✓ CI.Patch_applied: success for drm/xe: Use write-back caching mode for system memory on DGFX (rev2) Patchwork
2024-06-19 19:30 ` ✓ CI.checkpatch: " Patchwork
2024-06-19 19:31 ` ✓ CI.KUnit: " Patchwork
2024-06-19 19:43 ` ✓ CI.Build: " Patchwork
2024-06-19 19:45 ` ✗ CI.Hooks: failure " Patchwork
2024-06-19 19:46 ` ✓ CI.checksparse: success " Patchwork
2024-06-19 20:08 ` ✓ CI.BAT: " Patchwork
2024-06-19 23:26 ` ✗ CI.FULL: failure for drm/xe: Use write-back caching mode for system memory on DGFX Patchwork
2024-06-20 4:42 ` ✗ CI.FULL: failure for drm/xe: Use write-back caching mode for system memory on DGFX (rev2) Patchwork
2024-06-20 9:03 ` [PATCH] drm/xe: Use write-back caching mode for system memory on DGFX Matthew Auld
2024-06-24 21:37 ` Matt Roper
2024-07-02 13:56 ` Souza, Jose
2024-07-02 15:59 ` Rodrigo Vivi
2024-07-04 7:48 ` Mrozek, Michal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240619163904.2935-1-thomas.hellstrom@linux.intel.com \
--to=thomas.hellstrom@linux.intel.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=effie.yu@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=joonas.lahtinen@linux.intel.com \
--cc=jose.souza@intel.com \
--cc=maarten.lankhorst@linux.intel.com \
--cc=matthew.auld@intel.com \
--cc=matthew.brost@intel.com \
--cc=michal.mrozek@intel.com \
--cc=pallavi.mishra@intel.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.