From: Matthew Auld <matthew.auld@intel.com>
To: Lucas De Marchi <lucas.demarchi@intel.com>,
intel-xe@lists.freedesktop.org
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>,
Rodrigo Vivi <rodrigo.vivi@intel.com>,
Badal Nilawar <badal.nilawar@intel.com>,
Stuart Summers <stuart.summers@intel.com>
Subject: Re: [PATCH v5 3/4] drm/xe: Split xe_device_td_flush()
Date: Fri, 20 Jun 2025 12:01:08 +0100
Message-ID: <c6f0b60a-2e70-405f-9be5-e5f270c1ffbe@intel.com>
In-Reply-To: <20250618-wa-22019338487-v5-3-b888388477f2@intel.com>
On 18/06/2025 19:50, Lucas De Marchi wrote:
> xe_device_td_flush() has 2 possible implementations: an entire L2 flush
> or a transient flush, depending on WA 16023588340. Make this clear by
> splitting the function so it calls each of them.
>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> ---
> drivers/gpu/drm/xe/xe_device.c | 68 +++++++++++++++++++++++++-----------------
> 1 file changed, 40 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 8cfcfff250ca5..8396612b68d4b 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -981,38 +981,15 @@ void xe_device_wmb(struct xe_device *xe)
> xe_mmio_write32(xe_root_tile_mmio(xe), VF_CAP_REG, 0);
> }
>
> -/**
> - * xe_device_td_flush() - Flush transient L3 cache entries
> - * @xe: The device
> - *
> - * Display engine has direct access to memory and is never coherent with L3/L4
> - * caches (or CPU caches), however KMD is responsible for specifically flushing
> - * transient L3 GPU cache entries prior to the flip sequence to ensure scanout
> - * can happen from such a surface without seeing corruption.
> - *
> - * Display surfaces can be tagged as transient by mapping it using one of the
> - * various L3:XD PAT index modes on Xe2.
> - *
> - * Note: On non-discrete xe2 platforms, like LNL, the entire L3 cache is flushed
> - * at the end of each submission via PIPE_CONTROL for compute/render, since SA
> - * Media is not coherent with L3 and we want to support render-vs-media
> - * usescases. For other engines like copy/blt the HW internally forces uncached
> - * behaviour, hence why we can skip the TDF on such platforms.
> +/*
> + * Issue a TRANSIENT_FLUSH_REQUEST and wait for completion on each gt.
> */
> -void xe_device_td_flush(struct xe_device *xe)
> +static void tdf_request_sync(struct xe_device *xe)
> {
> - struct xe_gt *gt;
> unsigned int fw_ref;
> + struct xe_gt *gt;
> u8 id;
>
> - if (!IS_DGFX(xe) || GRAPHICS_VER(xe) < 20)
> - return;
> -
> - if (XE_WA(xe_root_mmio_gt(xe), 16023588340)) {
> - xe_device_l2_flush(xe);
> - return;
> - }
> -
> for_each_gt(gt, xe, id) {
> if (xe_gt_is_media_type(gt))
> continue;
> @@ -1022,6 +999,7 @@ void xe_device_td_flush(struct xe_device *xe)
> return;
>
> xe_mmio_write32(&gt->mmio, XE2_TDF_CTRL, TRANSIENT_FLUSH_REQUEST);
> +
> /*
> * FIXME: We can likely do better here with our choice of
> * timeout. Currently we just assume the worst case, i.e. 150us,
> @@ -1052,15 +1030,49 @@ void xe_device_l2_flush(struct xe_device *xe)
> return;
>
> spin_lock(&gt->global_invl_lock);
> - xe_mmio_write32(&gt->mmio, XE2_GLOBAL_INVAL, 0x1);
>
> + xe_mmio_write32(&gt->mmio, XE2_GLOBAL_INVAL, 0x1);
> if (xe_mmio_wait32(&gt->mmio, XE2_GLOBAL_INVAL, 0x1, 0x0, 500, NULL, true))
> xe_gt_err_once(gt, "Global invalidation timeout\n");
> +
> spin_unlock(&gt->global_invl_lock);
>
> xe_force_wake_put(gt_to_fw(gt), fw_ref);
> }
>
> +/**
> + * xe_device_td_flush() - Flush transient L3 cache entries
> + * @xe: The device
> + *
> + * Display engine has direct access to memory and is never coherent with L3/L4
> + * caches (or CPU caches), however KMD is responsible for specifically flushing
> + * transient L3 GPU cache entries prior to the flip sequence to ensure scanout
> + * can happen from such a surface without seeing corruption.
> + *
> + * Display surfaces can be tagged as transient by mapping it using one of the
> + * various L3:XD PAT index modes on Xe2.
> + *
> + * Note: On non-discrete xe2 platforms, like LNL, the entire L3 cache is flushed
> + * at the end of each submission via PIPE_CONTROL for compute/render, since SA
> + * Media is not coherent with L3 and we want to support render-vs-media
> + * usescases. For other engines like copy/blt the HW internally forces uncached
> + * behaviour, hence why we can skip the TDF on such platforms.
> + */
> +void xe_device_td_flush(struct xe_device *xe)
> +{
> + struct xe_gt *root_gt;
> +
> + if (!IS_DGFX(xe) || GRAPHICS_VER(xe) < 20)
> + return;
> +
> + root_gt = xe_root_mmio_gt(xe);
> + if (XE_WA(root_gt, 16023588340))
> + /* A transient flush is not sufficient: flush the L2 */
> + xe_device_l2_flush(xe);
> + else
> + tdf_request_sync(xe);
> +}
> +
> u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
> {
> return xe_device_has_flat_ccs(xe) ?
>
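
For anyone skimming the diff, the control flow after the split can be
sketched in plain userspace C. This is only an illustration under stated
assumptions: struct mock_device and the mock_* helpers are hypothetical
stand-ins that do not exist in the driver, the boolean fields stand in for
IS_DGFX(), GRAPHICS_VER() and the XE_WA() check, and the counters replace
the real MMIO writes and waits.

```c
#include <stdbool.h>

/* Hypothetical mock of the device state; NOT the real struct xe_device. */
struct mock_device {
	bool is_dgfx;           /* stands in for IS_DGFX(xe) */
	int graphics_ver;       /* stands in for GRAPHICS_VER(xe) */
	bool needs_l2_flush_wa; /* stands in for XE_WA(root_gt, 16023588340) */
	int l2_flushes;         /* counts full L2 flushes issued */
	int tdf_requests;       /* counts transient flush requests issued */
};

/* Stand-in for xe_device_l2_flush(): only records that it was called. */
static void mock_l2_flush(struct mock_device *dev)
{
	dev->l2_flushes++;
}

/* Stand-in for tdf_request_sync(): only records that it was called. */
static void mock_tdf_request_sync(struct mock_device *dev)
{
	dev->tdf_requests++;
}

/* Mirrors the shape of the new xe_device_td_flush() in the patch. */
static void mock_td_flush(struct mock_device *dev)
{
	/* Nothing to do on integrated parts or before graphics version 20. */
	if (!dev->is_dgfx || dev->graphics_ver < 20)
		return;

	if (dev->needs_l2_flush_wa)
		/* A transient flush is not sufficient: flush the L2 */
		mock_l2_flush(dev);
	else
		mock_tdf_request_sync(dev);
}
```

The point of the sketch is that exactly one of the two helpers runs per
call, and neither runs when the early return is taken.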