Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
To: "Souza, Jose" <jose.souza@intel.com>,
	"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>
Cc: "kernel-dev@igalia.com" <kernel-dev@igalia.com>
Subject: Re: [PATCH v7 11/24] drm/xe: Flush L3 when flushing render cache
Date: Mon, 30 Jun 2025 13:43:10 +0100	[thread overview]
Message-ID: <bdcc81af-4eba-4129-a945-e45a71f79285@igalia.com> (raw)
In-Reply-To: <3bbea88dc4e95e6e976f0143a38ae40b813a9728.camel@intel.com>


On 27/06/2025 19:57, Souza, Jose wrote:
> On Fri, 2025-06-27 at 11:23 -0700, José Roberto de Souza wrote:
>> On Fri, 2025-06-27 at 14:33 +0100, Tvrtko Ursulin wrote:
>>> I915 sets PIPE_CONTROL_FLUSH_L3 (bit 27) when flushing render caches but
>>> interesting thing is Tigerlake PRM lists that bit as reserved.
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>> ---
>>> Is xe missing this? Or has this been wrong for so long in i915? Or is this
>>> an undocumented bit?
>>> ---
>>>   drivers/gpu/drm/xe/instructions/xe_gpu_commands.h |  1 +
>>>   drivers/gpu/drm/xe/xe_ring_ops.c                  | 10 ++++++++++
>>>   2 files changed, 11 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h b/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h
>>> index 78c0e87dbd37..27892984403c 100644
>>> --- a/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h
>>> +++ b/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h
>>> @@ -47,6 +47,7 @@
>>>   
>>>   #define   PIPE_CONTROL_COMMAND_CACHE_INVALIDATE		(1<<29)
>>>   #define   PIPE_CONTROL_TILE_CACHE_FLUSH			(1<<28)
>>> +#define   PIPE_CONTROL_FLUSH_L3                         (1<<27)
>>
>> On spec this bit is Protected Memory Disable. I think what you want is bit 30.
> 
> That is also wrong on i915 oO.

Yeah, most of the series is me copying "reference" implementation from i915.

But you are right, bit 27 appears wrong, and the history of the whole 
area is a bit confusing.

The existing PIPE_CONTROL_FLUSH_L3 definition was added back in 2015 in:

0160f055393f ("drm/i915/gen8: Add WaClearSlmSpaceAtContextSwitch 
workaround")

But if I look at the public PRMs for BDW and CHV bit 27 is reserved MBZ. 
Bit 30 is the same. Was it something undocumented? I don't know.

Furthermore, the public PRM has no mention of 
WaClearSlmSpaceAtContextSwitch. It has this 
WaFlushCoherentL3CacheLinesAtContextSwitch which is not quite the thing:

"""
Coherent L3 cache lines are not getting flushed during context switch 
which is causing
issues like corruption. Need to set bit 21 of MMIO b118, then send PC 
with DC flush and
then reset bit 21 of MMIO b118. This programming sequence needs to be 
part of the indirect
context WA BB.
"""

This WA was later extnded to KBL:

066d46288851 ("drm/i915/kbl: Add WaClearSlmSpaceAtContextSwitch")

But KBL public PRM is the same - I don't see any WA which mention SLM or 
bit 30. And bits 27 and 30 are still reserved.

There is a new bit 26 (Flush LLC) on SKL and KBL but we do not use it or 
define it.

That also feels odd.

And finally L3 flushing was added to the plain ring programming in:

0c7c0c8e6f09 ("drm/i915/gen12: Flush L3")

As you discovered this is most likely wrong since bspec say bit 30 on 
TGL is L3 Fabric Flush.

So more questions than answers and it looks access to internal docs will 
be required to get to the bottom of it all.

> Will give some testing here and submit a i915 patch.

I at least tried changing it to bit 30 on ADL but that on it's own 
appeared to make no positive improvement to the transient cache dirt in 
some AuxCCS IGTs. It still requires the large hammer of "drm/xe: Force 
flush system memory AuxCCS framebuffers before scan out".

I am curious to what your testing will show with the use cases you have 
in mind.

Regards,

Tvrtko

>>
>>>   #define   PIPE_CONTROL_AMFS_FLUSH			(1<<25)
>>>   #define   PIPE_CONTROL_GLOBAL_GTT_IVB			(1<<24)
>>>   #define   PIPE_CONTROL_LRI_POST_SYNC			BIT(23)
>>> diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
>>> index a1289f086191..8f655b6fe913 100644
>>> --- a/drivers/gpu/drm/xe/xe_ring_ops.c
>>> +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
>>> @@ -197,6 +197,16 @@ static int emit_render_cache_flush(struct xe_sched_job *job, u32 *dw, int i)
>>>   	if (XE_WA(gt, 1409600907))
>>>   		flags |= PIPE_CONTROL_DEPTH_STALL;
>>>   
>>> +	/*
>>> +	 * L3 fabric flush is needed for AUX CCS invalidation
>>> +	 * which happens as part of pipe-control so we can
>>> +	 * ignore PIPE_CONTROL_FLUSH_L3. Also PIPE_CONTROL_FLUSH_L3
>>> +	 * deals with Protected Memory which is not needed for
>>> +	 * AUX CCS invalidation and lead to unwanted side effects.
>>> +	 */
>>> +	if (GRAPHICS_VERx100(xe) < 1270)
>>> +		flags |= PIPE_CONTROL_FLUSH_L3;
>>> +
>>>   	if (lacks_render)
>>>   		flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
>>>   	else if (job->q->class == XE_ENGINE_CLASS_COMPUTE)


  reply	other threads:[~2025-06-30 12:43 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-27 13:33 [PATCH v7 00/24] AuxCCS handling and render compression modifiers Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 01/24] drm/xe: Consolidate LRC offset calculations Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 02/24] drm/xe: Generalize wa bb emission code Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 03/24] drm/xe: Rename utilisation workaround emission function Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 04/24] drm/xe: Return number of written dwords from workaround batch buffer emission Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 05/24] drm/xe: Allow specifying number of extra dwords at the end of wa bb emission Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 06/24] drm/xe: Add plumbing for indirect context workarounds Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 07/24] drm/xe/xelp: Implement Wa_16010904313 Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 08/24] drm/xe/xelp: Add Wa_18022495364 Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 09/24] drm/xe: Use emit_flush_imm_ggtt helper instead of open coding Tvrtko Ursulin
2025-06-27 21:57   ` Matthew Brost
2025-06-27 13:33 ` [PATCH v7 10/24] drm/xe/xelpg: Flush CCS when flushing caches Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 11/24] drm/xe: Flush L3 when flushing render cache Tvrtko Ursulin
2025-06-27 18:23   ` Souza, Jose
2025-06-27 18:57     ` Souza, Jose
2025-06-30 12:43       ` Tvrtko Ursulin [this message]
2025-06-27 13:33 ` [PATCH v7 12/24] drm/xe/xelp: Quiesce memory traffic before invalidating auxccs Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 13/24] drm/xe/xelp: Support auxccs invalidation on blitter Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 14/24] drm/xe/xelp: Use MI_FLUSH_DW_CCS on auxccs platforms Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 15/24] drm/xe/xelp: Wait for AuxCCS invalidation to complete Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 16/24] drm/xe/xelp: Add AuxCCS invalidation to the buffer migration path Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 17/24] drm/xe: Export xe_emit_aux_table_inv Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 18/24] drm/xe/xelp: Add AuxCCS invalidation to the indirect context workarounds Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 19/24] drm/xe: Use fb cached min alignment Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 20/24] drm/xe: Flush GGTT writes after populating DPT Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 21/24] drm/xe: Handle DPT in system memory Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 22/24] drm/xe: Force flush system memory AuxCCS framebuffers before scan out Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 23/24] drm/xe/display: Add support for AuxCCS Tvrtko Ursulin
2025-06-27 13:33 ` [PATCH v7 24/24] drm/i915/display: Expose AuxCCS frame buffer modifiers for Xe Tvrtko Ursulin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bdcc81af-4eba-4129-a945-e45a71f79285@igalia.com \
    --to=tvrtko.ursulin@igalia.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=jose.souza@intel.com \
    --cc=kernel-dev@igalia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox