From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
To: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
Cc: intel-xe@lists.freedesktop.org, kernel-dev@igalia.com
Subject: Re: [PATCH v12 11/13] drm/xe: Force flush system memory AuxCCS framebuffers before scan out
Date: Tue, 23 Sep 2025 13:25:58 +0100 [thread overview]
Message-ID: <afd72823-2b19-4d15-b87e-bd879568ad02@igalia.com> (raw)
In-Reply-To: <aNKMKSIGSPfmPBEc@intel.com>
On 23/09/2025 13:01, Ville Syrjälä wrote:
> On Tue, Sep 23, 2025 at 11:48:59AM +0100, Tvrtko Ursulin wrote:
>>
>> On 23/09/2025 11:19, Ville Syrjälä wrote:
>>> On Tue, Sep 23, 2025 at 11:08:04AM +0100, Tvrtko Ursulin wrote:
>>>> Even though frame buffer objects are created as write-combined, in
>>>> practice, on top of all the ring buffer flushing, an additional clflush
>>>> seems to be needed before display engine can coherently scan out the
>>>> AuxCCS compressed data without transient artifacts.
>>>>
>>>> If for comparison we look at how i915 handles things (where AuxCCS works
>>>> fine), as it happens it has this same clflush before a frame buffer is
>>>> pinned for display for the first time, courtesy the dynamic tracking of
>>>> the buffer cache mode and setting the latter to uncached before handing
>>>> to display.
>>>>
>>>> Since xe considers the buffer object caching mode as static we can
>>>> implement the same approach by adding a flag telling us if the buffer
>>>> was ever pinned for display and flush on the first pin. Subsequent re-pins
>>>> will not repeat the clflush but so far I have not observed any glitching
>>>> after the first pin.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>> ---
>>>> drivers/gpu/drm/xe/display/xe_fb_pin.c | 14 +++++++++++++-
>>>> drivers/gpu/drm/xe/xe_bo_types.h | 14 +++++++++-----
>>>> 2 files changed, 22 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/display/xe_fb_pin.c b/drivers/gpu/drm/xe/display/xe_fb_pin.c
>>>> index d8aa23b8cf14..f247c0da6b9e 100644
>>>> --- a/drivers/gpu/drm/xe/display/xe_fb_pin.c
>>>> +++ b/drivers/gpu/drm/xe/display/xe_fb_pin.c
>>>> @@ -382,6 +382,7 @@ static struct i915_vma *__xe_pin_fb_vma(const struct intel_framebuffer *fb,
>>>> struct xe_bo *bo = gem_to_xe_bo(obj);
>>>> struct xe_validation_ctx ctx;
>>>> struct drm_exec exec;
>>>> + bool first_pin;
>>>> int ret = 0;
>>>>
>>>> if (!vma)
>>>> @@ -422,8 +423,11 @@ static struct i915_vma *__xe_pin_fb_vma(const struct intel_framebuffer *fb,
>>>> ret = xe_bo_validate(bo, NULL, true, &exec);
>>>> drm_exec_retry_on_contention(&exec);
>>>> xe_validation_retry_on_oom(&ctx, &ret);
>>>> - if (!ret)
>>>> + if (!ret) {
>>>> ttm_bo_pin(&bo->ttm);
>>>> + first_pin = !bo->display_pin;
>>>> + bo->display_pin = true;
>>>> + }
>>>> }
>>>> if (ret)
>>>> goto err;
>>>> @@ -436,6 +440,14 @@ static struct i915_vma *__xe_pin_fb_vma(const struct intel_framebuffer *fb,
>>>> if (ret)
>>>> goto err_unpin;
>>>>
>>>> + /*
>>>> + * Force flush frame buffer data for non-coherent display access when
>>>> + * AuxCCS formats are used.
>>>> + */
>>>> + if (first_pin && !xe_bo_is_vram(bo) && !xe_bo_is_stolen(bo) &&
>>>> + intel_fb_is_ccs_modifier(fb->base.modifier))
>>>> + drm_clflush_sg(xe_bo_sg(bo));
>>>
>>> You still haven't found the actual bug that causes the dirty cache?
>>
>> Sadly no. I cross referenced everything numerous times, including
>> workarounds, tried pretty much 1:1 i915 vs xe ring buffer programming,
>> but this extra flush always remains required. I also heard that how
>> flushing of the aux metadata works isn't documented anywhere and i915
>> does have this flush, by design or accident I don't know.
>
> i915 has the flush for the *whole* bo because it started out as
> cached. As soon as it undergoes a cached->uncached change (either
> due to set_caching ioctl or becoming a display scanout buffer) we
> clflush it and switch the GPU page tables from WB to UC. After
> that the bo stays uncached, and will need no further clflushes,
> assuming that everyone follows the rules:
>
> - CPU accesses via a WB mapping to an uncached bo will explicitly
> clflush before (invalidate) and after (write-back) the access
> - userspace always sets the GPU to use use "consult the PTE" MOCS
> setting for any potential scanout buffer. That way the GPU will
> use WB as long as the bo is cached, and once it becomes
> uncached the GPU also switches to UC accesses
>
> With xe I presume the BO should already start out as UC/WC with
> clean caches, and nothing should be dirtying the caches unless
> there is a real bug somewhere.
Correct, it is all WC and MOCS are correctly set as WC.
For example for the media compression flavour that was broken and needed
fixing in b412b144685f ("lib/intel/veboxcopy: Respect buffer MOCS
index"). (For the render compression flavour MOCS was already correct.)
But MOCS is not enough on its own for some reason.
I also looked into the rendercopy surface state programming and that too
looks okay. In fact it is possible to see how using the wrong MOCS makes
things worse. But it also appears the surface state MOCS only applies to
the main surface, while the aux state is the one which appears to
contain cached/unflushed data.
Regards,
Tvrtko
next prev parent reply other threads:[~2025-09-23 12:26 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-23 10:07 [PATCH v12 00/13] AuxCCS handling and render compression modifiers Tvrtko Ursulin
2025-09-23 10:07 ` [PATCH v12 01/13] drm/xe/xelpg: Flush CCS when flushing caches Tvrtko Ursulin
2025-09-23 10:07 ` [PATCH v12 02/13] drm/xe/xelp: Quiesce memory traffic before invalidating AuxCCS Tvrtko Ursulin
2025-10-01 15:47 ` Rodrigo Vivi
2025-09-23 10:07 ` [PATCH v12 03/13] drm/xe/xelp: Support auxccs invalidation on blitter Tvrtko Ursulin
2025-09-23 10:07 ` [PATCH v12 04/13] drm/xe/xelp: Use MI_FLUSH_DW_CCS on auxccs platforms Tvrtko Ursulin
2025-09-23 10:07 ` [PATCH v12 05/13] drm/xe/xelp: Wait for AuxCCS invalidation to complete Tvrtko Ursulin
2025-09-23 10:07 ` [PATCH v12 06/13] drm/xe: Export xe_emit_aux_table_inv Tvrtko Ursulin
2025-09-23 10:08 ` [PATCH v12 07/13] drm/xe/xelp: Add AuxCCS invalidation to the indirect context workarounds Tvrtko Ursulin
2025-09-23 10:08 ` [PATCH v12 08/13] drm/xe: Flush GGTT writes after populating DPT Tvrtko Ursulin
2025-09-23 10:08 ` [PATCH v12 09/13] drm/xe: Handle DPT in system memory Tvrtko Ursulin
2025-09-23 10:08 ` [PATCH v12 10/13] drm/xe/display: Add support for AuxCCS Tvrtko Ursulin
2025-09-23 10:08 ` [PATCH v12 11/13] drm/xe: Force flush system memory AuxCCS framebuffers before scan out Tvrtko Ursulin
2025-09-23 10:19 ` Ville Syrjälä
2025-09-23 10:48 ` Tvrtko Ursulin
2025-09-23 12:01 ` Ville Syrjälä
2025-09-23 12:25 ` Tvrtko Ursulin [this message]
2025-09-23 13:20 ` Ville Syrjälä
2025-09-23 14:40 ` Tvrtko Ursulin
2025-09-23 14:52 ` Ville Syrjälä
2025-09-24 13:09 ` Tvrtko Ursulin
2025-09-24 22:35 ` Ville Syrjälä
2025-09-25 7:24 ` Tvrtko Ursulin
2025-09-25 10:08 ` Tvrtko Ursulin
2025-09-26 7:41 ` Ville Syrjälä
2025-09-26 19:35 ` Ville Syrjälä
2025-10-02 14:01 ` Tvrtko Ursulin
2025-10-02 14:36 ` Tvrtko Ursulin
2025-10-02 16:23 ` Ville Syrjälä
2025-10-02 17:04 ` Tvrtko Ursulin
2025-10-02 17:14 ` Ville Syrjälä
2025-10-02 22:02 ` Ville Syrjälä
2025-09-23 10:44 ` [PATCH v13 " Tvrtko Ursulin
2025-09-23 10:08 ` [PATCH v12 12/13] drm/xe: Do not use stolen memory for DPT on IGFX and AuxCCS Tvrtko Ursulin
2025-09-23 10:08 ` [PATCH v12 13/13] drm/i915/display: Expose AuxCCS frame buffer modifiers for Xe Tvrtko Ursulin
2025-09-23 10:15 ` ✗ CI.checkpatch: warning for AuxCCS handling and render compression modifiers (rev15) Patchwork
2025-09-23 10:16 ` ✓ CI.KUnit: success " Patchwork
2025-09-23 11:15 ` ✓ Xe.CI.BAT: " Patchwork
2025-09-23 11:21 ` ✗ CI.checkpatch: warning for AuxCCS handling and render compression modifiers (rev16) Patchwork
2025-09-23 11:22 ` ✓ CI.KUnit: success " Patchwork
2025-09-23 12:03 ` ✓ Xe.CI.BAT: " Patchwork
2025-09-23 13:26 ` ✗ Xe.CI.Full: failure for AuxCCS handling and render compression modifiers (rev15) Patchwork
2025-09-23 14:12 ` ✗ Xe.CI.Full: failure for AuxCCS handling and render compression modifiers (rev16) Patchwork
2025-09-23 20:12 ` [PATCH v12 00/13] AuxCCS handling and render compression modifiers Ville Syrjälä
2025-09-24 7:59 ` Tvrtko Ursulin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=afd72823-2b19-4d15-b87e-bd879568ad02@igalia.com \
--to=tvrtko.ursulin@igalia.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=kernel-dev@igalia.com \
--cc=ville.syrjala@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox