From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: intel-xe@lists.freedesktop.org, kernel-dev@igalia.com
Subject: Re: [PATCH v12 11/13] drm/xe: Force flush system memory AuxCCS framebuffers before scan out
Date: Fri, 3 Oct 2025 01:02:42 +0300
Message-ID: <aN72gqfjyhJZsWNA@intel.com>
In-Reply-To: <aN6y2-2GTgOVmbmR@intel.com>

On Thu, Oct 02, 2025 at 08:14:03PM +0300, Ville Syrjälä wrote:
> On Thu, Oct 02, 2025 at 06:04:28PM +0100, Tvrtko Ursulin wrote:
> >
> > On 02/10/2025 17:23, Ville Syrjälä wrote:
> > > On Thu, Oct 02, 2025 at 03:01:08PM +0100, Tvrtko Ursulin wrote:
> > >>
> > >> Hi,
> > >>
> > >> On 26/09/2025 20:35, Ville Syrjälä wrote:
> > >>> On Fri, Sep 26, 2025 at 10:41:56AM +0300, Ville Syrjälä wrote:
> > >>>> I reverse engineered this a bit and there's definitely a
> > >>>> MOCS issue at play.
> > >>>>
> > >>>> First I noticed that if I filled the entire MOCS table with
> > >>>> UC the problem went away. I then filled the entire table
> > >>>> with WB and essentially bisected what I need to make UC
> > >>>> to fix it. And I had to repeat that same process starting
> > >>>> from the other end of table.
> > >>>>
> > >>>> Looks like there is some undocumented magic in the hardware.
> > >>>>
> > >>>> MOCS 61 really is special:
> > >>>> - MOCS 61 UC, others WB, select MOCS 61 -> no corruption
> > >>>>
> > >>>> MOCS 0 and 63 are special in other ways:
> > >>>> - MOCS X UC, others WB, select MOCS X -> corruption
> > >>>> - MOCS X+0 UC, others WB, select MOCS X -> corruption
> > >>>> - MOCS X+63 UC, others WB, select MOCS X -> corruption
> > >>>> - MOCS X+0+63 UC, others WB, select MOCS X -> no corruption
> > >>>> where X != 61
> > >>>
> > >>> OK, the MOCS 63 issue was caused by me having L3=WB still in
> > >>> MOCS X. If I change MOCS X to L3=UC, MOCS 63 no longer makes
> > >>> a difference. I suppose that means MOCS 63 is still used for
> > >>> L3 evictions, even though bspec no longer mentions that fact
> > >>> explicitly.
> > >>>
> > >>> So MOCS 0 is the thing that really matters for CCS. And for
> > >>> MOCS 0 only the LLC WB vs. UC selection matters. L3 WB vs. UC
> > >>> doesn't seem to make any difference.
> > >>>
> > >>> It's interesting that MOCS 60 is documented as a "CCS special case",
> > >>> but in reality it's MOCS 0 that matters for CCS. I wonder if some
> > >>> wires got crossed in the hw design and the wrong MOCS entry ended
> > >>> up being used for CCS and no one noticed...
> > >>
> > >> Oh wow, that is an amazing discovery!
> > >>
> > >> I verified it on my end too. Setting MOCS 0 to uncached and cache dirt
> > >> is gone. No need for the explicit cache flush patch on first pin.
> > >>
> > >> Luckily ADL is unsupported so we could change it to UC. I will send a
> > >> series for CI to see what it will say.
> >
> > So the MOCS 0 UC experiment did not seem to be 100% glitch free. It
> > *looks* like it helps, maybe even a lot, but not fully - three tests
> > still failed due to CRC mismatches.
> >
> > > I think the real fix is to change igt to use MOCS 61 for tgl/adl.
> > > That is what Mesa uses as well.
> >
> > I somehow glossed over the fact you initially wrote 61 worked fine for
> > you and focused only on your X+0+63 combinations. :(
> >
> > 61 works fine for me locally too. Very curious hw behaviour.
> >
> > It would be nice to do a CI run with IGT changed to 61 but AFAIK the xe
> > patchwork/CI does not support the Test-with tag.
> >
> > > Looks like Mesa uses a different MOCS for DG1 and DG2. Those
> > > do seem to line up with what's in bspec, so probably someone
> > > needs to just copy the whole MOCS thing from Mesa into igt.
> >
> > I can have a look.
> >
> > > Looks like Mesa doesn't even use a UC MOCS for anything except
> > > on MTL, so possibly we can just change the TGL MOCS 0 to be the
> > > same WB as on ADL, and maybe that gives some performance benefit
> > > in some cases.
> >
> > On xe, i915 or both?
>
> Both.
>
> Does xe not program the table already according to bspec? I doubt
> we should really care about the "ancient Mesa + xe + TGL" case,
> so the special TGL MOCS table shouldn't be needed on xe IMO.
>
> >
> > >>>> I didn't actually test all values of X there, but I did spot
> > >>>> check a handful of them.
> > >>>>
> > >>>> Also, ADL is affected, but TGL doesn't seem to be. Though I
> > >>>> still need to check the situation on TGL a bit more thoroughly.
> > >>>
> > >>> TGL actually works exactly the same as ADL. The only reason why
> > >>> TGL worked correctly out of the box was that we use a different
> > >>> MOCS table for TGL/RKL (IIRC because we started out with the
> > >>> wrong table and early Mesa versions depended on that), and in
> > >>> that table MOCS 0 is just 0x0, whereas on ADL MOCS 0 is WB.
> > >>
> > >> Kind of sounds familiar but the only commit I found was 3f027d61663f
> > >> ("drm/i915/gt: Add separate MOCS table for Gen12 devices other than
> > >> TGL/RKL") but it is about MOCS 1. What am I missing? Are the hw defaults
> > >> maybe different and not the code?
> > >
> > > The defaults are somehow populated differently depending on
> > > unused_entries_index which is also being set in a very confusing
> > > way (first it is set to 1 (PTE) on everything and then overwritten
> > > with some other value for some of the platforms). The code could
> > > certainly use a good cleanup pass.
> > >
> > > Anyways, the default index ends up being different on TGL and ADL
> > > and thus MOCS 0 ends up different as well.
> >
> > Yep. I missed it and forgot about cfbe5291a189 ("drm/i915/gt: Initialize
> > unused MOCS entries with device specific values").
> >
> > > Since MOCS 0 seems to be special, we should probably populate
> > > it explicitly. And I suppose we should first figure out if
> > > other platforms are also affected.
> >
> > Yeah. If we could only get the full understanding on the details of
> > "specialness".
>
> I filed a bspec issue for it now. I guess we'll see if anyone
> cares anymore...
>
> And I do still want to reverse engineer this on other platforms
> as well.

I did a quick test on ICL (not affected) and MTL (inconclusive
due to apparent lack of L4).

I still couldn't see what is supposed to be special about MOCS 60.
I've not seen any behavioral difference between it and any other
MOCS entry (apart from MOCS 61). I suspect what has happened is
that the hardware was supposed to use MOCS 60 for some non-display
(i.e. not rendered with MOCS 61) CCS stuff but due to some mishap
it actually ends up using MOCS 0.
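
For reference, the first round of ADL observations quoted above can
be written down as a small predicate. This is only a summary of what
was seen (before the L3 WB vs. UC refinement for MOCS X), not a model
of the hardware. MOCS_BIT() and ccs_corruption_observed() are made-up
names for illustration, and the result is only meaningful when the
selected entry itself is in the UC set, since that is all that was
tested.

```c
#include <stdbool.h>
#include <stdint.h>

#define MOCS_BIT(n)	(UINT64_C(1) << (n))

/*
 * Summary of the quoted ADL observations, not a hardware model.
 *
 * uc_mask: bit n set means MOCS n was programmed as UC, all others WB.
 * sel:     MOCS index selected for the surface (0..63). Assumed to be
 *          part of uc_mask, other combinations were not tabulated.
 *
 * Returns true if corruption was observed for that combination.
 */
static bool ccs_corruption_observed(uint64_t uc_mask, unsigned int sel)
{
	const uint64_t special = MOCS_BIT(0) | MOCS_BIT(63);

	/* MOCS 61 UC, others WB, select MOCS 61 -> no corruption */
	if (sel == 61)
		return false;

	/* For X != 61 both MOCS 0 and MOCS 63 had to be UC as well */
	return (uc_mask & special) != special;
}
```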

Apparently MTL still has those special MOCS entries. Though they
might be borked due to the number of MOCS entries being reduced to
16 (4 bits) while the hw might still be internally looking for
the full 6-bit special values (60 and 61). But since MTL
apparently has no L4 it supposedly doesn't matter.

On ARL they added a way to configure those special MOCS indices
via SARB_CHICKEN1. Default for what was MOCS 61 seems to be 13
(just the same value truncated to 4 bits), but the default for
the old MOCS 60 seems to be 0 on ARL. We don't appear to change
those defaults anywhere. So I guess if someone has an ARL with
L4 (dunno if it actually exists) they might see the same
behaviour even if the hardware actually tries to use the
configured MOCS index correctly for the non-display CCS use
case.
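
To spell out the index arithmetic: the sketch below shows what
truncating the old 6-bit special indices to 4 bits would give.
mocs_truncate_4bit() is a made-up helper for illustration only, not
something from the driver or bspec, and it says nothing about what
the hardware really does internally.

```c
/* Hypothetical helper: keep only the low 4 bits of a 6-bit MOCS index. */
static unsigned int mocs_truncate_4bit(unsigned int idx)
{
	return idx & 0xf;
}

/*
 * 61 = 0b111101 -> 0b1101 = 13
 * 60 = 0b111100 -> 0b1100 = 12
 */
```

So 61 maps to 13, which matches the ARL default mentioned above,
while 60 would map to 12 rather than the 0 that ARL seems to
default to.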
--
Ville Syrjälä
Intel