From: "Govindapillai, Vinod" <vinod.govindapillai@intel.com>
To: "Sousa, Gustavo" <gustavo.sousa@intel.com>,
"Roper, Matthew D" <matthew.d.roper@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
"Coelho, Luciano" <luciano.coelho@intel.com>,
"Atwood, Matthew S" <matthew.s.atwood@intel.com>,
"Nautiyal, Ankit K" <ankit.k.nautiyal@intel.com>,
"Heikkila, Juha-pekka" <juha-pekka.heikkila@intel.com>,
"intel-gfx@lists.freedesktop.org"
<intel-gfx@lists.freedesktop.org>,
"Bhadane, Dnyaneshwar" <dnyaneshwar.bhadane@intel.com>,
"Chauhan, Shekhar" <shekhar.chauhan@intel.com>,
"Hogander, Jouni" <jouni.hogander@intel.com>,
"De Marchi, Lucas" <lucas.demarchi@intel.com>,
"Vodapalli, Ravi Kumar" <ravi.kumar.vodapalli@intel.com>
Subject: Re: [PATCH v3 21/29] drm/i915/xe3p_lpd: Enable system caching for FBC
Date: Tue, 4 Nov 2025 17:02:34 +0000
Message-ID: <6638166b854fb9fb46d14fe608e51a1c60cb62b7.camel@intel.com>
In-Reply-To: <176227368961.3586.18103273933170980883@intel.com>
On Tue, 2025-11-04 at 13:28 -0300, Gustavo Sousa wrote:
> Quoting Gustavo Sousa (2025-11-04 13:16:01-03:00)
> > Quoting Matt Roper (2025-11-03 21:15:37-03:00)
> > > On Mon, Nov 03, 2025 at 02:18:12PM -0300, Gustavo Sousa wrote:
> > > > From: Vinod Govindapillai <vinod.govindapillai@intel.com>
> > > >
> > > > Configure one of the FBC instances to use system caching. FBC
> > > > read/write requests are tagged as cacheable till a programmed
> > > > limit is reached by the hw.
> > >
> > > What exactly is "system caching?"  We have lots of different
> > > caches in current platforms, and it's not really obvious to me
> > > from the description here (or the bspec page) exactly which
> > > cache(s) are involved here.
> >
> > Perhaps Vinod would provide more assertive answers for this and
> > other questions, but I'll also try to reply based on my research
> > on this topic.
> >
> > Although the Bspec does not make it clear, by digging a bit deeper
> > into other documentation, "system cache" appears to be the
> > SoC-level cache.
> >
> > >
> > > Is turning this on always a win or is it situational?  I.e., is
> > > there any potential for display memory traffic to fill a cache
> > > with FBC data by evicting data that was part of the CPU or GT's
> > > working set?  If so, that seems like it could potentially harm
> > > the performance of other workloads running on the platform.
> > >
> > > Or is this whole thing about a completely new cache (unrelated to
> > > and unusable by anything else) which is devoted solely to FBC?
> >
> > From what I understood in the docs, the value
> > FBC_SYS_CACHE_TAG_USE_RES_SPACE (i.e. 0b11) for field "Cache Tags"
> > indicates that the caching will happen in a reserved space
> > dedicated to the display engine. So I believe we wouldn't be
> > interfering with other agents.
> >
> > >
> > > >
> > > > Bspec: 74722
> > >
> > > You might want to add 68881 here since it has a bit more
> > > information about how we're actually supposed to set the fields
> > > documented on 74722.
> >
> > Agreed.
> >
> > >
> > > > Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
> > > > Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
> > > > ---
> > > > drivers/gpu/drm/i915/display/intel_fbc.c      | 47 +++++++++++++++++++++++++++
> > > > drivers/gpu/drm/i915/display/intel_fbc_regs.h |  9 +++++
> > > > 2 files changed, 56 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c b/drivers/gpu/drm/i915/display/intel_fbc.c
> > > > index 24b72951ea3c..e2e55c58ddbc 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_fbc.c
> > > > +++ b/drivers/gpu/drm/i915/display/intel_fbc.c
> > > > @@ -127,6 +127,9 @@ struct intel_fbc {
> > > > */
> > > > struct intel_fbc_state state;
> > > > const char *no_fbc_reason;
> > > > +
> > > > + /* Only one of FBC instances can use the system cache */
> > > > + bool own_sys_cache;
> >
> > If we go ahead with using this member, I would prefer that we used
> > "owns_sys_cache" (just like we use "has_something" instead of
> > "have_something").
> >
> > > > };
> > > >
> > > > static char fbc_name(enum intel_fbc_id fbc_id)
> > > > @@ -569,12 +572,51 @@ static bool ilk_fbc_is_compressing(struct intel_fbc *fbc)
> > > > return intel_de_read(fbc->display, ILK_DPFC_STATUS(fbc->id)) & DPFC_COMP_SEG_MASK;
> > > > }
> > > >
> > > > +static void nvl_fbc_program_system_cache(struct intel_fbc *fbc, bool enable)
> > > > +{
> > > > + struct intel_display *display = fbc->display;
> > > > + u32 cfb_offset, usage;
> > > > +
> > > > + lockdep_assert_held(&fbc->lock);
> > > > +
> > > > + usage = intel_de_read(display, NVL_FBC_SYS_CACHE_USAGE_CFG);
> > > > +
> > > > + /* System cache already being used by another pipe */
> > > > + if (enable && (usage & FBC_SYS_CACHE_TAG_USE_RES_SPACE))
> > > > + return;
> > >
> > > Rather than relying on the current register contents, should we
> > > be sanitizing this on driver probe (in case the pre-OS firmware
> > > already set this up) and then making our own decisions (as part
> > > of an atomic transaction) about which pipe to prioritize after
> > > that?
> >
> > I agree.
> >
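
FWIW, if we go the sanitize-at-probe route, the sanitize part itself
could stay pretty small. A rough, untested sketch (the helper name and
the exact probe call site are only illustrative, and I'm assuming
intel_de_rmw() is fine to use on this register):

    static void nvl_fbc_sanitize_system_cache(struct intel_display *display)
    {
        if (DISPLAY_VER(display) < 35)
            return;

        /* Put the tag field into a known "don't cache" state, whatever
         * the pre-OS firmware may have left there; the driver then
         * takes ownership explicitly when enabling FBC. */
        intel_de_rmw(display, NVL_FBC_SYS_CACHE_USAGE_CFG,
                     FBC_SYS_CACHE_TAG_MASK, FBC_SYS_CACHE_TAG_DONT_CACHE);
    }

The decision about which pipe actually gets the cache would then live
entirely in the enable path.
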
> > >
> > > > +
> > > > + /* Only the fbc instance which owns system cache can disable it */
> > > > + if (!enable && !fbc->own_sys_cache)
> > > > + return;
> > > > +
> > > > + /*
> > > > + * Not programming the cache limit and cache reading enable bits explicitly
> > > > + * here. The default values should take care of those and that could leave
> > > > + * adjustments of those bits to the system hw policy
> > > > + *
> > > > + * TODO: check if we need to explicitly program these?
> > >
> > > There's no hardware default documented for the range field, so
> > > unless the pre-OS firmware sets it up (which we probably
> > > shouldn't rely on), I'd expect this to be 0; I don't think that's
> > > what we want.
> >
> > Agreed.
> >
> > The Bspec clearly states that we should set "Cacheable Range" to
> > 32, the equivalent of 2MB (i.e. 32 chunks of 64KB). So yes, we
> > shouldn't rely on any existing value and always use 32.
> >
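
Right. So, spelled out with the FBC_SYS_CACHEABLE_RANGE() macro this
patch adds, the enable path should always end up programming the
equivalent of:

    /* 32 chunks of 64KB = 2MB cacheable, per Bspec */
    usage |= FBC_SYS_CACHEABLE_RANGE(32);

rather than trusting whatever value the field currently holds.
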
> > >
> > > > + */
> > > > + cfb_offset = enable ? i915_gem_stolen_node_offset(fbc->compressed_fb) : 0;
> > > > + usage |= FBC_SYS_CACHE_START_BASE(cfb_offset);
> > >
> > > And if something *did* set this up already, then OR'ing a new
> > > address over the old one isn't going to work.  We'd need
> > > "(old & ~mask) | new" to ensure we don't have leftover bits still
> > > set by accident.  But it
> >
> > Yeah. The current code is not accounting for any pre-existing
> > value here and is subject to corruption by simply OR'ing. This
> > needs to be fixed.
> >
> > Another thing to fix here is that the field "Cache Start Base"
> > needs to be in "4k byte chunks" and we are currently using
> > cfb_offset directly instead of applying the necessary shift.
> >
> > > would probably be better to just avoid RMW-style handling in
> > > general and build a complete register value with exactly what we
> > > want rather than trying to modify the pre-existing value.
> >
> > The Bspec says that we should keep other fields with their default
> > values. So, I believe we do need to have RMW logic here.
> >
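
Putting these points together, I'd expect the programming to end up
roughly like this (untested sketch on top of this patch; the
intel_de_rmw() usage and the >> 12 conversion of the stolen offset into
4K chunks are my assumptions, to be double checked against Bspec):

    u32 clear, set;

    /* Only touch the fields we own, leave the rest at their defaults */
    clear = FBC_SYS_CACHE_START_BASE_MASK |
            FBC_SYS_CACHEABLE_RANGE_MASK |
            FBC_SYS_CACHE_TAG_MASK;

    if (enable) {
        /* "Cache Start Base" is expressed in 4K byte chunks */
        u32 base = i915_gem_stolen_node_offset(fbc->compressed_fb) >> 12;

        set = FBC_SYS_CACHE_START_BASE(base) |
              FBC_SYS_CACHEABLE_RANGE(32) |
              FBC_SYS_CACHE_TAG_USE_RES_SPACE;
    } else {
        set = FBC_SYS_CACHE_TAG_DONT_CACHE;
    }

    intel_de_rmw(display, NVL_FBC_SYS_CACHE_USAGE_CFG, clear, set);

That keeps the fields outside of the three masks at their defaults,
which seems to be what the Bspec asks for.
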
> > >
> > > > + usage |= enable ? FBC_SYS_CACHE_TAG_USE_RES_SPACE : FBC_SYS_CACHE_TAG_DONT_CACHE;
> > > > +
> > > > + intel_de_write(display, NVL_FBC_SYS_CACHE_USAGE_CFG, usage);
> > > > +
> > > > + fbc->own_sys_cache = enable;
> > >
> > > It feels like instead of having this as a boolean flag in fbc,
> > > this should be a pointer/ID tracked at the intel_display level.
> > > E.g.,
> > >
> > > display->sys_cache_fbc = fbc;
> >
> > Yeah. A single member instead of one for each FBC seems to be
> > enough.
> >
> > >
> > > or possibly converted over to something tracked with atomic
> > > state so that we can make better high-level decisions about
> > > which FBC we want to enable this on as various displays get
> > > enabled/disabled.
> >
> > That would be nice. I see it as something that could be done as a
> > follow-up.
> >
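
With a display-level pointer along the lines Matt suggests, the
ownership checks in nvl_fbc_program_system_cache() would reduce to
something like this (field name below is just illustrative):

    /* Another FBC instance already owns the system cache */
    if (enable && display->sys_cache_fbc && display->sys_cache_fbc != fbc)
        return;

    /* Only the current owner may release it */
    if (!enable && display->sys_cache_fbc != fbc)
        return;
    ...
    display->sys_cache_fbc = enable ? fbc : NULL;
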
> > >
> > >
> > > Matt
> > >
> > > > +
> > > > + drm_dbg_kms(display->drm, "System caching for FBC[%d] %s\n",
> > > > +             fbc->id, enable ? "configured" : "cleared");
> > > > +}
> > > > +
> > > > static void ilk_fbc_program_cfb(struct intel_fbc *fbc)
> > > > {
> > > > struct intel_display *display = fbc->display;
> > > >
> > > > intel_de_write(display, ILK_DPFC_CB_BASE(fbc->id),
> > > > i915_gem_stolen_node_offset(fbc->compressed_fb));
> > > > +
> > > > + if (DISPLAY_VER(display) >= 35)
>
> I forgot to mention this on my previous email: I think Bspec is
> missing this info, but this feature is applicable only to integrated
> display, so we probably want to limit the condition above to reflect
> that.
>
> Perhaps it would be good to have a macro like
>
> #define HAS_FBC_SYS_CACHE(__display) (DISPLAY_VER(__display) >= 35 && !(__display)->platform.dgfx)
Ack.
BR
Vinod
>
> --
> Gustavo Sousa
>
> > > > + nvl_fbc_program_system_cache(fbc, true);
> >
> > One thing that concerns me here is that we are programming
> > SYS_CACHE_USAGE multiple times and the Bspec seems to indicate
> > that we should do it only once:
> >
> > "Configure SYS_CACHE_USAGE to setup the caching before enabling
> > first FBC and leave it alone after that."
> >
> > I believe we should get some clarification with the HW team to
> > verify if what we are doing here is legal. By doing it multiple
> > times, we could be interfering with other agents (e.g. PCode) that
> > could be doing some dynamic adjustments.
> >
> > --
> > Gustavo Sousa
> >
> > > > }
> > > >
> > > > static const struct intel_fbc_funcs ilk_fbc_funcs = {
> > > > @@ -952,6 +994,8 @@ static void intel_fbc_program_workarounds(struct intel_fbc *fbc)
> > > >
> > > > static void __intel_fbc_cleanup_cfb(struct intel_fbc *fbc)
> > > > {
> > > > + struct intel_display *display = fbc->display;
> > > > +
> > > > if (WARN_ON(intel_fbc_hw_is_active(fbc)))
> > > > return;
> > > >
> > > > @@ -959,6 +1003,9 @@ static void __intel_fbc_cleanup_cfb(struct intel_fbc *fbc)
> > > > i915_gem_stolen_remove_node(fbc->compressed_llb);
> > > > if (i915_gem_stolen_node_allocated(fbc->compressed_fb))
> > > > i915_gem_stolen_remove_node(fbc->compressed_fb);
> > > > +
> > > > + if (DISPLAY_VER(display) >= 35)
> > > > + nvl_fbc_program_system_cache(fbc, false);
> > > > }
> > > >
> > > > void intel_fbc_cleanup(struct intel_display *display)
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_fbc_regs.h b/drivers/gpu/drm/i915/display/intel_fbc_regs.h
> > > > index 77d8321c4fb3..592cd2384255 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_fbc_regs.h
> > > > +++ b/drivers/gpu/drm/i915/display/intel_fbc_regs.h
> > > > @@ -128,4 +128,13 @@
> > > > #define FBC_REND_NUKE REG_BIT(2)
> > > > #define FBC_REND_CACHE_CLEAN REG_BIT(1)
> > > >
> > > > +#define NVL_FBC_SYS_CACHE_USAGE_CFG _MMIO(0x1344E0)
> > > > +#define FBC_SYS_CACHE_START_BASE_MASK REG_GENMASK(31, 16)
> > > > +#define FBC_SYS_CACHE_START_BASE(base) REG_FIELD_PREP(FBC_SYS_CACHE_START_BASE_MASK, (base))
> > > > +#define FBC_SYS_CACHEABLE_RANGE_MASK REG_GENMASK(15, 4)
> > > > +#define FBC_SYS_CACHEABLE_RANGE(range) REG_FIELD_PREP(FBC_SYS_CACHEABLE_RANGE_MASK, (range))
> > > > +#define FBC_SYS_CACHE_TAG_MASK REG_GENMASK(3, 2)
> > > > +#define FBC_SYS_CACHE_TAG_DONT_CACHE REG_FIELD_PREP(FBC_SYS_CACHE_TAG_MASK, 0)
> > > > +#define FBC_SYS_CACHE_TAG_USE_RES_SPACE REG_FIELD_PREP(FBC_SYS_CACHE_TAG_MASK, 3)
> > > > +
> > > > #endif /* __INTEL_FBC_REGS__ */
> > > >
> > > > --
> > > > 2.51.0
> > > >
> > >
> > > --
> > > Matt Roper
> > > Graphics Software Engineer
> > > Linux GPU Platform Enablement
> > > Intel Corporation