* [RFC][PATCH 01/11] drm/i915: Reject modes with linetime > 64 usec
2025-10-08 18:25 [RFC][PATCH 00/11] drm/i915/prefill: Introduce helpers for prefill latency calculations Ville Syrjala
@ 2025-10-08 18:25 ` Ville Syrjala
2025-10-13 15:15 ` Shankar, Uma
2025-10-08 18:25 ` [RFC][PATCH 02/11] drm/i915/cdclk: Add prefill helpers for CDCLK Ville Syrjala
` (9 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Ville Syrjala @ 2025-10-08 18:25 UTC (permalink / raw)
To: intel-gfx; +Cc: intel-xe
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reject modes whose linetime exceeds 64 usec.
First reason being that WM_LINETIME is limited to (nearly) 64 usec.
Additionally knowing the linetime is bounded will help with
determining whether overflows may be a concern during various
calculations.
I decided to round up, and accept the linetime==64 case. We use
various rounding directions for this in other parts of the code,
so I feel this provides the most consistent result all around.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_display.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index b57efd870774..afa78774eaeb 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -7967,6 +7967,14 @@ enum drm_mode_status intel_mode_valid(struct drm_device *dev,
mode->vtotal > vtotal_max)
return MODE_V_ILLEGAL;
+ /*
+ * WM_LINETIME only goes up to (almost) 64 usec, and also
+ * knowing that the linetime is always bounded will ease the
+ * mind during various calculations.
+ */
+ if (DIV_ROUND_UP(mode->htotal * 1000, mode->clock) > 64)
+ return MODE_H_ILLEGAL;
+
return MODE_OK;
}
--
2.49.1
^ permalink raw reply related [flat|nested] 26+ messages in thread* RE: [RFC][PATCH 01/11] drm/i915: Reject modes with linetime > 64 usec
2025-10-08 18:25 ` [RFC][PATCH 01/11] drm/i915: Reject modes with linetime > 64 usec Ville Syrjala
@ 2025-10-13 15:15 ` Shankar, Uma
0 siblings, 0 replies; 26+ messages in thread
From: Shankar, Uma @ 2025-10-13 15:15 UTC (permalink / raw)
To: Ville Syrjala, intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> Syrjala
> Sent: Wednesday, October 8, 2025 11:56 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org
> Subject: [RFC][PATCH 01/11] drm/i915: Reject modes with linetime > 64 usec
>
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Reject modes whose linetime exceeds 64 usec.
>
> First reason being that WM_LINETIME is limited to (nearly) 64 usec.
>
> Additionally knowing the linetime is bounded will help with determining whether
> overflows may be a concern during various calculations.
>
> I decided to round up, and accept the linetime==64 case. We use various rounding
> directions for this in other parts of the code, so I feel this provides the most
> consistent result all around.
Yeah, this seems fair given max can go upto only 63.875. Good to reject mode.
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/display/intel_display.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> b/drivers/gpu/drm/i915/display/intel_display.c
> index b57efd870774..afa78774eaeb 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -7967,6 +7967,14 @@ enum drm_mode_status intel_mode_valid(struct
> drm_device *dev,
> mode->vtotal > vtotal_max)
> return MODE_V_ILLEGAL;
>
> + /*
> + * WM_LINETIME only goes up to (almost) 64 usec, and also
> + * knowing that the linetime is always bounded will ease the
> + * mind during various calculations.
> + */
> + if (DIV_ROUND_UP(mode->htotal * 1000, mode->clock) > 64)
> + return MODE_H_ILLEGAL;
> +
> return MODE_OK;
> }
>
> --
> 2.49.1
^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH 02/11] drm/i915/cdclk: Add prefill helpers for CDCLK
2025-10-08 18:25 [RFC][PATCH 00/11] drm/i915/prefill: Introduce helpers for prefill latency calculations Ville Syrjala
2025-10-08 18:25 ` [RFC][PATCH 01/11] drm/i915: Reject modes with linetime > 64 usec Ville Syrjala
@ 2025-10-08 18:25 ` Ville Syrjala
2025-10-13 16:11 ` Shankar, Uma
2025-10-08 18:25 ` [RFC][PATCH 03/11] drm/i915/cdclk: Add intel_cdclk_min_cdclk_for_prefill() Ville Syrjala
` (8 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Ville Syrjala @ 2025-10-08 18:25 UTC (permalink / raw)
To: intel-gfx; +Cc: intel-xe
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Add helpers to compute the CDCLKl adjustment factor for prefill
calculations. The adjustment factor is always <= 1.0. That is,
a faster that strictly necessary CDCLK speeds up the pipe prefill.
intel_cdclk_prefill_adjustment_worst() gives out a worst case estimate,
meant to be used during guardband sizing. We can actually do better
than 1.0 here because the absolute minimum CDCLK is limited by the
dotclock. This will still allow planes, pfit, etc. to be changed any
which way without having to resize the guardband yet again.
intel_cdclk_prefill_adjustment() gives out a potentially more optimal
value, purely based on the final minimum CDCLK (also considering
planes/pfit/etc.) for the current pipe. We can't actually check against
the current CDCLK frequency as that might be much higher due to some
other pipe, and said other pipe might later reduce the CDCLK below
what the current pipe would find acceptable (given which WM levels
are enabled). Ie. we don't consider any global constraints (other
pipes, dbuf bandwidth, etc) on the mimimum CDCLK frequency here.
The returned numbers are in .16 binary fixed point.
TODO: the intel_cdclk_prefill_adjustment_worst() approach here
can result in guardband changes for DRRS. But I'm thinking
that is fine since M/N changes will always happen on the
legacy timing generator so guardband doesn't actually matter.
May need to think about this a bit more though...
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_cdclk.c | 87 +++++++++++++++++++++-
drivers/gpu/drm/i915/display/intel_cdclk.h | 4 +
2 files changed, 89 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
index b54b1006aeb0..598593eafdf5 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
@@ -2791,16 +2791,20 @@ static int intel_cdclk_guardband(struct intel_display *display)
return 90;
}
-static int intel_pixel_rate_to_cdclk(const struct intel_crtc_state *crtc_state)
+static int _intel_pixel_rate_to_cdclk(const struct intel_crtc_state *crtc_state, int pixel_rate)
{
struct intel_display *display = to_intel_display(crtc_state);
int ppc = intel_cdclk_ppc(display, crtc_state->double_wide);
int guardband = intel_cdclk_guardband(display);
- int pixel_rate = crtc_state->pixel_rate;
return DIV_ROUND_UP(pixel_rate * 100, guardband * ppc);
}
+static int intel_pixel_rate_to_cdclk(const struct intel_crtc_state *crtc_state)
+{
+ return _intel_pixel_rate_to_cdclk(crtc_state, crtc_state->pixel_rate);
+}
+
static int intel_planes_min_cdclk(const struct intel_crtc_state *crtc_state)
{
struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
@@ -3917,3 +3921,82 @@ void intel_cdclk_read_hw(struct intel_display *display)
cdclk_state->actual = display->cdclk.hw;
cdclk_state->logical = display->cdclk.hw;
}
+
+static int calc_cdclk(const struct intel_crtc_state *crtc_state, int min_cdclk)
+{
+ struct intel_display *display = to_intel_display(crtc_state);
+
+ if (DISPLAY_VER(display) >= 10 || display->platform.broxton) {
+ return bxt_calc_cdclk(display, min_cdclk);
+ } else if (DISPLAY_VER(display) == 9) {
+ int vco;
+
+ vco = display->cdclk.skl_preferred_vco_freq;
+ if (vco == 0)
+ vco = 8100000;
+
+ return skl_calc_cdclk(min_cdclk, vco);
+ } else if (display->platform.broadwell) {
+ return bdw_calc_cdclk(min_cdclk);
+ } else if (display->platform.cherryview || display->platform.valleyview) {
+ return vlv_calc_cdclk(display, min_cdclk);
+ } else {
+ return display->cdclk.max_cdclk_freq;
+ }
+}
+
+static unsigned int _intel_cdclk_prefill_adj(const struct intel_crtc_state *crtc_state,
+ int clock, int min_cdclk)
+{
+ struct intel_display *display = to_intel_display(crtc_state);
+ int ppc = intel_cdclk_ppc(display, crtc_state->double_wide);
+ int cdclk = calc_cdclk(crtc_state, min_cdclk);
+
+ return min(0x10000, DIV_ROUND_UP_ULL((u64)clock << 16, ppc * cdclk));
+}
+
+unsigned int intel_cdclk_prefill_adjustment(const struct intel_crtc_state *crtc_state,
+ const struct intel_cdclk_state *cdclk_state)
+{
+ struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+ int clock = crtc_state->hw.pipe_mode.crtc_clock;
+ int min_cdclk;
+
+ /*
+ * Only consider the current pipe's minimum cdclk here as a safe
+ * lower bound. This must *not* be based on the actual/logical cdclk
+ * frequency here as that may get reduced later due to eg. a modeset
+ * on a different pipe, and that would completely invalidate the
+ * guardband length checks we did on this pipe previously. That
+ * could lead to prefill exceeding the guardband which would result
+ * in underruns.
+ */
+ min_cdclk = intel_cdclk_min_cdclk(cdclk_state, crtc->pipe);
+
+ return _intel_cdclk_prefill_adj(crtc_state, clock, min_cdclk);
+}
+
+unsigned int intel_cdclk_prefill_adjustment_worst(const struct intel_crtc_state *crtc_state)
+{
+ int clock = crtc_state->hw.pipe_mode.crtc_clock;
+ int min_cdclk;
+
+ /*
+ * FIXME could perhaps consider a few more of the factors
+ * that go into cdclk_state->min_cdclk[] here. Namely anything
+ * that only changes during full modesets.
+ *
+ * Should perhaps just cache those into a crtc_state->min_cdclk
+ * at .compute_config() time. Then we could also skip recomputing
+ * those parts during intel_cdclk_atomic_check().
+ *
+ * FIXME this assumes 1:1 scaling, but the other _worst() stuff
+ * assumes max downscaling, so the final result will be
+ * unrealistically bad. Figure out where the actual maximum value
+ * lies and use that to compute a more realistic worst case
+ * estimate...
+ */
+ min_cdclk = _intel_pixel_rate_to_cdclk(crtc_state, clock);
+
+ return _intel_cdclk_prefill_adj(crtc_state, clock, min_cdclk);
+}
diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.h b/drivers/gpu/drm/i915/display/intel_cdclk.h
index cacee598af0e..d97f725a0160 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.h
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.h
@@ -69,4 +69,8 @@ bool intel_cdclk_pmdemand_needs_update(struct intel_atomic_state *state);
void intel_cdclk_force_min_cdclk(struct intel_cdclk_state *cdclk_state, int force_min_cdclk);
void intel_cdclk_read_hw(struct intel_display *display);
+unsigned int intel_cdclk_prefill_adjustment(const struct intel_crtc_state *crtc_state,
+ const struct intel_cdclk_state *cdclk_state);
+unsigned int intel_cdclk_prefill_adjustment_worst(const struct intel_crtc_state *crtc_state);
+
#endif /* __INTEL_CDCLK_H__ */
--
2.49.1
^ permalink raw reply related [flat|nested] 26+ messages in thread* RE: [RFC][PATCH 02/11] drm/i915/cdclk: Add prefill helpers for CDCLK
2025-10-08 18:25 ` [RFC][PATCH 02/11] drm/i915/cdclk: Add prefill helpers for CDCLK Ville Syrjala
@ 2025-10-13 16:11 ` Shankar, Uma
0 siblings, 0 replies; 26+ messages in thread
From: Shankar, Uma @ 2025-10-13 16:11 UTC (permalink / raw)
To: Ville Syrjala, intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of Ville
> Syrjala
> Sent: Wednesday, October 8, 2025 11:56 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org
> Subject: [RFC][PATCH 02/11] drm/i915/cdclk: Add prefill helpers for CDCLK
>
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Add helpers to compute the CDCLKl adjustment factor for prefill calculations. The
> adjustment factor is always <= 1.0. That is, a faster that strictly necessary
> CDCLK speeds up the pipe prefill.
Nit: Typo in CDCLK and last line can be clarified better.
Overall changes look good to me.
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
> intel_cdclk_prefill_adjustment_worst() gives out a worst case estimate, meant to
> be used during guardband sizing. We can actually do better than 1.0 here because
> the absolute minimum CDCLK is limited by the dotclock. This will still allow
> planes, pfit, etc. to be changed any which way without having to resize the
> guardband yet again.
>
> intel_cdclk_prefill_adjustment() gives out a potentially more optimal value, purely
> based on the final minimum CDCLK (also considering
> planes/pfit/etc.) for the current pipe. We can't actually check against the current
> CDCLK frequency as that might be much higher due to some other pipe, and said
> other pipe might later reduce the CDCLK below what the current pipe would find
> acceptable (given which WM levels are enabled). Ie. we don't consider any global
> constraints (other pipes, dbuf bandwidth, etc) on the mimimum CDCLK frequency
> here.
>
> The returned numbers are in .16 binary fixed point.
>
> TODO: the intel_cdclk_prefill_adjustment_worst() approach here
> can result in guardband changes for DRRS. But I'm thinking
> that is fine since M/N changes will always happen on the
> legacy timing generator so guardband doesn't actually matter.
> May need to think about this a bit more though...
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/display/intel_cdclk.c | 87 +++++++++++++++++++++-
> drivers/gpu/drm/i915/display/intel_cdclk.h | 4 +
> 2 files changed, 89 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c
> b/drivers/gpu/drm/i915/display/intel_cdclk.c
> index b54b1006aeb0..598593eafdf5 100644
> --- a/drivers/gpu/drm/i915/display/intel_cdclk.c
> +++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
> @@ -2791,16 +2791,20 @@ static int intel_cdclk_guardband(struct intel_display
> *display)
> return 90;
> }
>
> -static int intel_pixel_rate_to_cdclk(const struct intel_crtc_state *crtc_state)
> +static int _intel_pixel_rate_to_cdclk(const struct intel_crtc_state
> +*crtc_state, int pixel_rate)
> {
> struct intel_display *display = to_intel_display(crtc_state);
> int ppc = intel_cdclk_ppc(display, crtc_state->double_wide);
> int guardband = intel_cdclk_guardband(display);
> - int pixel_rate = crtc_state->pixel_rate;
>
> return DIV_ROUND_UP(pixel_rate * 100, guardband * ppc); }
>
> +static int intel_pixel_rate_to_cdclk(const struct intel_crtc_state
> +*crtc_state) {
> + return _intel_pixel_rate_to_cdclk(crtc_state, crtc_state->pixel_rate);
> +}
> +
> static int intel_planes_min_cdclk(const struct intel_crtc_state *crtc_state) {
> struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> @@ -3917,3 +3921,82 @@ void intel_cdclk_read_hw(struct intel_display
> *display)
> cdclk_state->actual = display->cdclk.hw;
> cdclk_state->logical = display->cdclk.hw; }
> +
> +static int calc_cdclk(const struct intel_crtc_state *crtc_state, int
> +min_cdclk) {
> + struct intel_display *display = to_intel_display(crtc_state);
> +
> + if (DISPLAY_VER(display) >= 10 || display->platform.broxton) {
> + return bxt_calc_cdclk(display, min_cdclk);
> + } else if (DISPLAY_VER(display) == 9) {
> + int vco;
> +
> + vco = display->cdclk.skl_preferred_vco_freq;
> + if (vco == 0)
> + vco = 8100000;
> +
> + return skl_calc_cdclk(min_cdclk, vco);
> + } else if (display->platform.broadwell) {
> + return bdw_calc_cdclk(min_cdclk);
> + } else if (display->platform.cherryview || display->platform.valleyview) {
> + return vlv_calc_cdclk(display, min_cdclk);
> + } else {
> + return display->cdclk.max_cdclk_freq;
> + }
> +}
> +
> +static unsigned int _intel_cdclk_prefill_adj(const struct intel_crtc_state
> *crtc_state,
> + int clock, int min_cdclk)
> +{
> + struct intel_display *display = to_intel_display(crtc_state);
> + int ppc = intel_cdclk_ppc(display, crtc_state->double_wide);
> + int cdclk = calc_cdclk(crtc_state, min_cdclk);
> +
> + return min(0x10000, DIV_ROUND_UP_ULL((u64)clock << 16, ppc *
> cdclk));
> +}
> +
> +unsigned int intel_cdclk_prefill_adjustment(const struct intel_crtc_state
> *crtc_state,
> + const struct intel_cdclk_state
> *cdclk_state) {
> + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> + int clock = crtc_state->hw.pipe_mode.crtc_clock;
> + int min_cdclk;
> +
> + /*
> + * Only consider the current pipe's minimum cdclk here as a safe
> + * lower bound. This must *not* be based on the actual/logical cdclk
> + * frequency here as that may get reduced later due to eg. a modeset
> + * on a different pipe, and that would completely invalidate the
> + * guardband length checks we did on this pipe previously. That
> + * could lead to prefill exceeding the guardband which would result
> + * in underruns.
> + */
> + min_cdclk = intel_cdclk_min_cdclk(cdclk_state, crtc->pipe);
> +
> + return _intel_cdclk_prefill_adj(crtc_state, clock, min_cdclk); }
> +
> +unsigned int intel_cdclk_prefill_adjustment_worst(const struct
> +intel_crtc_state *crtc_state) {
> + int clock = crtc_state->hw.pipe_mode.crtc_clock;
> + int min_cdclk;
> +
> + /*
> + * FIXME could perhaps consider a few more of the factors
> + * that go into cdclk_state->min_cdclk[] here. Namely anything
> + * that only changes during full modesets.
> + *
> + * Should perhaps just cache those into a crtc_state->min_cdclk
> + * at .compute_config() time. Then we could also skip recomputing
> + * those parts during intel_cdclk_atomic_check().
> + *
> + * FIXME this assumes 1:1 scaling, but the other _worst() stuff
> + * assumes max downscaling, so the final result will be
> + * unrealistically bad. Figure out where the actual maximum value
> + * lies and use that to compute a more realistic worst case
> + * estimate...
> + */
> + min_cdclk = _intel_pixel_rate_to_cdclk(crtc_state, clock);
> +
> + return _intel_cdclk_prefill_adj(crtc_state, clock, min_cdclk); }
> diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.h
> b/drivers/gpu/drm/i915/display/intel_cdclk.h
> index cacee598af0e..d97f725a0160 100644
> --- a/drivers/gpu/drm/i915/display/intel_cdclk.h
> +++ b/drivers/gpu/drm/i915/display/intel_cdclk.h
> @@ -69,4 +69,8 @@ bool intel_cdclk_pmdemand_needs_update(struct
> intel_atomic_state *state); void intel_cdclk_force_min_cdclk(struct
> intel_cdclk_state *cdclk_state, int force_min_cdclk); void
> intel_cdclk_read_hw(struct intel_display *display);
>
> +unsigned int intel_cdclk_prefill_adjustment(const struct intel_crtc_state
> *crtc_state,
> + const struct intel_cdclk_state
> *cdclk_state); unsigned int
> +intel_cdclk_prefill_adjustment_worst(const struct intel_crtc_state
> +*crtc_state);
> +
> #endif /* __INTEL_CDCLK_H__ */
> --
> 2.49.1
^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH 03/11] drm/i915/cdclk: Add intel_cdclk_min_cdclk_for_prefill()
2025-10-08 18:25 [RFC][PATCH 00/11] drm/i915/prefill: Introduce helpers for prefill latency calculations Ville Syrjala
2025-10-08 18:25 ` [RFC][PATCH 01/11] drm/i915: Reject modes with linetime > 64 usec Ville Syrjala
2025-10-08 18:25 ` [RFC][PATCH 02/11] drm/i915/cdclk: Add prefill helpers for CDCLK Ville Syrjala
@ 2025-10-08 18:25 ` Ville Syrjala
2025-10-13 16:25 ` Shankar, Uma
2025-10-08 18:25 ` [RFC][PATCH 04/11] drm/i915/dsc: Add prefill helper for DSC Ville Syrjala
` (7 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Ville Syrjala @ 2025-10-08 18:25 UTC (permalink / raw)
To: intel-gfx; +Cc: intel-xe
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Introduce a helper to compute the min required cdclk frequency
for a given guardband size. This could be used to bump up the
cdclk in case the vblank is so small that the normally computed
minimum cdclk results in too slow a prefill.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_cdclk.c | 12 ++++++++++++
drivers/gpu/drm/i915/display/intel_cdclk.h | 3 +++
2 files changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
index 598593eafdf5..37515da056fe 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
@@ -4000,3 +4000,15 @@ unsigned int intel_cdclk_prefill_adjustment_worst(const struct intel_crtc_state
return _intel_cdclk_prefill_adj(crtc_state, clock, min_cdclk);
}
+
+int intel_cdclk_min_cdclk_for_prefill(const struct intel_crtc_state *crtc_state,
+ unsigned int prefill_lines_unadjusted,
+ unsigned int prefill_lines_available)
+{
+ struct intel_display *display = to_intel_display(crtc_state);
+ const struct drm_display_mode *pipe_mode = &crtc_state->hw.pipe_mode;
+ int ppc = intel_cdclk_ppc(display, crtc_state->double_wide);
+
+ return DIV_ROUND_UP_ULL(mul_u32_u32(pipe_mode->crtc_clock, prefill_lines_unadjusted),
+ ppc * prefill_lines_available);
+}
diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.h b/drivers/gpu/drm/i915/display/intel_cdclk.h
index d97f725a0160..b908ce2d9eb6 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.h
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.h
@@ -72,5 +72,8 @@ void intel_cdclk_read_hw(struct intel_display *display);
unsigned int intel_cdclk_prefill_adjustment(const struct intel_crtc_state *crtc_state,
const struct intel_cdclk_state *cdclk_state);
unsigned int intel_cdclk_prefill_adjustment_worst(const struct intel_crtc_state *crtc_state);
+int intel_cdclk_min_cdclk_for_prefill(const struct intel_crtc_state *crtc_state,
+ unsigned int prefill_lines_unadjusted,
+ unsigned int prefill_lines_available);
#endif /* __INTEL_CDCLK_H__ */
--
2.49.1
^ permalink raw reply related [flat|nested] 26+ messages in thread* RE: [RFC][PATCH 03/11] drm/i915/cdclk: Add intel_cdclk_min_cdclk_for_prefill()
2025-10-08 18:25 ` [RFC][PATCH 03/11] drm/i915/cdclk: Add intel_cdclk_min_cdclk_for_prefill() Ville Syrjala
@ 2025-10-13 16:25 ` Shankar, Uma
0 siblings, 0 replies; 26+ messages in thread
From: Shankar, Uma @ 2025-10-13 16:25 UTC (permalink / raw)
To: Ville Syrjala, intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> Syrjala
> Sent: Wednesday, October 8, 2025 11:56 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org
> Subject: [RFC][PATCH 03/11] drm/i915/cdclk: Add
> intel_cdclk_min_cdclk_for_prefill()
>
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Introduce a helper to compute the min required cdclk frequency for a given
> guardband size. This could be used to bump up the cdclk in case the vblank is so
> small that the normally computed minimum cdclk results in too slow a prefill.
Looks Good to me.
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/display/intel_cdclk.c | 12 ++++++++++++
> drivers/gpu/drm/i915/display/intel_cdclk.h | 3 +++
> 2 files changed, 15 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c
> b/drivers/gpu/drm/i915/display/intel_cdclk.c
> index 598593eafdf5..37515da056fe 100644
> --- a/drivers/gpu/drm/i915/display/intel_cdclk.c
> +++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
> @@ -4000,3 +4000,15 @@ unsigned int
> intel_cdclk_prefill_adjustment_worst(const struct intel_crtc_state
>
> return _intel_cdclk_prefill_adj(crtc_state, clock, min_cdclk); }
> +
> +int intel_cdclk_min_cdclk_for_prefill(const struct intel_crtc_state *crtc_state,
> + unsigned int prefill_lines_unadjusted,
> + unsigned int prefill_lines_available) {
> + struct intel_display *display = to_intel_display(crtc_state);
> + const struct drm_display_mode *pipe_mode = &crtc_state-
> >hw.pipe_mode;
> + int ppc = intel_cdclk_ppc(display, crtc_state->double_wide);
> +
> + return DIV_ROUND_UP_ULL(mul_u32_u32(pipe_mode->crtc_clock,
> prefill_lines_unadjusted),
> + ppc * prefill_lines_available);
> +}
> diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.h
> b/drivers/gpu/drm/i915/display/intel_cdclk.h
> index d97f725a0160..b908ce2d9eb6 100644
> --- a/drivers/gpu/drm/i915/display/intel_cdclk.h
> +++ b/drivers/gpu/drm/i915/display/intel_cdclk.h
> @@ -72,5 +72,8 @@ void intel_cdclk_read_hw(struct intel_display *display);
> unsigned int intel_cdclk_prefill_adjustment(const struct intel_crtc_state
> *crtc_state,
> const struct intel_cdclk_state
> *cdclk_state); unsigned int intel_cdclk_prefill_adjustment_worst(const struct
> intel_crtc_state *crtc_state);
> +int intel_cdclk_min_cdclk_for_prefill(const struct intel_crtc_state *crtc_state,
> + unsigned int prefill_lines_unadjusted,
> + unsigned int prefill_lines_available);
>
> #endif /* __INTEL_CDCLK_H__ */
> --
> 2.49.1
^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH 04/11] drm/i915/dsc: Add prefill helper for DSC
2025-10-08 18:25 [RFC][PATCH 00/11] drm/i915/prefill: Introduce helpers for prefill latency calculations Ville Syrjala
` (2 preceding siblings ...)
2025-10-08 18:25 ` [RFC][PATCH 03/11] drm/i915/cdclk: Add intel_cdclk_min_cdclk_for_prefill() Ville Syrjala
@ 2025-10-08 18:25 ` Ville Syrjala
2025-10-13 16:26 ` Shankar, Uma
2025-10-08 18:25 ` [RFC][PATCH 05/11] drm/i915/scaler: Add scaler prefill helpers Ville Syrjala
` (6 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Ville Syrjala @ 2025-10-08 18:25 UTC (permalink / raw)
To: intel-gfx; +Cc: intel-xe
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Add intel_vdsc_prefill_lines() which tells us how many extra lines
of latency the DSC adds to the pipe prefill.
We shouldn't need a "worst case" vs, "current case" split here
as the DSC state should only change during full modesets.
The returned numbers are in .16 binary fixed point.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_vdsc.c | 8 ++++++++
drivers/gpu/drm/i915/display/intel_vdsc.h | 1 +
2 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.c b/drivers/gpu/drm/i915/display/intel_vdsc.c
index 8e799e225af1..bca747e24a7f 100644
--- a/drivers/gpu/drm/i915/display/intel_vdsc.c
+++ b/drivers/gpu/drm/i915/display/intel_vdsc.c
@@ -1077,3 +1077,11 @@ int intel_vdsc_min_cdclk(const struct intel_crtc_state *crtc_state)
return min_cdclk;
}
+
+unsigned int intel_vdsc_prefill_lines(const struct intel_crtc_state *crtc_state)
+{
+ if (!crtc_state->dsc.compression_enable)
+ return 0;
+
+ return 0x18000; /* 1.5 */
+}
diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.h b/drivers/gpu/drm/i915/display/intel_vdsc.h
index 9e2812f99dd7..2139391ff881 100644
--- a/drivers/gpu/drm/i915/display/intel_vdsc.h
+++ b/drivers/gpu/drm/i915/display/intel_vdsc.h
@@ -32,5 +32,6 @@ void intel_dsc_dp_pps_write(struct intel_encoder *encoder,
void intel_vdsc_state_dump(struct drm_printer *p, int indent,
const struct intel_crtc_state *crtc_state);
int intel_vdsc_min_cdclk(const struct intel_crtc_state *crtc_state);
+unsigned int intel_vdsc_prefill_lines(const struct intel_crtc_state *crtc_state);
#endif /* __INTEL_VDSC_H__ */
--
2.49.1
^ permalink raw reply related [flat|nested] 26+ messages in thread* RE: [RFC][PATCH 04/11] drm/i915/dsc: Add prefill helper for DSC
2025-10-08 18:25 ` [RFC][PATCH 04/11] drm/i915/dsc: Add prefill helper for DSC Ville Syrjala
@ 2025-10-13 16:26 ` Shankar, Uma
0 siblings, 0 replies; 26+ messages in thread
From: Shankar, Uma @ 2025-10-13 16:26 UTC (permalink / raw)
To: Ville Syrjala, intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> Syrjala
> Sent: Wednesday, October 8, 2025 11:56 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org
> Subject: [RFC][PATCH 04/11] drm/i915/dsc: Add prefill helper for DSC
>
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Add intel_vdsc_prefill_lines() which tells us how many extra lines of latency the
> DSC adds to the pipe prefill.
>
> We shouldn't need a "worst case" vs, "current case" split here as the DSC state
> should only change during full modesets.
>
> The returned numbers are in .16 binary fixed point.
Looks Good to me.
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/display/intel_vdsc.c | 8 ++++++++
> drivers/gpu/drm/i915/display/intel_vdsc.h | 1 +
> 2 files changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.c
> b/drivers/gpu/drm/i915/display/intel_vdsc.c
> index 8e799e225af1..bca747e24a7f 100644
> --- a/drivers/gpu/drm/i915/display/intel_vdsc.c
> +++ b/drivers/gpu/drm/i915/display/intel_vdsc.c
> @@ -1077,3 +1077,11 @@ int intel_vdsc_min_cdclk(const struct intel_crtc_state
> *crtc_state)
>
> return min_cdclk;
> }
> +
> +unsigned int intel_vdsc_prefill_lines(const struct intel_crtc_state
> +*crtc_state) {
> + if (!crtc_state->dsc.compression_enable)
> + return 0;
> +
> + return 0x18000; /* 1.5 */
> +}
> diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.h
> b/drivers/gpu/drm/i915/display/intel_vdsc.h
> index 9e2812f99dd7..2139391ff881 100644
> --- a/drivers/gpu/drm/i915/display/intel_vdsc.h
> +++ b/drivers/gpu/drm/i915/display/intel_vdsc.h
> @@ -32,5 +32,6 @@ void intel_dsc_dp_pps_write(struct intel_encoder *encoder,
> void intel_vdsc_state_dump(struct drm_printer *p, int indent,
> const struct intel_crtc_state *crtc_state); int
> intel_vdsc_min_cdclk(const struct intel_crtc_state *crtc_state);
> +unsigned int intel_vdsc_prefill_lines(const struct intel_crtc_state
> +*crtc_state);
>
> #endif /* __INTEL_VDSC_H__ */
> --
> 2.49.1
^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH 05/11] drm/i915/scaler: Add scaler prefill helpers
2025-10-08 18:25 [RFC][PATCH 00/11] drm/i915/prefill: Introduce helpers for prefill latency calculations Ville Syrjala
` (3 preceding siblings ...)
2025-10-08 18:25 ` [RFC][PATCH 04/11] drm/i915/dsc: Add prefill helper for DSC Ville Syrjala
@ 2025-10-08 18:25 ` Ville Syrjala
2025-10-13 17:56 ` Shankar, Uma
2025-10-08 18:25 ` [RFC][PATCH 06/11] drm/i195/wm: Add WM0 " Ville Syrjala
` (5 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Ville Syrjala @ 2025-10-08 18:25 UTC (permalink / raw)
To: intel-gfx; +Cc: intel-xe
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Add helpers to compute the required prefill line count and
adjustment factors for the scalers.
The "1st" variants hand out numbers for the first scaler stage
in the pipeline (pipe scaler if no plane scalers are enabled,
or the max from all the plane scaler). The "2nd" variants deal
with second scaler stage (pipe scaler when plane scaling is also
enabled, otherwise there is no second stage).
The _worst() variants give out worst case estimates, meant for
guardband sizing. The other variants are meant for the actual
vblank/guardband length check vs. prefill+pkgc/sagv latency.
A few other helpers are added for the purpose of the WM0 prefill
worst case estimates (to be introduced later).
The returned numbers are in .16 binary fixed point.
TODO: pretty rough, should check the actual scaler max scaling
factros instead of just assuming 3x everywhere
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/display/skl_scaler.c | 174 ++++++++++++++++++++++
drivers/gpu/drm/i915/display/skl_scaler.h | 11 ++
2 files changed, 185 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/skl_scaler.c b/drivers/gpu/drm/i915/display/skl_scaler.c
index c6cccf170ff1..6e90639494ca 100644
--- a/drivers/gpu/drm/i915/display/skl_scaler.c
+++ b/drivers/gpu/drm/i915/display/skl_scaler.c
@@ -968,3 +968,177 @@ void adl_scaler_ecc_unmask(const struct intel_crtc_state *crtc_state)
1);
intel_de_write(display, XELPD_DISPLAY_ERR_FATAL_MASK, 0);
}
+
+static unsigned int skl_scaler_scale(const struct intel_crtc_state *crtc_state, int i)
+{
+ const struct intel_crtc_scaler_state *scaler_state =
+ &crtc_state->scaler_state;
+
+ drm_dbg_kms(to_intel_display(crtc_state)->drm, "scaler %d %x %x\n",
+ i, scaler_state->scalers[i].hscale,
+ scaler_state->scalers[i].vscale);
+
+ return DIV_ROUND_UP_ULL(mul_u32_u32(scaler_state->scalers[i].hscale,
+ scaler_state->scalers[i].vscale),
+ 0x10000);
+}
+
+static unsigned int skl_scaler_downscale(const struct intel_crtc_state *crtc_state, int i)
+{
+ return max(0x10000, skl_scaler_scale(crtc_state, i));
+}
+
+static unsigned int skl_plane_scaler_downscale(const struct intel_crtc_state *crtc_state)
+{
+ struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+ const struct intel_crtc_scaler_state *scaler_state =
+ &crtc_state->scaler_state;
+ unsigned int scale = 0x10000;
+ int i;
+
+ for (i = 0; i < crtc->num_scalers; i++) {
+ /* ignore pfit */
+ if (i == scaler_state->scaler_id)
+ continue;
+
+ if (!scaler_state->scalers[i].in_use)
+ continue;
+
+ scale = max(scale, skl_scaler_downscale(crtc_state, i));
+ }
+
+ return scale;
+}
+
+static unsigned int skl_pipe_scaler_downscale(const struct intel_crtc_state *crtc_state)
+{
+ const struct intel_crtc_scaler_state *scaler_state =
+ &crtc_state->scaler_state;
+
+ if (!crtc_state->pch_pfit.enabled)
+ return 0x10000;
+
+ return skl_scaler_downscale(crtc_state, scaler_state->scaler_id);
+}
+
+unsigned int skl_scaler_1st_prefill_adjustment(const struct intel_crtc_state *crtc_state)
+{
+ const struct intel_crtc_scaler_state *scaler_state =
+ &crtc_state->scaler_state;
+ int num_scalers = hweight32(scaler_state->scaler_users);
+
+ if (num_scalers < 1)
+ return 0x10000;
+
+ if (num_scalers == 1 && crtc_state->pch_pfit.enabled)
+ return skl_pipe_scaler_downscale(crtc_state);
+ else
+ return skl_plane_scaler_downscale(crtc_state);
+}
+
+unsigned int skl_scaler_2nd_prefill_adjustment(const struct intel_crtc_state *crtc_state)
+{
+ const struct intel_crtc_scaler_state *scaler_state =
+ &crtc_state->scaler_state;
+ int num_scalers = hweight32(scaler_state->scaler_users);
+
+ if (num_scalers < 2)
+ return 0x10000;
+
+ return skl_pipe_scaler_downscale(crtc_state);
+}
+
+unsigned int skl_scaler_1st_prefill_lines(const struct intel_crtc_state *crtc_state)
+{
+ const struct intel_crtc_scaler_state *scaler_state =
+ &crtc_state->scaler_state;
+ int num_scalers = hweight32(scaler_state->scaler_users);
+
+ if (num_scalers > 0)
+ return 4 << 16;
+
+ return 0;
+}
+
+unsigned int skl_scaler_2nd_prefill_lines(const struct intel_crtc_state *crtc_state)
+{
+ const struct intel_crtc_scaler_state *scaler_state =
+ &crtc_state->scaler_state;
+ int num_scalers = hweight32(scaler_state->scaler_users);
+
+ if (num_scalers > 1 && crtc_state->pch_pfit.enabled)
+ return 4 << 16;
+
+ return 0;
+}
+
+static unsigned int _skl_scaler_max_scale(const struct intel_crtc_state *crtc_state,
+ unsigned int max_scale)
+{
+ struct intel_display *display = to_intel_display(crtc_state);
+
+ /*
+ * Downscaling requires increasing cdclk, so max scale
+ * factor is limited to the max_dotclock/dotclock ratio.
+ */
+ drm_dbg_kms(display->drm, "max %d, dotclock %d\n",
+ display->cdclk.max_dotclk_freq, crtc_state->hw.pipe_mode.crtc_clock);
+
+ /* FIXME find out the max downscale factors properly */
+ return min(max_scale, DIV_ROUND_UP_ULL((u64)display->cdclk.max_dotclk_freq << 16,
+ crtc_state->hw.pipe_mode.crtc_clock));
+}
+
+static unsigned int skl_scaler_max_scale(const struct intel_crtc_state *crtc_state)
+{
+ struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+ unsigned int max_scale;
+
+ if (crtc->num_scalers < 1)
+ return 0x10000;
+
+ /* FIXME find out the max downscale factors properly */
+ max_scale = 9 << 16;
+
+ return _skl_scaler_max_scale(crtc_state, max_scale);
+}
+
+unsigned int skl_scaler_1st_prefill_adjustment_worst(const struct intel_crtc_state *crtc_state)
+{
+ struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+
+ if (crtc->num_scalers > 0)
+ return skl_scaler_max_scale(crtc_state);
+ else
+ return 0x10000;
+}
+
+unsigned int skl_scaler_2nd_prefill_adjustment_worst(const struct intel_crtc_state *crtc_state)
+{
+ struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+
+ if (crtc->num_scalers > 1)
+ return skl_scaler_max_scale(crtc_state);
+ else
+ return 0x10000;
+}
+
+unsigned int skl_scaler_1st_prefill_lines_worst(const struct intel_crtc_state *crtc_state)
+{
+ struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+
+ if (crtc->num_scalers > 0)
+ return 4 << 16;
+ else
+ return 0;
+}
+
+unsigned int skl_scaler_2nd_prefill_lines_worst(const struct intel_crtc_state *crtc_state)
+{
+ struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+
+ if (crtc->num_scalers > 1)
+ return 4 << 16;
+ else
+ return 0;
+}
diff --git a/drivers/gpu/drm/i915/display/skl_scaler.h b/drivers/gpu/drm/i915/display/skl_scaler.h
index 12a19016c5f6..6fab40d2b4ee 100644
--- a/drivers/gpu/drm/i915/display/skl_scaler.h
+++ b/drivers/gpu/drm/i915/display/skl_scaler.h
@@ -45,4 +45,15 @@ skl_scaler_mode_valid(struct intel_display *display,
void adl_scaler_ecc_mask(const struct intel_crtc_state *crtc_state);
void adl_scaler_ecc_unmask(const struct intel_crtc_state *crtc_state);
+
+unsigned int skl_scaler_1st_prefill_adjustment_worst(const struct intel_crtc_state *crtc_state);
+unsigned int skl_scaler_2nd_prefill_adjustment_worst(const struct intel_crtc_state *crtc_state);
+unsigned int skl_scaler_1st_prefill_lines_worst(const struct intel_crtc_state *crtc_state);
+unsigned int skl_scaler_2nd_prefill_lines_worst(const struct intel_crtc_state *crtc_state);
+
+unsigned int skl_scaler_1st_prefill_adjustment(const struct intel_crtc_state *crtc_state);
+unsigned int skl_scaler_2nd_prefill_adjustment(const struct intel_crtc_state *crtc_state);
+unsigned int skl_scaler_1st_prefill_lines(const struct intel_crtc_state *crtc_state);
+unsigned int skl_scaler_2nd_prefill_lines(const struct intel_crtc_state *crtc_state);
+
#endif
--
2.49.1
^ permalink raw reply related [flat|nested] 26+ messages in thread* RE: [RFC][PATCH 05/11] drm/i915/scaler: Add scaler prefill helpers
2025-10-08 18:25 ` [RFC][PATCH 05/11] drm/i915/scaler: Add scaler prefill helpers Ville Syrjala
@ 2025-10-13 17:56 ` Shankar, Uma
0 siblings, 0 replies; 26+ messages in thread
From: Shankar, Uma @ 2025-10-13 17:56 UTC (permalink / raw)
To: Ville Syrjala, intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> Syrjala
> Sent: Wednesday, October 8, 2025 11:56 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org
> Subject: [RFC][PATCH 05/11] drm/i915/scaler: Add scaler prefill helpers
>
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Add helpers to compute the required prefill line count and adjustment factors for
> the scalers.
>
> The "1st" variants hand out numbers for the first scaler stage in the pipeline (pipe
> scaler if no plane scalers are enabled, or the max from all the plane scaler). The
> "2nd" variants deal with second scaler stage (pipe scaler when plane scaling is
> also enabled, otherwise there is no second stage).
>
> The _worst() variants give out worst case estimates, meant for guardband sizing.
> The other variants are meant for the actual vblank/guardband length check vs.
> prefill+pkgc/sagv latency.
>
> A few other helpers are added for the purpose of the WM0 prefill worst case
> estimates (to be introduced later).
>
> The returned numbers are in .16 binary fixed point.
>
> TODO: pretty rough, should check the actual scaler max scaling
> factros instead of just assuming 3x everywhere
Nit: Typo in factors
Looks Good to me.
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/display/skl_scaler.c | 174 ++++++++++++++++++++++
> drivers/gpu/drm/i915/display/skl_scaler.h | 11 ++
> 2 files changed, 185 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/display/skl_scaler.c
> b/drivers/gpu/drm/i915/display/skl_scaler.c
> index c6cccf170ff1..6e90639494ca 100644
> --- a/drivers/gpu/drm/i915/display/skl_scaler.c
> +++ b/drivers/gpu/drm/i915/display/skl_scaler.c
> @@ -968,3 +968,177 @@ void adl_scaler_ecc_unmask(const struct
> intel_crtc_state *crtc_state)
> 1);
> intel_de_write(display, XELPD_DISPLAY_ERR_FATAL_MASK, 0); }
> +
> +static unsigned int skl_scaler_scale(const struct intel_crtc_state
> +*crtc_state, int i) {
> + const struct intel_crtc_scaler_state *scaler_state =
> + &crtc_state->scaler_state;
> +
> + drm_dbg_kms(to_intel_display(crtc_state)->drm, "scaler %d %x %x\n",
> + i, scaler_state->scalers[i].hscale,
> + scaler_state->scalers[i].vscale);
> +
> + return DIV_ROUND_UP_ULL(mul_u32_u32(scaler_state-
> >scalers[i].hscale,
> + scaler_state->scalers[i].vscale),
> + 0x10000);
> +}
> +
> +static unsigned int skl_scaler_downscale(const struct intel_crtc_state
> +*crtc_state, int i) {
> + return max(0x10000, skl_scaler_scale(crtc_state, i)); }
> +
> +static unsigned int skl_plane_scaler_downscale(const struct
> +intel_crtc_state *crtc_state) {
> + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> + const struct intel_crtc_scaler_state *scaler_state =
> + &crtc_state->scaler_state;
> + unsigned int scale = 0x10000;
> + int i;
> +
> + for (i = 0; i < crtc->num_scalers; i++) {
> + /* ignore pfit */
> + if (i == scaler_state->scaler_id)
> + continue;
> +
> + if (!scaler_state->scalers[i].in_use)
> + continue;
> +
> + scale = max(scale, skl_scaler_downscale(crtc_state, i));
> + }
> +
> + return scale;
> +}
> +
> +static unsigned int skl_pipe_scaler_downscale(const struct
> +intel_crtc_state *crtc_state) {
> + const struct intel_crtc_scaler_state *scaler_state =
> + &crtc_state->scaler_state;
> +
> + if (!crtc_state->pch_pfit.enabled)
> + return 0x10000;
> +
> + return skl_scaler_downscale(crtc_state, scaler_state->scaler_id); }
> +
> +unsigned int skl_scaler_1st_prefill_adjustment(const struct
> +intel_crtc_state *crtc_state) {
> + const struct intel_crtc_scaler_state *scaler_state =
> + &crtc_state->scaler_state;
> + int num_scalers = hweight32(scaler_state->scaler_users);
> +
> + if (num_scalers < 1)
> + return 0x10000;
> +
> + if (num_scalers == 1 && crtc_state->pch_pfit.enabled)
> + return skl_pipe_scaler_downscale(crtc_state);
> + else
> + return skl_plane_scaler_downscale(crtc_state);
> +}
> +
> +unsigned int skl_scaler_2nd_prefill_adjustment(const struct
> +intel_crtc_state *crtc_state) {
> + const struct intel_crtc_scaler_state *scaler_state =
> + &crtc_state->scaler_state;
> + int num_scalers = hweight32(scaler_state->scaler_users);
> +
> + if (num_scalers < 2)
> + return 0x10000;
> +
> + return skl_pipe_scaler_downscale(crtc_state);
> +}
> +
> +unsigned int skl_scaler_1st_prefill_lines(const struct intel_crtc_state
> +*crtc_state) {
> + const struct intel_crtc_scaler_state *scaler_state =
> + &crtc_state->scaler_state;
> + int num_scalers = hweight32(scaler_state->scaler_users);
> +
> + if (num_scalers > 0)
> + return 4 << 16;
> +
> + return 0;
> +}
> +
> +unsigned int skl_scaler_2nd_prefill_lines(const struct intel_crtc_state
> +*crtc_state) {
> + const struct intel_crtc_scaler_state *scaler_state =
> + &crtc_state->scaler_state;
> + int num_scalers = hweight32(scaler_state->scaler_users);
> +
> + if (num_scalers > 1 && crtc_state->pch_pfit.enabled)
> + return 4 << 16;
> +
> + return 0;
> +}
> +
> +static unsigned int _skl_scaler_max_scale(const struct intel_crtc_state
> *crtc_state,
> + unsigned int max_scale)
> +{
> + struct intel_display *display = to_intel_display(crtc_state);
> +
> + /*
> + * Downscaling requires increasing cdclk, so max scale
> + * factor is limited to the max_dotclock/dotclock ratio.
> + */
> + drm_dbg_kms(display->drm, "max %d, dotclock %d\n",
> + display->cdclk.max_dotclk_freq,
> +crtc_state->hw.pipe_mode.crtc_clock);
> +
> + /* FIXME find out the max downscale factors properly */
> + return min(max_scale, DIV_ROUND_UP_ULL((u64)display-
> >cdclk.max_dotclk_freq << 16,
> + crtc_state-
> >hw.pipe_mode.crtc_clock));
> +}
> +
> +static unsigned int skl_scaler_max_scale(const struct intel_crtc_state
> +*crtc_state) {
> + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> + unsigned int max_scale;
> +
> + if (crtc->num_scalers < 1)
> + return 0x10000;
> +
> + /* FIXME find out the max downscale factors properly */
> + max_scale = 9 << 16;
> +
> + return _skl_scaler_max_scale(crtc_state, max_scale); }
> +
> +unsigned int skl_scaler_1st_prefill_adjustment_worst(const struct
> +intel_crtc_state *crtc_state) {
> + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> +
> + if (crtc->num_scalers > 0)
> + return skl_scaler_max_scale(crtc_state);
> + else
> + return 0x10000;
> +}
> +
> +unsigned int skl_scaler_2nd_prefill_adjustment_worst(const struct
> +intel_crtc_state *crtc_state) {
> + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> +
> + if (crtc->num_scalers > 1)
> + return skl_scaler_max_scale(crtc_state);
> + else
> + return 0x10000;
> +}
> +
> +unsigned int skl_scaler_1st_prefill_lines_worst(const struct
> +intel_crtc_state *crtc_state) {
> + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> +
> + if (crtc->num_scalers > 0)
> + return 4 << 16;
> + else
> + return 0;
> +}
> +
> +unsigned int skl_scaler_2nd_prefill_lines_worst(const struct
> +intel_crtc_state *crtc_state) {
> + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> +
> + if (crtc->num_scalers > 1)
> + return 4 << 16;
> + else
> + return 0;
> +}
> diff --git a/drivers/gpu/drm/i915/display/skl_scaler.h
> b/drivers/gpu/drm/i915/display/skl_scaler.h
> index 12a19016c5f6..6fab40d2b4ee 100644
> --- a/drivers/gpu/drm/i915/display/skl_scaler.h
> +++ b/drivers/gpu/drm/i915/display/skl_scaler.h
> @@ -45,4 +45,15 @@ skl_scaler_mode_valid(struct intel_display *display, void
> adl_scaler_ecc_mask(const struct intel_crtc_state *crtc_state);
>
> void adl_scaler_ecc_unmask(const struct intel_crtc_state *crtc_state);
> +
> +unsigned int skl_scaler_1st_prefill_adjustment_worst(const struct
> +intel_crtc_state *crtc_state); unsigned int
> +skl_scaler_2nd_prefill_adjustment_worst(const struct intel_crtc_state
> +*crtc_state); unsigned int skl_scaler_1st_prefill_lines_worst(const
> +struct intel_crtc_state *crtc_state); unsigned int
> +skl_scaler_2nd_prefill_lines_worst(const struct intel_crtc_state
> +*crtc_state);
> +
> +unsigned int skl_scaler_1st_prefill_adjustment(const struct
> +intel_crtc_state *crtc_state); unsigned int
> +skl_scaler_2nd_prefill_adjustment(const struct intel_crtc_state
> +*crtc_state); unsigned int skl_scaler_1st_prefill_lines(const struct
> +intel_crtc_state *crtc_state); unsigned int
> +skl_scaler_2nd_prefill_lines(const struct intel_crtc_state
> +*crtc_state);
> +
> #endif
> --
> 2.49.1
^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH 06/11] drm/i195/wm: Add WM0 prefill helpers
2025-10-08 18:25 [RFC][PATCH 00/11] drm/i915/prefill: Introduce helpers for prefill latency calculations Ville Syrjala
` (4 preceding siblings ...)
2025-10-08 18:25 ` [RFC][PATCH 05/11] drm/i915/scaler: Add scaler prefill helpers Ville Syrjala
@ 2025-10-08 18:25 ` Ville Syrjala
2025-10-13 18:30 ` Shankar, Uma
2025-10-08 18:25 ` [RFC][PATCH 07/11] drm/i915: Introduce intel_compute_global_watermarks_late() Ville Syrjala
` (4 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Ville Syrjala @ 2025-10-08 18:25 UTC (permalink / raw)
To: intel-gfx; +Cc: intel-xe
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Add skl_wm0_prefill_lines() (based on the actual state) and
skl_wm0_prefill_lines_worst() (worst case estimate) which
tell us how many extra lines are needed in prefill for WM0.
The returned numbers are in .16 binary fixed point.
TODO: skl_wm0_prefill_lines_worst() is a bit rough still
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/display/skl_scaler.c | 32 ++++++++++-
drivers/gpu/drm/i915/display/skl_scaler.h | 4 ++
drivers/gpu/drm/i915/display/skl_watermark.c | 59 ++++++++++++++++++++
drivers/gpu/drm/i915/display/skl_watermark.h | 3 +
4 files changed, 97 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/skl_scaler.c b/drivers/gpu/drm/i915/display/skl_scaler.c
index 6e90639494ca..783fee985e84 100644
--- a/drivers/gpu/drm/i915/display/skl_scaler.c
+++ b/drivers/gpu/drm/i915/display/skl_scaler.c
@@ -1089,7 +1089,37 @@ static unsigned int _skl_scaler_max_scale(const struct intel_crtc_state *crtc_st
crtc_state->hw.pipe_mode.crtc_clock));
}
-static unsigned int skl_scaler_max_scale(const struct intel_crtc_state *crtc_state)
+unsigned int skl_scaler_max_total_scale(const struct intel_crtc_state *crtc_state)
+{
+ struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+ unsigned int max_scale;
+
+ if (crtc->num_scalers < 1)
+ return 0x10000;
+
+ /* FIXME find out the max downscale factors properly */
+ max_scale = 9 << 16;
+ if (crtc->num_scalers > 1)
+ max_scale *= 9;
+
+ return _skl_scaler_max_scale(crtc_state, max_scale);
+}
+
+unsigned int skl_scaler_max_hscale(const struct intel_crtc_state *crtc_state)
+{
+ struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+ unsigned int max_scale;
+
+ if (crtc->num_scalers < 1)
+ return 0x10000;
+
+ /* FIXME find out the max downscale factors properly */
+ max_scale = 3 << 16;
+
+ return _skl_scaler_max_scale(crtc_state, max_scale);
+}
+
+unsigned int skl_scaler_max_scale(const struct intel_crtc_state *crtc_state)
{
struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
unsigned int max_scale;
diff --git a/drivers/gpu/drm/i915/display/skl_scaler.h b/drivers/gpu/drm/i915/display/skl_scaler.h
index 6fab40d2b4ee..5deabca909e6 100644
--- a/drivers/gpu/drm/i915/display/skl_scaler.h
+++ b/drivers/gpu/drm/i915/display/skl_scaler.h
@@ -46,6 +46,10 @@ void adl_scaler_ecc_mask(const struct intel_crtc_state *crtc_state);
void adl_scaler_ecc_unmask(const struct intel_crtc_state *crtc_state);
+unsigned int skl_scaler_max_total_scale(const struct intel_crtc_state *crtc_state);
+unsigned int skl_scaler_max_scale(const struct intel_crtc_state *crtc_state);
+unsigned int skl_scaler_max_hscale(const struct intel_crtc_state *crtc_state);
+
unsigned int skl_scaler_1st_prefill_adjustment_worst(const struct intel_crtc_state *crtc_state);
unsigned int skl_scaler_2nd_prefill_adjustment_worst(const struct intel_crtc_state *crtc_state);
unsigned int skl_scaler_1st_prefill_lines_worst(const struct intel_crtc_state *crtc_state);
diff --git a/drivers/gpu/drm/i915/display/skl_watermark.c b/drivers/gpu/drm/i915/display/skl_watermark.c
index 9df9ee137bf9..aac3ca8f6c0f 100644
--- a/drivers/gpu/drm/i915/display/skl_watermark.c
+++ b/drivers/gpu/drm/i915/display/skl_watermark.c
@@ -29,6 +29,7 @@
#include "intel_pcode.h"
#include "intel_plane.h"
#include "intel_wm.h"
+#include "skl_scaler.h"
#include "skl_universal_plane_regs.h"
#include "skl_watermark.h"
#include "skl_watermark_regs.h"
@@ -2244,6 +2245,59 @@ skl_is_vblank_too_short(const struct intel_crtc_state *crtc_state,
adjusted_mode->crtc_vtotal - adjusted_mode->crtc_vblank_start;
}
+unsigned int skl_wm0_prefill_lines_worst(const struct intel_crtc_state *crtc_state)
+{
+ struct intel_display *display = to_intel_display(crtc_state);
+ struct intel_plane *plane = to_intel_plane(crtc_state->uapi.crtc->primary);
+ const struct drm_display_mode *pipe_mode = &crtc_state->hw.pipe_mode;
+ int ret, pixel_rate, width, level = 0;
+ struct skl_wm_level wm = {};
+ struct skl_wm_params wp;
+ unsigned int latency;
+ u64 modifier;
+
+ /*
+ * FIXME rather ugly to pick this by hand but maybe no other way?
+ * FIXME older hw doesn't support 16bpc+scaling so we should figure
+ * out a more realistic modifier+scaling combo on those...
+ */
+ if (DISPLAY_VER(display) == 9)
+ modifier = I915_FORMAT_MOD_Y_TILED_CCS;
+ else if (HAS_4TILE(display))
+ modifier = I915_FORMAT_MOD_4_TILED;
+ else
+ modifier = I915_FORMAT_MOD_Y_TILED;
+
+ pixel_rate = DIV_ROUND_UP_ULL(mul_u32_u32(skl_scaler_max_total_scale(crtc_state),
+ pipe_mode->crtc_clock),
+ 0x10000);
+
+ /* FIXME limit to max plane width? */
+ width = DIV_ROUND_UP_ULL(mul_u32_u32(skl_scaler_max_hscale(crtc_state),
+ pipe_mode->crtc_hdisplay),
+ 0x10000);
+
+ /* FIXME is 90/270 rotation worse than 0/180? */
+ ret = skl_compute_wm_params(crtc_state, width,
+ drm_format_info(DRM_FORMAT_XBGR16161616F),
+ modifier, DRM_MODE_ROTATE_0,
+ pixel_rate, &wp, 0, 1);
+ drm_WARN_ON(display->drm, ret);
+
+ latency = skl_wm_latency(display, level, &wp);
+
+ skl_compute_plane_wm(crtc_state, plane, level, latency, &wp, &wm, &wm);
+
+ /*
+ * FIXME Is this sane? Older hw doesn't even have wm.lines for WM0 so
+ * those will never hit this and just return the computed wm.lines.
+ */
+ if (wm.min_ddb_alloc == U16_MAX)
+ wm.lines = skl_wm_max_lines(display);
+
+ return wm.lines << 16;
+}
+
static int skl_max_wm0_lines(const struct intel_crtc_state *crtc_state)
{
struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
@@ -2260,6 +2314,11 @@ static int skl_max_wm0_lines(const struct intel_crtc_state *crtc_state)
return wm0_lines;
}
+unsigned int skl_wm0_prefill_lines(const struct intel_crtc_state *crtc_state)
+{
+ return skl_max_wm0_lines(crtc_state) << 16;
+}
+
/*
* TODO: In case we use PKG_C_LATENCY to allow C-states when the delayed vblank
* size is too small for the package C exit latency we need to notify PSR about
diff --git a/drivers/gpu/drm/i915/display/skl_watermark.h b/drivers/gpu/drm/i915/display/skl_watermark.h
index 62790816f030..6bc2ec9164bf 100644
--- a/drivers/gpu/drm/i915/display/skl_watermark.h
+++ b/drivers/gpu/drm/i915/display/skl_watermark.h
@@ -79,5 +79,8 @@ void intel_program_dpkgc_latency(struct intel_atomic_state *state);
bool intel_dbuf_pmdemand_needs_update(struct intel_atomic_state *state);
+unsigned int skl_wm0_prefill_lines_worst(const struct intel_crtc_state *crtc_state);
+unsigned int skl_wm0_prefill_lines(const struct intel_crtc_state *crtc_state);
+
#endif /* __SKL_WATERMARK_H__ */
--
2.49.1
^ permalink raw reply related [flat|nested] 26+ messages in thread* RE: [RFC][PATCH 06/11] drm/i195/wm: Add WM0 prefill helpers
2025-10-08 18:25 ` [RFC][PATCH 06/11] drm/i195/wm: Add WM0 " Ville Syrjala
@ 2025-10-13 18:30 ` Shankar, Uma
2025-10-13 21:49 ` Ville Syrjälä
0 siblings, 1 reply; 26+ messages in thread
From: Shankar, Uma @ 2025-10-13 18:30 UTC (permalink / raw)
To: Ville Syrjala, intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> Syrjala
> Sent: Wednesday, October 8, 2025 11:56 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org
> Subject: [RFC][PATCH 06/11] drm/i195/wm: Add WM0 prefill helpers
>
Nit: Typo in i915
Overall looks good to me. More refinement can be done later for the FIXME's
and Todo's
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Add skl_wm0_prefill_lines() (based on the actual state) and
> skl_wm0_prefill_lines_worst() (worst case estimate) which tell us how many extra
> lines are needed in prefill for WM0.
>
> The returned numbers are in .16 binary fixed point.
>
> TODO: skl_wm0_prefill_lines_worst() is a bit rough still
>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/display/skl_scaler.c | 32 ++++++++++-
> drivers/gpu/drm/i915/display/skl_scaler.h | 4 ++
> drivers/gpu/drm/i915/display/skl_watermark.c | 59 ++++++++++++++++++++
> drivers/gpu/drm/i915/display/skl_watermark.h | 3 +
> 4 files changed, 97 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/display/skl_scaler.c
> b/drivers/gpu/drm/i915/display/skl_scaler.c
> index 6e90639494ca..783fee985e84 100644
> --- a/drivers/gpu/drm/i915/display/skl_scaler.c
> +++ b/drivers/gpu/drm/i915/display/skl_scaler.c
> @@ -1089,7 +1089,37 @@ static unsigned int _skl_scaler_max_scale(const
> struct intel_crtc_state *crtc_st
> crtc_state-
> >hw.pipe_mode.crtc_clock));
> }
>
> -static unsigned int skl_scaler_max_scale(const struct intel_crtc_state
> *crtc_state)
> +unsigned int skl_scaler_max_total_scale(const struct intel_crtc_state
> +*crtc_state) {
> + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> + unsigned int max_scale;
> +
> + if (crtc->num_scalers < 1)
> + return 0x10000;
> +
> + /* FIXME find out the max downscale factors properly */
> + max_scale = 9 << 16;
> + if (crtc->num_scalers > 1)
> + max_scale *= 9;
> +
> + return _skl_scaler_max_scale(crtc_state, max_scale); }
> +
> +unsigned int skl_scaler_max_hscale(const struct intel_crtc_state
> +*crtc_state) {
> + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> + unsigned int max_scale;
> +
> + if (crtc->num_scalers < 1)
> + return 0x10000;
> +
> + /* FIXME find out the max downscale factors properly */
> + max_scale = 3 << 16;
> +
> + return _skl_scaler_max_scale(crtc_state, max_scale); }
> +
> +unsigned int skl_scaler_max_scale(const struct intel_crtc_state
> +*crtc_state)
> {
> struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> unsigned int max_scale;
> diff --git a/drivers/gpu/drm/i915/display/skl_scaler.h
> b/drivers/gpu/drm/i915/display/skl_scaler.h
> index 6fab40d2b4ee..5deabca909e6 100644
> --- a/drivers/gpu/drm/i915/display/skl_scaler.h
> +++ b/drivers/gpu/drm/i915/display/skl_scaler.h
> @@ -46,6 +46,10 @@ void adl_scaler_ecc_mask(const struct intel_crtc_state
> *crtc_state);
>
> void adl_scaler_ecc_unmask(const struct intel_crtc_state *crtc_state);
>
> +unsigned int skl_scaler_max_total_scale(const struct intel_crtc_state
> +*crtc_state); unsigned int skl_scaler_max_scale(const struct
> +intel_crtc_state *crtc_state); unsigned int skl_scaler_max_hscale(const
> +struct intel_crtc_state *crtc_state);
> +
> unsigned int skl_scaler_1st_prefill_adjustment_worst(const struct intel_crtc_state
> *crtc_state); unsigned int skl_scaler_2nd_prefill_adjustment_worst(const struct
> intel_crtc_state *crtc_state); unsigned int
> skl_scaler_1st_prefill_lines_worst(const struct intel_crtc_state *crtc_state); diff --
> git a/drivers/gpu/drm/i915/display/skl_watermark.c
> b/drivers/gpu/drm/i915/display/skl_watermark.c
> index 9df9ee137bf9..aac3ca8f6c0f 100644
> --- a/drivers/gpu/drm/i915/display/skl_watermark.c
> +++ b/drivers/gpu/drm/i915/display/skl_watermark.c
> @@ -29,6 +29,7 @@
> #include "intel_pcode.h"
> #include "intel_plane.h"
> #include "intel_wm.h"
> +#include "skl_scaler.h"
> #include "skl_universal_plane_regs.h"
> #include "skl_watermark.h"
> #include "skl_watermark_regs.h"
> @@ -2244,6 +2245,59 @@ skl_is_vblank_too_short(const struct intel_crtc_state
> *crtc_state,
> adjusted_mode->crtc_vtotal - adjusted_mode->crtc_vblank_start;
> }
>
> +unsigned int skl_wm0_prefill_lines_worst(const struct intel_crtc_state
> +*crtc_state) {
> + struct intel_display *display = to_intel_display(crtc_state);
> + struct intel_plane *plane = to_intel_plane(crtc_state->uapi.crtc->primary);
> + const struct drm_display_mode *pipe_mode = &crtc_state-
> >hw.pipe_mode;
> + int ret, pixel_rate, width, level = 0;
> + struct skl_wm_level wm = {};
> + struct skl_wm_params wp;
> + unsigned int latency;
> + u64 modifier;
> +
> + /*
> + * FIXME rather ugly to pick this by hand but maybe no other way?
> + * FIXME older hw doesn't support 16bpc+scaling so we should figure
> + * out a more realistic modifier+scaling combo on those...
> + */
> + if (DISPLAY_VER(display) == 9)
> + modifier = I915_FORMAT_MOD_Y_TILED_CCS;
> + else if (HAS_4TILE(display))
> + modifier = I915_FORMAT_MOD_4_TILED;
> + else
> + modifier = I915_FORMAT_MOD_Y_TILED;
> +
> + pixel_rate =
> DIV_ROUND_UP_ULL(mul_u32_u32(skl_scaler_max_total_scale(crtc_state),
> + pipe_mode->crtc_clock),
> + 0x10000);
> +
> + /* FIXME limit to max plane width? */
> + width =
> DIV_ROUND_UP_ULL(mul_u32_u32(skl_scaler_max_hscale(crtc_state),
> + pipe_mode->crtc_hdisplay),
> + 0x10000);
> +
> + /* FIXME is 90/270 rotation worse than 0/180? */
> + ret = skl_compute_wm_params(crtc_state, width,
> +
> drm_format_info(DRM_FORMAT_XBGR16161616F),
> + modifier, DRM_MODE_ROTATE_0,
> + pixel_rate, &wp, 0, 1);
> + drm_WARN_ON(display->drm, ret);
> +
> + latency = skl_wm_latency(display, level, &wp);
> +
> + skl_compute_plane_wm(crtc_state, plane, level, latency, &wp, &wm,
> +&wm);
> +
> + /*
> + * FIXME Is this sane? Older hw doesn't even have wm.lines for WM0 so
> + * those will never hit this and just return the computed wm.lines.
> + */
> + if (wm.min_ddb_alloc == U16_MAX)
> + wm.lines = skl_wm_max_lines(display);
> +
> + return wm.lines << 16;
> +}
> +
> static int skl_max_wm0_lines(const struct intel_crtc_state *crtc_state) {
> struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> @@ -2260,6 +2314,11 @@ static int skl_max_wm0_lines(const struct
> intel_crtc_state *crtc_state)
> return wm0_lines;
> }
>
> +unsigned int skl_wm0_prefill_lines(const struct intel_crtc_state
> +*crtc_state) {
> + return skl_max_wm0_lines(crtc_state) << 16; }
> +
> /*
> * TODO: In case we use PKG_C_LATENCY to allow C-states when the delayed
> vblank
> * size is too small for the package C exit latency we need to notify PSR about
> diff --git a/drivers/gpu/drm/i915/display/skl_watermark.h
> b/drivers/gpu/drm/i915/display/skl_watermark.h
> index 62790816f030..6bc2ec9164bf 100644
> --- a/drivers/gpu/drm/i915/display/skl_watermark.h
> +++ b/drivers/gpu/drm/i915/display/skl_watermark.h
> @@ -79,5 +79,8 @@ void intel_program_dpkgc_latency(struct intel_atomic_state
> *state);
>
> bool intel_dbuf_pmdemand_needs_update(struct intel_atomic_state *state);
>
> +unsigned int skl_wm0_prefill_lines_worst(const struct intel_crtc_state
> +*crtc_state); unsigned int skl_wm0_prefill_lines(const struct
> +intel_crtc_state *crtc_state);
> +
> #endif /* __SKL_WATERMARK_H__ */
>
> --
> 2.49.1
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: [RFC][PATCH 06/11] drm/i195/wm: Add WM0 prefill helpers
2025-10-13 18:30 ` Shankar, Uma
@ 2025-10-13 21:49 ` Ville Syrjälä
0 siblings, 0 replies; 26+ messages in thread
From: Ville Syrjälä @ 2025-10-13 21:49 UTC (permalink / raw)
To: Shankar, Uma
Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org
On Mon, Oct 13, 2025 at 06:30:20PM +0000, Shankar, Uma wrote:
>
>
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> > Syrjala
> > Sent: Wednesday, October 8, 2025 11:56 PM
> > To: intel-gfx@lists.freedesktop.org
> > Cc: intel-xe@lists.freedesktop.org
> > Subject: [RFC][PATCH 06/11] drm/i195/wm: Add WM0 prefill helpers
> >
>
> Nit: Typo in i915
>
> Overall looks good to me. More refinement can be done later for the FIXME's
> and Todo's
Actually I think I can just drop a bunch of it since this will only
get used for the VRR guardband calculation, so we definitely don't
have to worry about any pre-icl stuff here. I'll probably throw in
a WARN(!HAS_VRR)+comment as a reminder.
> Reviewed-by: Uma Shankar <uma.shankar@intel.com>
>
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > Add skl_wm0_prefill_lines() (based on the actual state) and
> > skl_wm0_prefill_lines_worst() (worst case estimate) which tell us how many extra
> > lines are needed in prefill for WM0.
> >
> > The returned numbers are in .16 binary fixed point.
> >
> > TODO: skl_wm0_prefill_lines_worst() is a bit rough still
> >
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/display/skl_scaler.c | 32 ++++++++++-
> > drivers/gpu/drm/i915/display/skl_scaler.h | 4 ++
> > drivers/gpu/drm/i915/display/skl_watermark.c | 59 ++++++++++++++++++++
> > drivers/gpu/drm/i915/display/skl_watermark.h | 3 +
> > 4 files changed, 97 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/skl_scaler.c
> > b/drivers/gpu/drm/i915/display/skl_scaler.c
> > index 6e90639494ca..783fee985e84 100644
> > --- a/drivers/gpu/drm/i915/display/skl_scaler.c
> > +++ b/drivers/gpu/drm/i915/display/skl_scaler.c
> > @@ -1089,7 +1089,37 @@ static unsigned int _skl_scaler_max_scale(const
> > struct intel_crtc_state *crtc_st
> > crtc_state-
> > >hw.pipe_mode.crtc_clock));
> > }
> >
> > -static unsigned int skl_scaler_max_scale(const struct intel_crtc_state
> > *crtc_state)
> > +unsigned int skl_scaler_max_total_scale(const struct intel_crtc_state
> > +*crtc_state) {
> > + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> > + unsigned int max_scale;
> > +
> > + if (crtc->num_scalers < 1)
> > + return 0x10000;
> > +
> > + /* FIXME find out the max downscale factors properly */
> > + max_scale = 9 << 16;
> > + if (crtc->num_scalers > 1)
> > + max_scale *= 9;
> > +
> > + return _skl_scaler_max_scale(crtc_state, max_scale); }
> > +
> > +unsigned int skl_scaler_max_hscale(const struct intel_crtc_state
> > +*crtc_state) {
> > + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> > + unsigned int max_scale;
> > +
> > + if (crtc->num_scalers < 1)
> > + return 0x10000;
> > +
> > + /* FIXME find out the max downscale factors properly */
> > + max_scale = 3 << 16;
> > +
> > + return _skl_scaler_max_scale(crtc_state, max_scale); }
> > +
> > +unsigned int skl_scaler_max_scale(const struct intel_crtc_state
> > +*crtc_state)
> > {
> > struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> > unsigned int max_scale;
> > diff --git a/drivers/gpu/drm/i915/display/skl_scaler.h
> > b/drivers/gpu/drm/i915/display/skl_scaler.h
> > index 6fab40d2b4ee..5deabca909e6 100644
> > --- a/drivers/gpu/drm/i915/display/skl_scaler.h
> > +++ b/drivers/gpu/drm/i915/display/skl_scaler.h
> > @@ -46,6 +46,10 @@ void adl_scaler_ecc_mask(const struct intel_crtc_state
> > *crtc_state);
> >
> > void adl_scaler_ecc_unmask(const struct intel_crtc_state *crtc_state);
> >
> > +unsigned int skl_scaler_max_total_scale(const struct intel_crtc_state
> > +*crtc_state); unsigned int skl_scaler_max_scale(const struct
> > +intel_crtc_state *crtc_state); unsigned int skl_scaler_max_hscale(const
> > +struct intel_crtc_state *crtc_state);
> > +
> > unsigned int skl_scaler_1st_prefill_adjustment_worst(const struct intel_crtc_state
> > *crtc_state); unsigned int skl_scaler_2nd_prefill_adjustment_worst(const struct
> > intel_crtc_state *crtc_state); unsigned int
> > skl_scaler_1st_prefill_lines_worst(const struct intel_crtc_state *crtc_state); diff --
> > git a/drivers/gpu/drm/i915/display/skl_watermark.c
> > b/drivers/gpu/drm/i915/display/skl_watermark.c
> > index 9df9ee137bf9..aac3ca8f6c0f 100644
> > --- a/drivers/gpu/drm/i915/display/skl_watermark.c
> > +++ b/drivers/gpu/drm/i915/display/skl_watermark.c
> > @@ -29,6 +29,7 @@
> > #include "intel_pcode.h"
> > #include "intel_plane.h"
> > #include "intel_wm.h"
> > +#include "skl_scaler.h"
> > #include "skl_universal_plane_regs.h"
> > #include "skl_watermark.h"
> > #include "skl_watermark_regs.h"
> > @@ -2244,6 +2245,59 @@ skl_is_vblank_too_short(const struct intel_crtc_state
> > *crtc_state,
> > adjusted_mode->crtc_vtotal - adjusted_mode->crtc_vblank_start;
> > }
> >
> > +unsigned int skl_wm0_prefill_lines_worst(const struct intel_crtc_state
> > +*crtc_state) {
> > + struct intel_display *display = to_intel_display(crtc_state);
> > + struct intel_plane *plane = to_intel_plane(crtc_state->uapi.crtc->primary);
> > + const struct drm_display_mode *pipe_mode = &crtc_state-
> > >hw.pipe_mode;
> > + int ret, pixel_rate, width, level = 0;
> > + struct skl_wm_level wm = {};
> > + struct skl_wm_params wp;
> > + unsigned int latency;
> > + u64 modifier;
> > +
> > + /*
> > + * FIXME rather ugly to pick this by hand but maybe no other way?
> > + * FIXME older hw doesn't support 16bpc+scaling so we should figure
> > + * out a more realistic modifier+scaling combo on those...
> > + */
> > + if (DISPLAY_VER(display) == 9)
> > + modifier = I915_FORMAT_MOD_Y_TILED_CCS;
> > + else if (HAS_4TILE(display))
> > + modifier = I915_FORMAT_MOD_4_TILED;
> > + else
> > + modifier = I915_FORMAT_MOD_Y_TILED;
> > +
> > + pixel_rate =
> > DIV_ROUND_UP_ULL(mul_u32_u32(skl_scaler_max_total_scale(crtc_state),
> > + pipe_mode->crtc_clock),
> > + 0x10000);
> > +
> > + /* FIXME limit to max plane width? */
> > + width =
> > DIV_ROUND_UP_ULL(mul_u32_u32(skl_scaler_max_hscale(crtc_state),
> > + pipe_mode->crtc_hdisplay),
> > + 0x10000);
> > +
> > + /* FIXME is 90/270 rotation worse than 0/180? */
> > + ret = skl_compute_wm_params(crtc_state, width,
> > +
> > drm_format_info(DRM_FORMAT_XBGR16161616F),
> > + modifier, DRM_MODE_ROTATE_0,
> > + pixel_rate, &wp, 0, 1);
> > + drm_WARN_ON(display->drm, ret);
> > +
> > + latency = skl_wm_latency(display, level, &wp);
> > +
> > + skl_compute_plane_wm(crtc_state, plane, level, latency, &wp, &wm,
> > +&wm);
> > +
> > + /*
> > + * FIXME Is this sane? Older hw doesn't even have wm.lines for WM0 so
> > + * those will never hit this and just return the computed wm.lines.
> > + */
> > + if (wm.min_ddb_alloc == U16_MAX)
> > + wm.lines = skl_wm_max_lines(display);
> > +
> > + return wm.lines << 16;
> > +}
> > +
> > static int skl_max_wm0_lines(const struct intel_crtc_state *crtc_state) {
> > struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> > @@ -2260,6 +2314,11 @@ static int skl_max_wm0_lines(const struct
> > intel_crtc_state *crtc_state)
> > return wm0_lines;
> > }
> >
> > +unsigned int skl_wm0_prefill_lines(const struct intel_crtc_state
> > +*crtc_state) {
> > + return skl_max_wm0_lines(crtc_state) << 16; }
> > +
> > /*
> > * TODO: In case we use PKG_C_LATENCY to allow C-states when the delayed
> > vblank
> > * size is too small for the package C exit latency we need to notify PSR about
> > diff --git a/drivers/gpu/drm/i915/display/skl_watermark.h
> > b/drivers/gpu/drm/i915/display/skl_watermark.h
> > index 62790816f030..6bc2ec9164bf 100644
> > --- a/drivers/gpu/drm/i915/display/skl_watermark.h
> > +++ b/drivers/gpu/drm/i915/display/skl_watermark.h
> > @@ -79,5 +79,8 @@ void intel_program_dpkgc_latency(struct intel_atomic_state
> > *state);
> >
> > bool intel_dbuf_pmdemand_needs_update(struct intel_atomic_state *state);
> >
> > +unsigned int skl_wm0_prefill_lines_worst(const struct intel_crtc_state
> > +*crtc_state); unsigned int skl_wm0_prefill_lines(const struct
> > +intel_crtc_state *crtc_state);
> > +
> > #endif /* __SKL_WATERMARK_H__ */
> >
> > --
> > 2.49.1
>
--
Ville Syrjälä
Intel
^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH 07/11] drm/i915: Introduce intel_compute_global_watermarks_late()
2025-10-08 18:25 [RFC][PATCH 00/11] drm/i915/prefill: Introduce helpers for prefill latency calculations Ville Syrjala
` (5 preceding siblings ...)
2025-10-08 18:25 ` [RFC][PATCH 06/11] drm/i195/wm: Add WM0 " Ville Syrjala
@ 2025-10-08 18:25 ` Ville Syrjala
2025-10-13 18:36 ` Shankar, Uma
2025-10-08 18:25 ` [RFC][PATCH 08/11] drm/i915/prefill: Introduce intel_prefill.c Ville Syrjala
` (3 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Ville Syrjala @ 2025-10-08 18:25 UTC (permalink / raw)
To: intel-gfx; +Cc: intel-xe
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Add a late watermarks computation stage for skl+. Need this
(temporarily perhaps) to get the final cdclk values into
skl_wm_check_vblank().
But the order is actually wrong now for SAGV. For SAGV we
want to first do skl_wm_check_vblank(), then copute
crttc_state->use_sagv_wm, and then do intel_bw_atomic_check().
Scalers are completely borked as far as skl_wm_check_vblank()
goes as well.
TODO: just a hack really, need to figure out the correct order
for real
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_display.c | 4 ++
.../gpu/drm/i915/display/intel_display_core.h | 1 +
drivers/gpu/drm/i915/display/intel_wm.c | 10 +++++
drivers/gpu/drm/i915/display/intel_wm.h | 1 +
drivers/gpu/drm/i915/display/skl_watermark.c | 38 ++++++++++++++++---
5 files changed, 48 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index afa78774eaeb..54ed36183ebe 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -6455,6 +6455,10 @@ int intel_atomic_check(struct drm_device *dev,
return ret;
}
+ ret = intel_compute_global_watermarks_late(state);
+ if (ret)
+ goto fail;
+
ret = intel_pmdemand_atomic_check(state);
if (ret)
goto fail;
diff --git a/drivers/gpu/drm/i915/display/intel_display_core.h b/drivers/gpu/drm/i915/display/intel_display_core.h
index df4da52cbdb3..62bd894a72f9 100644
--- a/drivers/gpu/drm/i915/display/intel_display_core.h
+++ b/drivers/gpu/drm/i915/display/intel_display_core.h
@@ -89,6 +89,7 @@ struct intel_wm_funcs {
void (*optimize_watermarks)(struct intel_atomic_state *state,
struct intel_crtc *crtc);
int (*compute_global_watermarks)(struct intel_atomic_state *state);
+ int (*compute_global_watermarks_late)(struct intel_atomic_state *state);
void (*get_hw_state)(struct intel_display *display);
void (*sanitize)(struct intel_display *display);
};
diff --git a/drivers/gpu/drm/i915/display/intel_wm.c b/drivers/gpu/drm/i915/display/intel_wm.c
index f887a664fe22..3035d9879d83 100644
--- a/drivers/gpu/drm/i915/display/intel_wm.c
+++ b/drivers/gpu/drm/i915/display/intel_wm.c
@@ -104,6 +104,16 @@ int intel_compute_global_watermarks(struct intel_atomic_state *state)
return 0;
}
+int intel_compute_global_watermarks_late(struct intel_atomic_state *state)
+{
+ struct intel_display *display = to_intel_display(state);
+
+ if (display->funcs.wm->compute_global_watermarks_late)
+ return display->funcs.wm->compute_global_watermarks_late(state);
+
+ return 0;
+}
+
void intel_wm_get_hw_state(struct intel_display *display)
{
if (display->funcs.wm->get_hw_state)
diff --git a/drivers/gpu/drm/i915/display/intel_wm.h b/drivers/gpu/drm/i915/display/intel_wm.h
index 9ad4e9eae5ca..9f69dc5dfda5 100644
--- a/drivers/gpu/drm/i915/display/intel_wm.h
+++ b/drivers/gpu/drm/i915/display/intel_wm.h
@@ -24,6 +24,7 @@ void intel_atomic_update_watermarks(struct intel_atomic_state *state,
void intel_optimize_watermarks(struct intel_atomic_state *state,
struct intel_crtc *crtc);
int intel_compute_global_watermarks(struct intel_atomic_state *state);
+int intel_compute_global_watermarks_late(struct intel_atomic_state *state);
void intel_wm_get_hw_state(struct intel_display *display);
void intel_wm_sanitize(struct intel_display *display);
bool intel_wm_plane_visible(const struct intel_crtc_state *crtc_state,
diff --git a/drivers/gpu/drm/i915/display/skl_watermark.c b/drivers/gpu/drm/i915/display/skl_watermark.c
index aac3ca8f6c0f..5c18fe9a5237 100644
--- a/drivers/gpu/drm/i915/display/skl_watermark.c
+++ b/drivers/gpu/drm/i915/display/skl_watermark.c
@@ -2433,7 +2433,7 @@ static int skl_build_pipe_wm(struct intel_atomic_state *state,
crtc_state->wm.skl.optimal = crtc_state->wm.skl.raw;
- return skl_wm_check_vblank(crtc_state);
+ return 0;
}
static bool skl_wm_level_equals(const struct skl_wm_level *l1,
@@ -3002,11 +3002,6 @@ skl_compute_wm(struct intel_atomic_state *state)
if (ret)
return ret;
- /*
- * skl_compute_ddb() will have adjusted the final watermarks
- * based on how much ddb is available. Now we can actually
- * check if the final watermarks changed.
- */
for_each_new_intel_crtc_in_state(state, crtc, new_crtc_state, i) {
struct skl_pipe_wm *pipe_wm = &new_crtc_state->wm.skl.optimal;
@@ -3030,6 +3025,36 @@ skl_compute_wm(struct intel_atomic_state *state)
pipe_wm->use_sagv_wm = !HAS_HW_SAGV_WM(display) &&
DISPLAY_VER(display) >= 12 &&
intel_crtc_can_enable_sagv(new_crtc_state);
+ }
+
+ return 0;
+}
+
+static int
+skl_compute_wm_late(struct intel_atomic_state *state)
+{
+ struct intel_crtc *crtc;
+ struct intel_crtc_state __maybe_unused *new_crtc_state;
+ int i;
+
+ for_each_new_intel_crtc_in_state(state, crtc, new_crtc_state, i) {
+ int ret;
+
+ /*
+ * FIXME we need something along the lins of the following order:
+ * 1. intel_atomic_setup_scalers() (needed by skl_wm_check_vblank())
+ * 2. intel_modeset_calc_cdclk() (needed by skl_wm_check_vblank())
+ * 3. skl_compute_wm() (skl_wm_check_vblank() + update use_sagv_wm)
+ * 4. intel_bw_atomic_check() (needs use_sagv_wm)
+ * but shuffling all that needs a lot more work...
+ *
+ * For now hack it by deferreing skl_wm_check_vblank() until
+ * intel_modeset_calc_cdclk() has been done. Scalers are still
+ * completely broken wrt. skl_wm_check_vblank().
+ */
+ ret = skl_wm_check_vblank(new_crtc_state);
+ if (ret)
+ return ret;
ret = skl_wm_add_affected_planes(state, crtc);
if (ret)
@@ -4060,6 +4085,7 @@ void intel_wm_state_verify(struct intel_atomic_state *state,
static const struct intel_wm_funcs skl_wm_funcs = {
.compute_global_watermarks = skl_compute_wm,
+ .compute_global_watermarks_late = skl_compute_wm_late,
.get_hw_state = skl_wm_get_hw_state,
.sanitize = skl_wm_sanitize,
};
--
2.49.1
^ permalink raw reply related [flat|nested] 26+ messages in thread* RE: [RFC][PATCH 07/11] drm/i915: Introduce intel_compute_global_watermarks_late()
2025-10-08 18:25 ` [RFC][PATCH 07/11] drm/i915: Introduce intel_compute_global_watermarks_late() Ville Syrjala
@ 2025-10-13 18:36 ` Shankar, Uma
2025-10-13 20:35 ` Ville Syrjälä
0 siblings, 1 reply; 26+ messages in thread
From: Shankar, Uma @ 2025-10-13 18:36 UTC (permalink / raw)
To: Ville Syrjala, intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> Syrjala
> Sent: Wednesday, October 8, 2025 11:56 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org
> Subject: [RFC][PATCH 07/11] drm/i915: Introduce
> intel_compute_global_watermarks_late()
>
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Add a late watermarks computation stage for skl+. Need this (temporarily
> perhaps) to get the final cdclk values into skl_wm_check_vblank().
>
> But the order is actually wrong now for SAGV. For SAGV we want to first do
> skl_wm_check_vblank(), then copute crttc_state->use_sagv_wm, and then do
Nit: Typo in crtc
> intel_bw_atomic_check().
>
> Scalers are completely borked as far as skl_wm_check_vblank() goes as well.
>
> TODO: just a hack really, need to figure out the correct order
> for real
We can go with hack for now and take up the re-factoring later to sort out the correct order.
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/display/intel_display.c | 4 ++
> .../gpu/drm/i915/display/intel_display_core.h | 1 +
> drivers/gpu/drm/i915/display/intel_wm.c | 10 +++++
> drivers/gpu/drm/i915/display/intel_wm.h | 1 +
> drivers/gpu/drm/i915/display/skl_watermark.c | 38 ++++++++++++++++---
> 5 files changed, 48 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> b/drivers/gpu/drm/i915/display/intel_display.c
> index afa78774eaeb..54ed36183ebe 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -6455,6 +6455,10 @@ int intel_atomic_check(struct drm_device *dev,
> return ret;
> }
>
> + ret = intel_compute_global_watermarks_late(state);
> + if (ret)
> + goto fail;
> +
> ret = intel_pmdemand_atomic_check(state);
> if (ret)
> goto fail;
> diff --git a/drivers/gpu/drm/i915/display/intel_display_core.h
> b/drivers/gpu/drm/i915/display/intel_display_core.h
> index df4da52cbdb3..62bd894a72f9 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_core.h
> +++ b/drivers/gpu/drm/i915/display/intel_display_core.h
> @@ -89,6 +89,7 @@ struct intel_wm_funcs {
> void (*optimize_watermarks)(struct intel_atomic_state *state,
> struct intel_crtc *crtc);
> int (*compute_global_watermarks)(struct intel_atomic_state *state);
> + int (*compute_global_watermarks_late)(struct intel_atomic_state
> +*state);
> void (*get_hw_state)(struct intel_display *display);
> void (*sanitize)(struct intel_display *display); }; diff --git
> a/drivers/gpu/drm/i915/display/intel_wm.c
> b/drivers/gpu/drm/i915/display/intel_wm.c
> index f887a664fe22..3035d9879d83 100644
> --- a/drivers/gpu/drm/i915/display/intel_wm.c
> +++ b/drivers/gpu/drm/i915/display/intel_wm.c
> @@ -104,6 +104,16 @@ int intel_compute_global_watermarks(struct
> intel_atomic_state *state)
> return 0;
> }
>
> +int intel_compute_global_watermarks_late(struct intel_atomic_state
> +*state) {
> + struct intel_display *display = to_intel_display(state);
> +
> + if (display->funcs.wm->compute_global_watermarks_late)
> + return display->funcs.wm-
> >compute_global_watermarks_late(state);
> +
> + return 0;
> +}
> +
> void intel_wm_get_hw_state(struct intel_display *display) {
> if (display->funcs.wm->get_hw_state)
> diff --git a/drivers/gpu/drm/i915/display/intel_wm.h
> b/drivers/gpu/drm/i915/display/intel_wm.h
> index 9ad4e9eae5ca..9f69dc5dfda5 100644
> --- a/drivers/gpu/drm/i915/display/intel_wm.h
> +++ b/drivers/gpu/drm/i915/display/intel_wm.h
> @@ -24,6 +24,7 @@ void intel_atomic_update_watermarks(struct
> intel_atomic_state *state, void intel_optimize_watermarks(struct
> intel_atomic_state *state,
> struct intel_crtc *crtc);
> int intel_compute_global_watermarks(struct intel_atomic_state *state);
> +int intel_compute_global_watermarks_late(struct intel_atomic_state
> +*state);
> void intel_wm_get_hw_state(struct intel_display *display); void
> intel_wm_sanitize(struct intel_display *display); bool
> intel_wm_plane_visible(const struct intel_crtc_state *crtc_state, diff --git
> a/drivers/gpu/drm/i915/display/skl_watermark.c
> b/drivers/gpu/drm/i915/display/skl_watermark.c
> index aac3ca8f6c0f..5c18fe9a5237 100644
> --- a/drivers/gpu/drm/i915/display/skl_watermark.c
> +++ b/drivers/gpu/drm/i915/display/skl_watermark.c
> @@ -2433,7 +2433,7 @@ static int skl_build_pipe_wm(struct intel_atomic_state
> *state,
>
> crtc_state->wm.skl.optimal = crtc_state->wm.skl.raw;
>
> - return skl_wm_check_vblank(crtc_state);
> + return 0;
> }
>
> static bool skl_wm_level_equals(const struct skl_wm_level *l1, @@ -3002,11
> +3002,6 @@ skl_compute_wm(struct intel_atomic_state *state)
> if (ret)
> return ret;
>
> - /*
> - * skl_compute_ddb() will have adjusted the final watermarks
> - * based on how much ddb is available. Now we can actually
> - * check if the final watermarks changed.
> - */
> for_each_new_intel_crtc_in_state(state, crtc, new_crtc_state, i) {
> struct skl_pipe_wm *pipe_wm = &new_crtc_state-
> >wm.skl.optimal;
>
> @@ -3030,6 +3025,36 @@ skl_compute_wm(struct intel_atomic_state *state)
> pipe_wm->use_sagv_wm = !HAS_HW_SAGV_WM(display) &&
> DISPLAY_VER(display) >= 12 &&
> intel_crtc_can_enable_sagv(new_crtc_state);
> + }
> +
> + return 0;
> +}
> +
> +static int
> +skl_compute_wm_late(struct intel_atomic_state *state) {
> + struct intel_crtc *crtc;
> + struct intel_crtc_state __maybe_unused *new_crtc_state;
> + int i;
> +
> + for_each_new_intel_crtc_in_state(state, crtc, new_crtc_state, i) {
> + int ret;
> +
> + /*
> + * FIXME we need something along the lins of the following order:
> + * 1. intel_atomic_setup_scalers() (needed by
> skl_wm_check_vblank())
> + * 2. intel_modeset_calc_cdclk() (needed by
> skl_wm_check_vblank())
> + * 3. skl_compute_wm() (skl_wm_check_vblank() + update
> use_sagv_wm)
> + * 4. intel_bw_atomic_check() (needs use_sagv_wm)
> + * but shuffling all that needs a lot more work...
> + *
> + * For now hack it by deferreing skl_wm_check_vblank() until
> + * intel_modeset_calc_cdclk() has been done. Scalers are still
> + * completely broken wrt. skl_wm_check_vblank().
> + */
> + ret = skl_wm_check_vblank(new_crtc_state);
> + if (ret)
> + return ret;
>
> ret = skl_wm_add_affected_planes(state, crtc);
> if (ret)
> @@ -4060,6 +4085,7 @@ void intel_wm_state_verify(struct intel_atomic_state
> *state,
>
> static const struct intel_wm_funcs skl_wm_funcs = {
> .compute_global_watermarks = skl_compute_wm,
> + .compute_global_watermarks_late = skl_compute_wm_late,
> .get_hw_state = skl_wm_get_hw_state,
> .sanitize = skl_wm_sanitize,
> };
> --
> 2.49.1
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: [RFC][PATCH 07/11] drm/i915: Introduce intel_compute_global_watermarks_late()
2025-10-13 18:36 ` Shankar, Uma
@ 2025-10-13 20:35 ` Ville Syrjälä
0 siblings, 0 replies; 26+ messages in thread
From: Ville Syrjälä @ 2025-10-13 20:35 UTC (permalink / raw)
To: Shankar, Uma
Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org
On Mon, Oct 13, 2025 at 06:36:07PM +0000, Shankar, Uma wrote:
>
>
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> > Syrjala
> > Sent: Wednesday, October 8, 2025 11:56 PM
> > To: intel-gfx@lists.freedesktop.org
> > Cc: intel-xe@lists.freedesktop.org
> > Subject: [RFC][PATCH 07/11] drm/i915: Introduce
> > intel_compute_global_watermarks_late()
> >
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > Add a late watermarks computation stage for skl+. Need this (temporarily
> > perhaps) to get the final cdclk values into skl_wm_check_vblank().
> >
> > But the order is actually wrong now for SAGV. For SAGV we want to first do
> > skl_wm_check_vblank(), then copute crttc_state->use_sagv_wm, and then do
>
> Nit: Typo in crtc
>
> > intel_bw_atomic_check().
> >
> > Scalers are completely borked as far as skl_wm_check_vblank() goes as well.
> >
> > TODO: just a hack really, need to figure out the correct order
> > for real
>
> We can go with hack for now and take up the re-factoring later to sort out the correct order.
I just send a series that should solve this particular problem
as far as the cdclk goes:
https://patchwork.freedesktop.org/series/155860/
I still need to rebase this prefill stuff on top of it, at
which point I should know for sure whether it really fixes
the order or not.
> Reviewed-by: Uma Shankar <uma.shankar@intel.com>
>
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/display/intel_display.c | 4 ++
> > .../gpu/drm/i915/display/intel_display_core.h | 1 +
> > drivers/gpu/drm/i915/display/intel_wm.c | 10 +++++
> > drivers/gpu/drm/i915/display/intel_wm.h | 1 +
> > drivers/gpu/drm/i915/display/skl_watermark.c | 38 ++++++++++++++++---
> > 5 files changed, 48 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > b/drivers/gpu/drm/i915/display/intel_display.c
> > index afa78774eaeb..54ed36183ebe 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > @@ -6455,6 +6455,10 @@ int intel_atomic_check(struct drm_device *dev,
> > return ret;
> > }
> >
> > + ret = intel_compute_global_watermarks_late(state);
> > + if (ret)
> > + goto fail;
> > +
> > ret = intel_pmdemand_atomic_check(state);
> > if (ret)
> > goto fail;
> > diff --git a/drivers/gpu/drm/i915/display/intel_display_core.h
> > b/drivers/gpu/drm/i915/display/intel_display_core.h
> > index df4da52cbdb3..62bd894a72f9 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display_core.h
> > +++ b/drivers/gpu/drm/i915/display/intel_display_core.h
> > @@ -89,6 +89,7 @@ struct intel_wm_funcs {
> > void (*optimize_watermarks)(struct intel_atomic_state *state,
> > struct intel_crtc *crtc);
> > int (*compute_global_watermarks)(struct intel_atomic_state *state);
> > + int (*compute_global_watermarks_late)(struct intel_atomic_state
> > +*state);
> > void (*get_hw_state)(struct intel_display *display);
> > void (*sanitize)(struct intel_display *display); }; diff --git
> > a/drivers/gpu/drm/i915/display/intel_wm.c
> > b/drivers/gpu/drm/i915/display/intel_wm.c
> > index f887a664fe22..3035d9879d83 100644
> > --- a/drivers/gpu/drm/i915/display/intel_wm.c
> > +++ b/drivers/gpu/drm/i915/display/intel_wm.c
> > @@ -104,6 +104,16 @@ int intel_compute_global_watermarks(struct
> > intel_atomic_state *state)
> > return 0;
> > }
> >
> > +int intel_compute_global_watermarks_late(struct intel_atomic_state
> > +*state) {
> > + struct intel_display *display = to_intel_display(state);
> > +
> > + if (display->funcs.wm->compute_global_watermarks_late)
> > + return display->funcs.wm-
> > >compute_global_watermarks_late(state);
> > +
> > + return 0;
> > +}
> > +
> > void intel_wm_get_hw_state(struct intel_display *display) {
> > if (display->funcs.wm->get_hw_state)
> > diff --git a/drivers/gpu/drm/i915/display/intel_wm.h
> > b/drivers/gpu/drm/i915/display/intel_wm.h
> > index 9ad4e9eae5ca..9f69dc5dfda5 100644
> > --- a/drivers/gpu/drm/i915/display/intel_wm.h
> > +++ b/drivers/gpu/drm/i915/display/intel_wm.h
> > @@ -24,6 +24,7 @@ void intel_atomic_update_watermarks(struct
> > intel_atomic_state *state, void intel_optimize_watermarks(struct
> > intel_atomic_state *state,
> > struct intel_crtc *crtc);
> > int intel_compute_global_watermarks(struct intel_atomic_state *state);
> > +int intel_compute_global_watermarks_late(struct intel_atomic_state
> > +*state);
> > void intel_wm_get_hw_state(struct intel_display *display); void
> > intel_wm_sanitize(struct intel_display *display); bool
> > intel_wm_plane_visible(const struct intel_crtc_state *crtc_state, diff --git
> > a/drivers/gpu/drm/i915/display/skl_watermark.c
> > b/drivers/gpu/drm/i915/display/skl_watermark.c
> > index aac3ca8f6c0f..5c18fe9a5237 100644
> > --- a/drivers/gpu/drm/i915/display/skl_watermark.c
> > +++ b/drivers/gpu/drm/i915/display/skl_watermark.c
> > @@ -2433,7 +2433,7 @@ static int skl_build_pipe_wm(struct intel_atomic_state
> > *state,
> >
> > crtc_state->wm.skl.optimal = crtc_state->wm.skl.raw;
> >
> > - return skl_wm_check_vblank(crtc_state);
> > + return 0;
> > }
> >
> > static bool skl_wm_level_equals(const struct skl_wm_level *l1, @@ -3002,11
> > +3002,6 @@ skl_compute_wm(struct intel_atomic_state *state)
> > if (ret)
> > return ret;
> >
> > - /*
> > - * skl_compute_ddb() will have adjusted the final watermarks
> > - * based on how much ddb is available. Now we can actually
> > - * check if the final watermarks changed.
> > - */
> > for_each_new_intel_crtc_in_state(state, crtc, new_crtc_state, i) {
> > struct skl_pipe_wm *pipe_wm = &new_crtc_state-
> > >wm.skl.optimal;
> >
> > @@ -3030,6 +3025,36 @@ skl_compute_wm(struct intel_atomic_state *state)
> > pipe_wm->use_sagv_wm = !HAS_HW_SAGV_WM(display) &&
> > DISPLAY_VER(display) >= 12 &&
> > intel_crtc_can_enable_sagv(new_crtc_state);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int
> > +skl_compute_wm_late(struct intel_atomic_state *state) {
> > + struct intel_crtc *crtc;
> > + struct intel_crtc_state __maybe_unused *new_crtc_state;
> > + int i;
> > +
> > + for_each_new_intel_crtc_in_state(state, crtc, new_crtc_state, i) {
> > + int ret;
> > +
> > + /*
> > + * FIXME we need something along the lins of the following order:
> > + * 1. intel_atomic_setup_scalers() (needed by
> > skl_wm_check_vblank())
> > + * 2. intel_modeset_calc_cdclk() (needed by
> > skl_wm_check_vblank())
> > + * 3. skl_compute_wm() (skl_wm_check_vblank() + update
> > use_sagv_wm)
> > + * 4. intel_bw_atomic_check() (needs use_sagv_wm)
> > + * but shuffling all that needs a lot more work...
> > + *
> > + * For now hack it by deferreing skl_wm_check_vblank() until
> > + * intel_modeset_calc_cdclk() has been done. Scalers are still
> > + * completely broken wrt. skl_wm_check_vblank().
> > + */
> > + ret = skl_wm_check_vblank(new_crtc_state);
> > + if (ret)
> > + return ret;
> >
> > ret = skl_wm_add_affected_planes(state, crtc);
> > if (ret)
> > @@ -4060,6 +4085,7 @@ void intel_wm_state_verify(struct intel_atomic_state
> > *state,
> >
> > static const struct intel_wm_funcs skl_wm_funcs = {
> > .compute_global_watermarks = skl_compute_wm,
> > + .compute_global_watermarks_late = skl_compute_wm_late,
> > .get_hw_state = skl_wm_get_hw_state,
> > .sanitize = skl_wm_sanitize,
> > };
> > --
> > 2.49.1
>
--
Ville Syrjälä
Intel
^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH 08/11] drm/i915/prefill: Introduce intel_prefill.c
2025-10-08 18:25 [RFC][PATCH 00/11] drm/i915/prefill: Introduce helpers for prefill latency calculations Ville Syrjala
` (6 preceding siblings ...)
2025-10-08 18:25 ` [RFC][PATCH 07/11] drm/i915: Introduce intel_compute_global_watermarks_late() Ville Syrjala
@ 2025-10-08 18:25 ` Ville Syrjala
2025-10-13 18:42 ` Shankar, Uma
2025-10-08 18:25 ` [RFC][PATCH 09/11] drm/i915/wm: Use intel_prefill Ville Syrjala
` (2 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Ville Syrjala @ 2025-10-08 18:25 UTC (permalink / raw)
To: intel-gfx; +Cc: intel-xe
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Add a new helper thingy to deal with the pipe prefill latency.
We get three potentially useful thigns out of this:
- intel_prefill_vblank_too_short() used for checking the
actual vblank/guardband length
- intel_prefill_min_guardband() to calculate a suitable guardband
size based on some worst case scaling/etc. estimates
- intel_prefill_min_cdclk() used to calculate a minimum cdclk
freqency required for very small vblank lengths (in case the
otherwise compute minimum cdclk doesn't result in fast enough
prefill).
The internal arithmetic is done terms of scanlines using .16
binary fixed point represantion.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/Makefile | 1 +
drivers/gpu/drm/i915/display/intel_prefill.c | 167 +++++++++++++++++++
drivers/gpu/drm/i915/display/intel_prefill.h | 48 ++++++
drivers/gpu/drm/xe/Makefile | 1 +
4 files changed, 217 insertions(+)
create mode 100644 drivers/gpu/drm/i915/display/intel_prefill.c
create mode 100644 drivers/gpu/drm/i915/display/intel_prefill.h
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 78a45a6681df..088a6c6cd138 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -351,6 +351,7 @@ i915-y += \
display/intel_panel.o \
display/intel_pfit.o \
display/intel_pps.o \
+ display/intel_prefill.o \
display/intel_qp_tables.o \
display/intel_sdvo.o \
display/intel_snps_hdmi_pll.o \
diff --git a/drivers/gpu/drm/i915/display/intel_prefill.c b/drivers/gpu/drm/i915/display/intel_prefill.c
new file mode 100644
index 000000000000..8b9c14e5c505
--- /dev/null
+++ b/drivers/gpu/drm/i915/display/intel_prefill.c
@@ -0,0 +1,167 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include <linux/debugfs.h>
+
+#include <drm/drm_print.h>
+
+#include "intel_cdclk.h"
+#include "intel_display_core.h"
+#include "intel_display_types.h"
+#include "intel_prefill.h"
+#include "intel_vdsc.h"
+#include "skl_scaler.h"
+#include "skl_watermark.h"
+
+static unsigned int prefill_usecs_to_lines(const struct intel_crtc_state *crtc_state, unsigned int usecs)
+{
+ const struct drm_display_mode *pipe_mode = &crtc_state->hw.pipe_mode;
+
+ return DIV_ROUND_UP_ULL(mul_u32_u32(pipe_mode->crtc_clock, usecs << 16),
+ pipe_mode->crtc_htotal * 1000);
+}
+
+static void _intel_prefill_init(struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state)
+{
+ ctx->prefill.fixed = crtc_state->framestart_delay;
+
+ /* 20 usec for translation walks/etc. */
+ ctx->prefill.fixed += prefill_usecs_to_lines(crtc_state, 20);
+
+ ctx->prefill.dsc = intel_vdsc_prefill_lines(crtc_state);
+
+ ctx->prefill.full = 0;
+}
+
+static void intel_prefill_init_nocdclk_worst(struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state)
+{
+ _intel_prefill_init(ctx, crtc_state);
+
+ ctx->prefill.wm0 = skl_wm0_prefill_lines_worst(crtc_state);
+ ctx->prefill.scaler_1st = skl_scaler_1st_prefill_lines_worst(crtc_state);
+ ctx->prefill.scaler_2nd = skl_scaler_2nd_prefill_lines_worst(crtc_state);
+
+ ctx->adj.scaler_1st = skl_scaler_1st_prefill_adjustment_worst(crtc_state);
+ ctx->adj.scaler_2nd = skl_scaler_2nd_prefill_adjustment_worst(crtc_state);
+}
+
+static void intel_prefill_init_nocdclk(struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state)
+{
+ _intel_prefill_init(ctx, crtc_state);
+
+ ctx->prefill.wm0 = skl_wm0_prefill_lines(crtc_state);
+ ctx->prefill.scaler_1st = skl_scaler_1st_prefill_lines(crtc_state);
+ ctx->prefill.scaler_2nd = skl_scaler_2nd_prefill_lines(crtc_state);
+
+ ctx->adj.scaler_1st = skl_scaler_1st_prefill_adjustment(crtc_state);
+ ctx->adj.scaler_2nd = skl_scaler_2nd_prefill_adjustment(crtc_state);
+}
+
+static unsigned int prefill_adjust(unsigned int value, unsigned int factor)
+{
+ return DIV_ROUND_UP_ULL(mul_u32_u32(value, factor), 0x10000);
+}
+
+static unsigned int prefill_lines_nocdclk(const struct intel_prefill_ctx *ctx)
+{
+ unsigned int prefill = 0;
+
+ prefill += ctx->prefill.dsc;
+ prefill = prefill_adjust(prefill, ctx->adj.scaler_2nd);
+
+ prefill += ctx->prefill.scaler_2nd;
+ prefill = prefill_adjust(prefill, ctx->adj.scaler_1st);
+
+ prefill += ctx->prefill.scaler_1st;
+ prefill += ctx->prefill.wm0;
+
+ return prefill;
+}
+
+static unsigned int prefill_lines_cdclk(const struct intel_prefill_ctx *ctx)
+{
+ return prefill_adjust(prefill_lines_nocdclk(ctx), ctx->adj.cdclk);
+}
+
+static unsigned int prefill_lines_full(const struct intel_prefill_ctx *ctx)
+{
+ return ctx->prefill.fixed + prefill_lines_cdclk(ctx);
+}
+
+void intel_prefill_init_worst(struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state)
+{
+ intel_prefill_init_nocdclk_worst(ctx, crtc_state);
+
+ ctx->adj.cdclk = intel_cdclk_prefill_adjustment_worst(crtc_state);
+
+ ctx->prefill.full = prefill_lines_full(ctx);
+}
+
+void intel_prefill_init(struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state,
+ const struct intel_cdclk_state *cdclk_state)
+{
+ intel_prefill_init_nocdclk(ctx, crtc_state);
+
+ ctx->adj.cdclk = intel_cdclk_prefill_adjustment(crtc_state, cdclk_state);
+
+ ctx->prefill.full = prefill_lines_full(ctx);
+}
+
+static unsigned int prefill_lines_with_latency(const struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state,
+ unsigned int latency_us)
+{
+ return ctx->prefill.full + prefill_usecs_to_lines(crtc_state, latency_us);
+}
+
+int intel_prefill_min_guardband(const struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state,
+ unsigned int latency_us)
+{
+ unsigned int prefill = prefill_lines_with_latency(ctx, crtc_state, latency_us);
+
+ return DIV_ROUND_UP(prefill, 0x10000);
+}
+
+static int intel_guardband(const struct intel_crtc_state *crtc_state)
+{
+ const struct drm_display_mode *pipe_mode = &crtc_state->hw.pipe_mode;
+
+ if (crtc_state->vrr.enable)
+ return crtc_state->vrr.guardband;
+ else
+ return pipe_mode->crtc_vblank_end - pipe_mode->crtc_vblank_start;
+}
+
+static int intel_prefill_guardband(const struct intel_crtc_state *crtc_state)
+{
+ return intel_guardband(crtc_state) << 16;
+}
+
+bool intel_prefill_vblank_too_short(const struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state,
+ unsigned int latency_us)
+{
+ unsigned int guardband = intel_prefill_guardband(crtc_state);
+ unsigned int prefill = prefill_lines_with_latency(ctx, crtc_state, latency_us);
+
+ return guardband < prefill;
+}
+
+int intel_prefill_min_cdclk(const struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state)
+{
+ unsigned int prefill_unadjusted = prefill_lines_nocdclk(ctx);
+ unsigned int prefill_available = intel_prefill_guardband(crtc_state) -
+ ctx->prefill.fixed;
+
+ return intel_cdclk_min_cdclk_for_prefill(crtc_state, prefill_unadjusted,
+ prefill_available);
+}
diff --git a/drivers/gpu/drm/i915/display/intel_prefill.h b/drivers/gpu/drm/i915/display/intel_prefill.h
new file mode 100644
index 000000000000..0f07660261dc
--- /dev/null
+++ b/drivers/gpu/drm/i915/display/intel_prefill.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef __INTEL_PREFILL_H__
+#define __INTEL_PREFILL_H__
+
+#include <linux/types.h>
+
+struct intel_cdclk_state;
+struct intel_crtc_state;
+
+struct intel_prefill_ctx {
+ /* .16 scanlines */
+ struct {
+ unsigned int fixed;
+ unsigned int wm0;
+ unsigned int scaler_1st;
+ unsigned int scaler_2nd;
+ unsigned int dsc;
+ unsigned int full;
+ } prefill;
+
+ /* .16 adjustment factors */
+ struct {
+ unsigned int cdclk;
+ unsigned int scaler_1st;
+ unsigned int scaler_2nd;
+ } adj;
+};
+
+void intel_prefill_init_worst(struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state);
+void intel_prefill_init(struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state,
+ const struct intel_cdclk_state *cdclk_state);
+
+bool intel_prefill_vblank_too_short(const struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state,
+ unsigned int latency_us);
+int intel_prefill_min_guardband(const struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state,
+ unsigned int latency_us);
+int intel_prefill_min_cdclk(const struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state);
+
+#endif /* __INTEL_PREFILL_H__ */
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 84321fad3265..1be020cc417d 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -300,6 +300,7 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
i915-display/intel_pmdemand.o \
i915-display/intel_pch.o \
i915-display/intel_pps.o \
+ i915-display/intel_prefill.o \
i915-display/intel_psr.o \
i915-display/intel_qp_tables.o \
i915-display/intel_quirks.o \
--
2.49.1
^ permalink raw reply related [flat|nested] 26+ messages in thread* RE: [RFC][PATCH 08/11] drm/i915/prefill: Introduce intel_prefill.c
2025-10-08 18:25 ` [RFC][PATCH 08/11] drm/i915/prefill: Introduce intel_prefill.c Ville Syrjala
@ 2025-10-13 18:42 ` Shankar, Uma
2025-10-13 21:37 ` Ville Syrjälä
0 siblings, 1 reply; 26+ messages in thread
From: Shankar, Uma @ 2025-10-13 18:42 UTC (permalink / raw)
To: Ville Syrjala, intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> Syrjala
> Sent: Wednesday, October 8, 2025 11:56 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org
> Subject: [RFC][PATCH 08/11] drm/i915/prefill: Introduce intel_prefill.c
>
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Add a new helper thingy to deal with the pipe prefill latency.
>
> We get three potentially useful thigns out of this:
> - intel_prefill_vblank_too_short() used for checking the
> actual vblank/guardband length
> - intel_prefill_min_guardband() to calculate a suitable guardband
> size based on some worst case scaling/etc. estimates
> - intel_prefill_min_cdclk() used to calculate a minimum cdclk
> freqency required for very small vblank lengths (in case the
> otherwise compute minimum cdclk doesn't result in fast enough
> prefill).
>
> The internal arithmetic is done terms of scanlines using .16 binary fixed point
> represantion.
Nit: Typo in representation.
Only thing maybe left is the SDP related latency, which also can be planned for worst case.
Looks Good to me.
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/Makefile | 1 +
> drivers/gpu/drm/i915/display/intel_prefill.c | 167 +++++++++++++++++++
> drivers/gpu/drm/i915/display/intel_prefill.h | 48 ++++++
> drivers/gpu/drm/xe/Makefile | 1 +
> 4 files changed, 217 insertions(+)
> create mode 100644 drivers/gpu/drm/i915/display/intel_prefill.c
> create mode 100644 drivers/gpu/drm/i915/display/intel_prefill.h
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index
> 78a45a6681df..088a6c6cd138 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -351,6 +351,7 @@ i915-y += \
> display/intel_panel.o \
> display/intel_pfit.o \
> display/intel_pps.o \
> + display/intel_prefill.o \
> display/intel_qp_tables.o \
> display/intel_sdvo.o \
> display/intel_snps_hdmi_pll.o \
> diff --git a/drivers/gpu/drm/i915/display/intel_prefill.c
> b/drivers/gpu/drm/i915/display/intel_prefill.c
> new file mode 100644
> index 000000000000..8b9c14e5c505
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/display/intel_prefill.c
> @@ -0,0 +1,167 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#include <linux/debugfs.h>
> +
> +#include <drm/drm_print.h>
> +
> +#include "intel_cdclk.h"
> +#include "intel_display_core.h"
> +#include "intel_display_types.h"
> +#include "intel_prefill.h"
> +#include "intel_vdsc.h"
> +#include "skl_scaler.h"
> +#include "skl_watermark.h"
> +
> +static unsigned int prefill_usecs_to_lines(const struct
> +intel_crtc_state *crtc_state, unsigned int usecs) {
> + const struct drm_display_mode *pipe_mode = &crtc_state-
> >hw.pipe_mode;
> +
> + return DIV_ROUND_UP_ULL(mul_u32_u32(pipe_mode->crtc_clock,
> usecs << 16),
> + pipe_mode->crtc_htotal * 1000);
> +}
> +
> +static void _intel_prefill_init(struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state *crtc_state) {
> + ctx->prefill.fixed = crtc_state->framestart_delay;
> +
> + /* 20 usec for translation walks/etc. */
> + ctx->prefill.fixed += prefill_usecs_to_lines(crtc_state, 20);
> +
> + ctx->prefill.dsc = intel_vdsc_prefill_lines(crtc_state);
> +
> + ctx->prefill.full = 0;
> +}
> +
> +static void intel_prefill_init_nocdclk_worst(struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state
> *crtc_state) {
> + _intel_prefill_init(ctx, crtc_state);
> +
> + ctx->prefill.wm0 = skl_wm0_prefill_lines_worst(crtc_state);
> + ctx->prefill.scaler_1st = skl_scaler_1st_prefill_lines_worst(crtc_state);
> + ctx->prefill.scaler_2nd =
> +skl_scaler_2nd_prefill_lines_worst(crtc_state);
> +
> + ctx->adj.scaler_1st = skl_scaler_1st_prefill_adjustment_worst(crtc_state);
> + ctx->adj.scaler_2nd =
> +skl_scaler_2nd_prefill_adjustment_worst(crtc_state);
> +}
> +
> +static void intel_prefill_init_nocdclk(struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state *crtc_state) {
> + _intel_prefill_init(ctx, crtc_state);
> +
> + ctx->prefill.wm0 = skl_wm0_prefill_lines(crtc_state);
> + ctx->prefill.scaler_1st = skl_scaler_1st_prefill_lines(crtc_state);
> + ctx->prefill.scaler_2nd = skl_scaler_2nd_prefill_lines(crtc_state);
> +
> + ctx->adj.scaler_1st = skl_scaler_1st_prefill_adjustment(crtc_state);
> + ctx->adj.scaler_2nd = skl_scaler_2nd_prefill_adjustment(crtc_state);
> +}
> +
> +static unsigned int prefill_adjust(unsigned int value, unsigned int
> +factor) {
> + return DIV_ROUND_UP_ULL(mul_u32_u32(value, factor), 0x10000); }
> +
> +static unsigned int prefill_lines_nocdclk(const struct
> +intel_prefill_ctx *ctx) {
> + unsigned int prefill = 0;
> +
> + prefill += ctx->prefill.dsc;
> + prefill = prefill_adjust(prefill, ctx->adj.scaler_2nd);
> +
> + prefill += ctx->prefill.scaler_2nd;
> + prefill = prefill_adjust(prefill, ctx->adj.scaler_1st);
> +
> + prefill += ctx->prefill.scaler_1st;
> + prefill += ctx->prefill.wm0;
> +
> + return prefill;
> +}
> +
> +static unsigned int prefill_lines_cdclk(const struct intel_prefill_ctx
> +*ctx) {
> + return prefill_adjust(prefill_lines_nocdclk(ctx), ctx->adj.cdclk); }
> +
> +static unsigned int prefill_lines_full(const struct intel_prefill_ctx
> +*ctx) {
> + return ctx->prefill.fixed + prefill_lines_cdclk(ctx); }
> +
> +void intel_prefill_init_worst(struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state *crtc_state) {
> + intel_prefill_init_nocdclk_worst(ctx, crtc_state);
> +
> + ctx->adj.cdclk = intel_cdclk_prefill_adjustment_worst(crtc_state);
> +
> + ctx->prefill.full = prefill_lines_full(ctx); }
> +
> +void intel_prefill_init(struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state *crtc_state,
> + const struct intel_cdclk_state *cdclk_state) {
> + intel_prefill_init_nocdclk(ctx, crtc_state);
> +
> + ctx->adj.cdclk = intel_cdclk_prefill_adjustment(crtc_state,
> +cdclk_state);
> +
> + ctx->prefill.full = prefill_lines_full(ctx); }
> +
> +static unsigned int prefill_lines_with_latency(const struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state
> *crtc_state,
> + unsigned int latency_us)
> +{
> + return ctx->prefill.full + prefill_usecs_to_lines(crtc_state,
> +latency_us); }
> +
> +int intel_prefill_min_guardband(const struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state *crtc_state,
> + unsigned int latency_us)
> +{
> + unsigned int prefill = prefill_lines_with_latency(ctx, crtc_state,
> +latency_us);
> +
> + return DIV_ROUND_UP(prefill, 0x10000); }
> +
> +static int intel_guardband(const struct intel_crtc_state *crtc_state) {
> + const struct drm_display_mode *pipe_mode = &crtc_state-
> >hw.pipe_mode;
> +
> + if (crtc_state->vrr.enable)
> + return crtc_state->vrr.guardband;
> + else
> + return pipe_mode->crtc_vblank_end - pipe_mode-
> >crtc_vblank_start; }
> +
> +static int intel_prefill_guardband(const struct intel_crtc_state
> +*crtc_state) {
> + return intel_guardband(crtc_state) << 16; }
> +
> +bool intel_prefill_vblank_too_short(const struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state *crtc_state,
> + unsigned int latency_us)
> +{
> + unsigned int guardband = intel_prefill_guardband(crtc_state);
> + unsigned int prefill = prefill_lines_with_latency(ctx, crtc_state,
> +latency_us);
> +
> + return guardband < prefill;
> +}
> +
> +int intel_prefill_min_cdclk(const struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state *crtc_state) {
> + unsigned int prefill_unadjusted = prefill_lines_nocdclk(ctx);
> + unsigned int prefill_available = intel_prefill_guardband(crtc_state) -
> + ctx->prefill.fixed;
> +
> + return intel_cdclk_min_cdclk_for_prefill(crtc_state, prefill_unadjusted,
> + prefill_available);
> +}
> diff --git a/drivers/gpu/drm/i915/display/intel_prefill.h
> b/drivers/gpu/drm/i915/display/intel_prefill.h
> new file mode 100644
> index 000000000000..0f07660261dc
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/display/intel_prefill.h
> @@ -0,0 +1,48 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef __INTEL_PREFILL_H__
> +#define __INTEL_PREFILL_H__
> +
> +#include <linux/types.h>
> +
> +struct intel_cdclk_state;
> +struct intel_crtc_state;
> +
> +struct intel_prefill_ctx {
> + /* .16 scanlines */
> + struct {
> + unsigned int fixed;
> + unsigned int wm0;
> + unsigned int scaler_1st;
> + unsigned int scaler_2nd;
> + unsigned int dsc;
> + unsigned int full;
> + } prefill;
> +
> + /* .16 adjustment factors */
> + struct {
> + unsigned int cdclk;
> + unsigned int scaler_1st;
> + unsigned int scaler_2nd;
> + } adj;
> +};
> +
> +void intel_prefill_init_worst(struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state *crtc_state); void
> +intel_prefill_init(struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state *crtc_state,
> + const struct intel_cdclk_state *cdclk_state);
> +
> +bool intel_prefill_vblank_too_short(const struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state *crtc_state,
> + unsigned int latency_us);
> +int intel_prefill_min_guardband(const struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state *crtc_state,
> + unsigned int latency_us);
> +int intel_prefill_min_cdclk(const struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state *crtc_state);
> +
> +#endif /* __INTEL_PREFILL_H__ */
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile index
> 84321fad3265..1be020cc417d 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -300,6 +300,7 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
> i915-display/intel_pmdemand.o \
> i915-display/intel_pch.o \
> i915-display/intel_pps.o \
> + i915-display/intel_prefill.o \
> i915-display/intel_psr.o \
> i915-display/intel_qp_tables.o \
> i915-display/intel_quirks.o \
> --
> 2.49.1
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: [RFC][PATCH 08/11] drm/i915/prefill: Introduce intel_prefill.c
2025-10-13 18:42 ` Shankar, Uma
@ 2025-10-13 21:37 ` Ville Syrjälä
0 siblings, 0 replies; 26+ messages in thread
From: Ville Syrjälä @ 2025-10-13 21:37 UTC (permalink / raw)
To: Shankar, Uma
Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org
On Mon, Oct 13, 2025 at 06:42:09PM +0000, Shankar, Uma wrote:
>
>
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> > Syrjala
> > Sent: Wednesday, October 8, 2025 11:56 PM
> > To: intel-gfx@lists.freedesktop.org
> > Cc: intel-xe@lists.freedesktop.org
> > Subject: [RFC][PATCH 08/11] drm/i915/prefill: Introduce intel_prefill.c
> >
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > Add a new helper thingy to deal with the pipe prefill latency.
> >
> > We get three potentially useful thigns out of this:
> > - intel_prefill_vblank_too_short() used for checking the
> > actual vblank/guardband length
> > - intel_prefill_min_guardband() to calculate a suitable guardband
> > size based on some worst case scaling/etc. estimates
> > - intel_prefill_min_cdclk() used to calculate a minimum cdclk
> > freqency required for very small vblank lengths (in case the
> > otherwise compute minimum cdclk doesn't result in fast enough
> > prefill).
> >
> > The internal arithmetic is done terms of scanlines using .16 binary fixed point
> > represantion.
>
> Nit: Typo in representation.
>
> Only thing maybe left is the SDP related latency, which also can be planned for worst case.
Yeah, though SDP (and PSR) are separate from the pre-fill. So
they should be handled by the higher level guardband calculation
code, not by intel_prefill.
>
> Looks Good to me.
> Reviewed-by: Uma Shankar <uma.shankar@intel.com>
>
> >
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/Makefile | 1 +
> > drivers/gpu/drm/i915/display/intel_prefill.c | 167 +++++++++++++++++++
> > drivers/gpu/drm/i915/display/intel_prefill.h | 48 ++++++
> > drivers/gpu/drm/xe/Makefile | 1 +
> > 4 files changed, 217 insertions(+)
> > create mode 100644 drivers/gpu/drm/i915/display/intel_prefill.c
> > create mode 100644 drivers/gpu/drm/i915/display/intel_prefill.h
> >
> > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index
> > 78a45a6681df..088a6c6cd138 100644
> > --- a/drivers/gpu/drm/i915/Makefile
> > +++ b/drivers/gpu/drm/i915/Makefile
> > @@ -351,6 +351,7 @@ i915-y += \
> > display/intel_panel.o \
> > display/intel_pfit.o \
> > display/intel_pps.o \
> > + display/intel_prefill.o \
> > display/intel_qp_tables.o \
> > display/intel_sdvo.o \
> > display/intel_snps_hdmi_pll.o \
> > diff --git a/drivers/gpu/drm/i915/display/intel_prefill.c
> > b/drivers/gpu/drm/i915/display/intel_prefill.c
> > new file mode 100644
> > index 000000000000..8b9c14e5c505
> > --- /dev/null
> > +++ b/drivers/gpu/drm/i915/display/intel_prefill.c
> > @@ -0,0 +1,167 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#include <linux/debugfs.h>
> > +
> > +#include <drm/drm_print.h>
> > +
> > +#include "intel_cdclk.h"
> > +#include "intel_display_core.h"
> > +#include "intel_display_types.h"
> > +#include "intel_prefill.h"
> > +#include "intel_vdsc.h"
> > +#include "skl_scaler.h"
> > +#include "skl_watermark.h"
> > +
> > +static unsigned int prefill_usecs_to_lines(const struct
> > +intel_crtc_state *crtc_state, unsigned int usecs) {
> > + const struct drm_display_mode *pipe_mode = &crtc_state-
> > >hw.pipe_mode;
> > +
> > + return DIV_ROUND_UP_ULL(mul_u32_u32(pipe_mode->crtc_clock,
> > usecs << 16),
> > + pipe_mode->crtc_htotal * 1000);
> > +}
> > +
> > +static void _intel_prefill_init(struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state *crtc_state) {
> > + ctx->prefill.fixed = crtc_state->framestart_delay;
> > +
> > + /* 20 usec for translation walks/etc. */
> > + ctx->prefill.fixed += prefill_usecs_to_lines(crtc_state, 20);
> > +
> > + ctx->prefill.dsc = intel_vdsc_prefill_lines(crtc_state);
> > +
> > + ctx->prefill.full = 0;
> > +}
> > +
> > +static void intel_prefill_init_nocdclk_worst(struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state
> > *crtc_state) {
> > + _intel_prefill_init(ctx, crtc_state);
> > +
> > + ctx->prefill.wm0 = skl_wm0_prefill_lines_worst(crtc_state);
> > + ctx->prefill.scaler_1st = skl_scaler_1st_prefill_lines_worst(crtc_state);
> > + ctx->prefill.scaler_2nd =
> > +skl_scaler_2nd_prefill_lines_worst(crtc_state);
> > +
> > + ctx->adj.scaler_1st = skl_scaler_1st_prefill_adjustment_worst(crtc_state);
> > + ctx->adj.scaler_2nd =
> > +skl_scaler_2nd_prefill_adjustment_worst(crtc_state);
> > +}
> > +
> > +static void intel_prefill_init_nocdclk(struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state *crtc_state) {
> > + _intel_prefill_init(ctx, crtc_state);
> > +
> > + ctx->prefill.wm0 = skl_wm0_prefill_lines(crtc_state);
> > + ctx->prefill.scaler_1st = skl_scaler_1st_prefill_lines(crtc_state);
> > + ctx->prefill.scaler_2nd = skl_scaler_2nd_prefill_lines(crtc_state);
> > +
> > + ctx->adj.scaler_1st = skl_scaler_1st_prefill_adjustment(crtc_state);
> > + ctx->adj.scaler_2nd = skl_scaler_2nd_prefill_adjustment(crtc_state);
> > +}
> > +
> > +static unsigned int prefill_adjust(unsigned int value, unsigned int
> > +factor) {
> > + return DIV_ROUND_UP_ULL(mul_u32_u32(value, factor), 0x10000); }
> > +
> > +static unsigned int prefill_lines_nocdclk(const struct
> > +intel_prefill_ctx *ctx) {
> > + unsigned int prefill = 0;
> > +
> > + prefill += ctx->prefill.dsc;
> > + prefill = prefill_adjust(prefill, ctx->adj.scaler_2nd);
> > +
> > + prefill += ctx->prefill.scaler_2nd;
> > + prefill = prefill_adjust(prefill, ctx->adj.scaler_1st);
> > +
> > + prefill += ctx->prefill.scaler_1st;
> > + prefill += ctx->prefill.wm0;
> > +
> > + return prefill;
> > +}
> > +
> > +static unsigned int prefill_lines_cdclk(const struct intel_prefill_ctx
> > +*ctx) {
> > + return prefill_adjust(prefill_lines_nocdclk(ctx), ctx->adj.cdclk); }
> > +
> > +static unsigned int prefill_lines_full(const struct intel_prefill_ctx
> > +*ctx) {
> > + return ctx->prefill.fixed + prefill_lines_cdclk(ctx); }
> > +
> > +void intel_prefill_init_worst(struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state *crtc_state) {
> > + intel_prefill_init_nocdclk_worst(ctx, crtc_state);
> > +
> > + ctx->adj.cdclk = intel_cdclk_prefill_adjustment_worst(crtc_state);
> > +
> > + ctx->prefill.full = prefill_lines_full(ctx); }
> > +
> > +void intel_prefill_init(struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state *crtc_state,
> > + const struct intel_cdclk_state *cdclk_state) {
> > + intel_prefill_init_nocdclk(ctx, crtc_state);
> > +
> > + ctx->adj.cdclk = intel_cdclk_prefill_adjustment(crtc_state,
> > +cdclk_state);
> > +
> > + ctx->prefill.full = prefill_lines_full(ctx); }
> > +
> > +static unsigned int prefill_lines_with_latency(const struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state
> > *crtc_state,
> > + unsigned int latency_us)
> > +{
> > + return ctx->prefill.full + prefill_usecs_to_lines(crtc_state,
> > +latency_us); }
> > +
> > +int intel_prefill_min_guardband(const struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state *crtc_state,
> > + unsigned int latency_us)
> > +{
> > + unsigned int prefill = prefill_lines_with_latency(ctx, crtc_state,
> > +latency_us);
> > +
> > + return DIV_ROUND_UP(prefill, 0x10000); }
> > +
> > +static int intel_guardband(const struct intel_crtc_state *crtc_state) {
> > + const struct drm_display_mode *pipe_mode = &crtc_state-
> > >hw.pipe_mode;
> > +
> > + if (crtc_state->vrr.enable)
> > + return crtc_state->vrr.guardband;
> > + else
> > + return pipe_mode->crtc_vblank_end - pipe_mode-
> > >crtc_vblank_start; }
> > +
> > +static int intel_prefill_guardband(const struct intel_crtc_state
> > +*crtc_state) {
> > + return intel_guardband(crtc_state) << 16; }
> > +
> > +bool intel_prefill_vblank_too_short(const struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state *crtc_state,
> > + unsigned int latency_us)
> > +{
> > + unsigned int guardband = intel_prefill_guardband(crtc_state);
> > + unsigned int prefill = prefill_lines_with_latency(ctx, crtc_state,
> > +latency_us);
> > +
> > + return guardband < prefill;
> > +}
> > +
> > +int intel_prefill_min_cdclk(const struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state *crtc_state) {
> > + unsigned int prefill_unadjusted = prefill_lines_nocdclk(ctx);
> > + unsigned int prefill_available = intel_prefill_guardband(crtc_state) -
> > + ctx->prefill.fixed;
> > +
> > + return intel_cdclk_min_cdclk_for_prefill(crtc_state, prefill_unadjusted,
> > + prefill_available);
> > +}
> > diff --git a/drivers/gpu/drm/i915/display/intel_prefill.h
> > b/drivers/gpu/drm/i915/display/intel_prefill.h
> > new file mode 100644
> > index 000000000000..0f07660261dc
> > --- /dev/null
> > +++ b/drivers/gpu/drm/i915/display/intel_prefill.h
> > @@ -0,0 +1,48 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#ifndef __INTEL_PREFILL_H__
> > +#define __INTEL_PREFILL_H__
> > +
> > +#include <linux/types.h>
> > +
> > +struct intel_cdclk_state;
> > +struct intel_crtc_state;
> > +
> > +struct intel_prefill_ctx {
> > + /* .16 scanlines */
> > + struct {
> > + unsigned int fixed;
> > + unsigned int wm0;
> > + unsigned int scaler_1st;
> > + unsigned int scaler_2nd;
> > + unsigned int dsc;
> > + unsigned int full;
> > + } prefill;
> > +
> > + /* .16 adjustment factors */
> > + struct {
> > + unsigned int cdclk;
> > + unsigned int scaler_1st;
> > + unsigned int scaler_2nd;
> > + } adj;
> > +};
> > +
> > +void intel_prefill_init_worst(struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state *crtc_state); void
> > +intel_prefill_init(struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state *crtc_state,
> > + const struct intel_cdclk_state *cdclk_state);
> > +
> > +bool intel_prefill_vblank_too_short(const struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state *crtc_state,
> > + unsigned int latency_us);
> > +int intel_prefill_min_guardband(const struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state *crtc_state,
> > + unsigned int latency_us);
> > +int intel_prefill_min_cdclk(const struct intel_prefill_ctx *ctx,
> > + const struct intel_crtc_state *crtc_state);
> > +
> > +#endif /* __INTEL_PREFILL_H__ */
> > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile index
> > 84321fad3265..1be020cc417d 100644
> > --- a/drivers/gpu/drm/xe/Makefile
> > +++ b/drivers/gpu/drm/xe/Makefile
> > @@ -300,6 +300,7 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
> > i915-display/intel_pmdemand.o \
> > i915-display/intel_pch.o \
> > i915-display/intel_pps.o \
> > + i915-display/intel_prefill.o \
> > i915-display/intel_psr.o \
> > i915-display/intel_qp_tables.o \
> > i915-display/intel_quirks.o \
> > --
> > 2.49.1
>
--
Ville Syrjälä
Intel
^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH 09/11] drm/i915/wm: Use intel_prefill
2025-10-08 18:25 [RFC][PATCH 00/11] drm/i915/prefill: Introduce helpers for prefill latency calculations Ville Syrjala
` (7 preceding siblings ...)
2025-10-08 18:25 ` [RFC][PATCH 08/11] drm/i915/prefill: Introduce intel_prefill.c Ville Syrjala
@ 2025-10-08 18:25 ` Ville Syrjala
2025-10-13 18:43 ` Shankar, Uma
2025-10-08 18:25 ` [RFC][PATCH 10/11] drm/i915/prefill: Print the prefill details Ville Syrjala
2025-10-08 18:25 ` [RFC][PATCH 11/11] drm/i915/prefill: Also print out the worst case estimates Ville Syrjala
10 siblings, 1 reply; 26+ messages in thread
From: Ville Syrjala @ 2025-10-08 18:25 UTC (permalink / raw)
To: intel-gfx; +Cc: intel-xe
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Replace the current ad-hoc prefill calculations with intel_prefill.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/display/skl_watermark.c | 141 ++++---------------
1 file changed, 31 insertions(+), 110 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/skl_watermark.c b/drivers/gpu/drm/i915/display/skl_watermark.c
index 5c18fe9a5237..b3e9e2a0dab3 100644
--- a/drivers/gpu/drm/i915/display/skl_watermark.c
+++ b/drivers/gpu/drm/i915/display/skl_watermark.c
@@ -28,6 +28,7 @@
#include "intel_flipq.h"
#include "intel_pcode.h"
#include "intel_plane.h"
+#include "intel_prefill.h"
#include "intel_wm.h"
#include "skl_scaler.h"
#include "skl_universal_plane_regs.h"
@@ -2146,105 +2147,6 @@ static int icl_build_plane_wm(struct intel_crtc_state *crtc_state,
return 0;
}
-static int
-cdclk_prefill_adjustment(const struct intel_crtc_state *crtc_state)
-{
- struct intel_display *display = to_intel_display(crtc_state);
- struct intel_atomic_state *state =
- to_intel_atomic_state(crtc_state->uapi.state);
- const struct intel_cdclk_state *cdclk_state;
-
- cdclk_state = intel_atomic_get_cdclk_state(state);
- if (IS_ERR(cdclk_state)) {
- drm_WARN_ON(display->drm, PTR_ERR(cdclk_state));
- return 1;
- }
-
- return min(1, DIV_ROUND_UP(crtc_state->pixel_rate,
- 2 * intel_cdclk_logical(cdclk_state)));
-}
-
-static int
-dsc_prefill_latency(const struct intel_crtc_state *crtc_state)
-{
- struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
- const struct intel_crtc_scaler_state *scaler_state =
- &crtc_state->scaler_state;
- int linetime = DIV_ROUND_UP(1000 * crtc_state->hw.adjusted_mode.htotal,
- crtc_state->hw.adjusted_mode.clock);
- int num_scaler_users = hweight32(scaler_state->scaler_users);
- int chroma_downscaling_factor =
- crtc_state->output_format == INTEL_OUTPUT_FORMAT_YCBCR420 ? 2 : 1;
- u32 dsc_prefill_latency = 0;
-
- if (!crtc_state->dsc.compression_enable ||
- !num_scaler_users ||
- num_scaler_users > crtc->num_scalers)
- return dsc_prefill_latency;
-
- dsc_prefill_latency = DIV_ROUND_UP(15 * linetime * chroma_downscaling_factor, 10);
-
- for (int i = 0; i < num_scaler_users; i++) {
- u64 hscale_k, vscale_k;
-
- hscale_k = max(1000, mul_u32_u32(scaler_state->scalers[i].hscale, 1000) >> 16);
- vscale_k = max(1000, mul_u32_u32(scaler_state->scalers[i].vscale, 1000) >> 16);
- dsc_prefill_latency = DIV_ROUND_UP_ULL(dsc_prefill_latency * hscale_k * vscale_k,
- 1000000);
- }
-
- dsc_prefill_latency *= cdclk_prefill_adjustment(crtc_state);
-
- return intel_usecs_to_scanlines(&crtc_state->hw.adjusted_mode, dsc_prefill_latency);
-}
-
-static int
-scaler_prefill_latency(const struct intel_crtc_state *crtc_state)
-{
- const struct intel_crtc_scaler_state *scaler_state =
- &crtc_state->scaler_state;
- int num_scaler_users = hweight32(scaler_state->scaler_users);
- int scaler_prefill_latency = 0;
- int linetime = DIV_ROUND_UP(1000 * crtc_state->hw.adjusted_mode.htotal,
- crtc_state->hw.adjusted_mode.clock);
-
- if (!num_scaler_users)
- return scaler_prefill_latency;
-
- scaler_prefill_latency = 4 * linetime;
-
- if (num_scaler_users > 1) {
- u64 hscale_k = max(1000, mul_u32_u32(scaler_state->scalers[0].hscale, 1000) >> 16);
- u64 vscale_k = max(1000, mul_u32_u32(scaler_state->scalers[0].vscale, 1000) >> 16);
- int chroma_downscaling_factor =
- crtc_state->output_format == INTEL_OUTPUT_FORMAT_YCBCR420 ? 2 : 1;
- int latency;
-
- latency = DIV_ROUND_UP_ULL((4 * linetime * hscale_k * vscale_k *
- chroma_downscaling_factor), 1000000);
- scaler_prefill_latency += latency;
- }
-
- scaler_prefill_latency *= cdclk_prefill_adjustment(crtc_state);
-
- return intel_usecs_to_scanlines(&crtc_state->hw.adjusted_mode, scaler_prefill_latency);
-}
-
-static bool
-skl_is_vblank_too_short(const struct intel_crtc_state *crtc_state,
- int wm0_lines, int latency)
-{
- const struct drm_display_mode *adjusted_mode =
- &crtc_state->hw.adjusted_mode;
-
- return crtc_state->framestart_delay +
- intel_usecs_to_scanlines(adjusted_mode, latency) +
- scaler_prefill_latency(crtc_state) +
- dsc_prefill_latency(crtc_state) +
- wm0_lines >
- adjusted_mode->crtc_vtotal - adjusted_mode->crtc_vblank_start;
-}
-
unsigned int skl_wm0_prefill_lines_worst(const struct intel_crtc_state *crtc_state)
{
struct intel_display *display = to_intel_display(crtc_state);
@@ -2325,9 +2227,10 @@ unsigned int skl_wm0_prefill_lines(const struct intel_crtc_state *crtc_state)
* the scenario to apply Wa_16025596647.
*/
static int skl_max_wm_level_for_vblank(struct intel_crtc_state *crtc_state,
- int wm0_lines)
+ const struct intel_prefill_ctx *ctx)
{
struct intel_display *display = to_intel_display(crtc_state);
+ struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
int level;
for (level = display->wm.num_levels - 1; level >= 0; level--) {
@@ -2342,25 +2245,36 @@ static int skl_max_wm_level_for_vblank(struct intel_crtc_state *crtc_state,
if (level == 0)
latency = 0;
- if (!skl_is_vblank_too_short(crtc_state, wm0_lines, latency))
+ if (!intel_prefill_vblank_too_short(ctx, crtc_state, latency))
return level;
}
+ drm_dbg_kms(display->drm, "[CRTC:%d:%s] Not enough time in vblank for prefill\n",
+ crtc->base.base.id, crtc->base.name);
+
return -EINVAL;
}
-static int skl_wm_check_vblank(struct intel_crtc_state *crtc_state)
+static int skl_wm_check_vblank(struct intel_atomic_state *state,
+ struct intel_crtc *crtc)
{
- struct intel_display *display = to_intel_display(crtc_state);
- struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
- int wm0_lines, level;
+ struct intel_display *display = to_intel_display(state);
+ struct intel_crtc_state *crtc_state =
+ intel_atomic_get_new_crtc_state(state, crtc);
+ struct intel_cdclk_state *cdclk_state;
+ struct intel_prefill_ctx ctx;
+ int level;
if (!crtc_state->hw.active)
return 0;
- wm0_lines = skl_max_wm0_lines(crtc_state);
+ cdclk_state = intel_atomic_get_cdclk_state(state);
+ if (IS_ERR(cdclk_state))
+ return PTR_ERR(cdclk_state);
- level = skl_max_wm_level_for_vblank(crtc_state, wm0_lines);
+ intel_prefill_init(&ctx, crtc_state, cdclk_state);
+
+ level = skl_max_wm_level_for_vblank(crtc_state, &ctx);
if (level < 0)
return level;
@@ -2370,6 +2284,13 @@ static int skl_wm_check_vblank(struct intel_crtc_state *crtc_state)
*/
crtc_state->wm_level_disabled = level < display->wm.num_levels - 1;
+ /*
+ * TODO: assert that we are in fact using the maximum guardband
+ * if we end up disabling any WM levels here. Otherwise we clearly
+ * failed in using a realistic worst case prefill estimate during
+ * when determining the guardband size.
+ */
+
for (level++; level < display->wm.num_levels; level++) {
enum plane_id plane_id;
@@ -2388,8 +2309,8 @@ static int skl_wm_check_vblank(struct intel_crtc_state *crtc_state)
if (DISPLAY_VER(display) >= 12 &&
display->sagv.block_time_us &&
- skl_is_vblank_too_short(crtc_state, wm0_lines,
- display->sagv.block_time_us)) {
+ intel_prefill_vblank_too_short(&ctx, crtc_state,
+ display->sagv.block_time_us)) {
enum plane_id plane_id;
for_each_plane_id_on_crtc(crtc, plane_id) {
@@ -3052,7 +2973,7 @@ skl_compute_wm_late(struct intel_atomic_state *state)
* intel_modeset_calc_cdclk() has been done. Scalers are still
* completely broken wrt. skl_wm_check_vblank().
*/
- ret = skl_wm_check_vblank(new_crtc_state);
+ ret = skl_wm_check_vblank(state, crtc);
if (ret)
return ret;
--
2.49.1
^ permalink raw reply related [flat|nested] 26+ messages in thread* RE: [RFC][PATCH 09/11] drm/i915/wm: Use intel_prefill
2025-10-08 18:25 ` [RFC][PATCH 09/11] drm/i915/wm: Use intel_prefill Ville Syrjala
@ 2025-10-13 18:43 ` Shankar, Uma
0 siblings, 0 replies; 26+ messages in thread
From: Shankar, Uma @ 2025-10-13 18:43 UTC (permalink / raw)
To: Ville Syrjala, intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> Syrjala
> Sent: Wednesday, October 8, 2025 11:56 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org
> Subject: [RFC][PATCH 09/11] drm/i915/wm: Use intel_prefill
>
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Replace the current ad-hoc prefill calculations with intel_prefill.
Looks Good to me.
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/display/skl_watermark.c | 141 ++++---------------
> 1 file changed, 31 insertions(+), 110 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/skl_watermark.c
> b/drivers/gpu/drm/i915/display/skl_watermark.c
> index 5c18fe9a5237..b3e9e2a0dab3 100644
> --- a/drivers/gpu/drm/i915/display/skl_watermark.c
> +++ b/drivers/gpu/drm/i915/display/skl_watermark.c
> @@ -28,6 +28,7 @@
> #include "intel_flipq.h"
> #include "intel_pcode.h"
> #include "intel_plane.h"
> +#include "intel_prefill.h"
> #include "intel_wm.h"
> #include "skl_scaler.h"
> #include "skl_universal_plane_regs.h"
> @@ -2146,105 +2147,6 @@ static int icl_build_plane_wm(struct intel_crtc_state
> *crtc_state,
> return 0;
> }
>
> -static int
> -cdclk_prefill_adjustment(const struct intel_crtc_state *crtc_state) -{
> - struct intel_display *display = to_intel_display(crtc_state);
> - struct intel_atomic_state *state =
> - to_intel_atomic_state(crtc_state->uapi.state);
> - const struct intel_cdclk_state *cdclk_state;
> -
> - cdclk_state = intel_atomic_get_cdclk_state(state);
> - if (IS_ERR(cdclk_state)) {
> - drm_WARN_ON(display->drm, PTR_ERR(cdclk_state));
> - return 1;
> - }
> -
> - return min(1, DIV_ROUND_UP(crtc_state->pixel_rate,
> - 2 * intel_cdclk_logical(cdclk_state)));
> -}
> -
> -static int
> -dsc_prefill_latency(const struct intel_crtc_state *crtc_state) -{
> - struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> - const struct intel_crtc_scaler_state *scaler_state =
> - &crtc_state->scaler_state;
> - int linetime = DIV_ROUND_UP(1000 * crtc_state-
> >hw.adjusted_mode.htotal,
> - crtc_state->hw.adjusted_mode.clock);
> - int num_scaler_users = hweight32(scaler_state->scaler_users);
> - int chroma_downscaling_factor =
> - crtc_state->output_format ==
> INTEL_OUTPUT_FORMAT_YCBCR420 ? 2 : 1;
> - u32 dsc_prefill_latency = 0;
> -
> - if (!crtc_state->dsc.compression_enable ||
> - !num_scaler_users ||
> - num_scaler_users > crtc->num_scalers)
> - return dsc_prefill_latency;
> -
> - dsc_prefill_latency = DIV_ROUND_UP(15 * linetime *
> chroma_downscaling_factor, 10);
> -
> - for (int i = 0; i < num_scaler_users; i++) {
> - u64 hscale_k, vscale_k;
> -
> - hscale_k = max(1000, mul_u32_u32(scaler_state-
> >scalers[i].hscale, 1000) >> 16);
> - vscale_k = max(1000, mul_u32_u32(scaler_state-
> >scalers[i].vscale, 1000) >> 16);
> - dsc_prefill_latency = DIV_ROUND_UP_ULL(dsc_prefill_latency *
> hscale_k * vscale_k,
> - 1000000);
> - }
> -
> - dsc_prefill_latency *= cdclk_prefill_adjustment(crtc_state);
> -
> - return intel_usecs_to_scanlines(&crtc_state->hw.adjusted_mode,
> dsc_prefill_latency);
> -}
> -
> -static int
> -scaler_prefill_latency(const struct intel_crtc_state *crtc_state) -{
> - const struct intel_crtc_scaler_state *scaler_state =
> - &crtc_state->scaler_state;
> - int num_scaler_users = hweight32(scaler_state->scaler_users);
> - int scaler_prefill_latency = 0;
> - int linetime = DIV_ROUND_UP(1000 * crtc_state-
> >hw.adjusted_mode.htotal,
> - crtc_state->hw.adjusted_mode.clock);
> -
> - if (!num_scaler_users)
> - return scaler_prefill_latency;
> -
> - scaler_prefill_latency = 4 * linetime;
> -
> - if (num_scaler_users > 1) {
> - u64 hscale_k = max(1000, mul_u32_u32(scaler_state-
> >scalers[0].hscale, 1000) >> 16);
> - u64 vscale_k = max(1000, mul_u32_u32(scaler_state-
> >scalers[0].vscale, 1000) >> 16);
> - int chroma_downscaling_factor =
> - crtc_state->output_format ==
> INTEL_OUTPUT_FORMAT_YCBCR420 ? 2 : 1;
> - int latency;
> -
> - latency = DIV_ROUND_UP_ULL((4 * linetime * hscale_k *
> vscale_k *
> - chroma_downscaling_factor),
> 1000000);
> - scaler_prefill_latency += latency;
> - }
> -
> - scaler_prefill_latency *= cdclk_prefill_adjustment(crtc_state);
> -
> - return intel_usecs_to_scanlines(&crtc_state->hw.adjusted_mode,
> scaler_prefill_latency);
> -}
> -
> -static bool
> -skl_is_vblank_too_short(const struct intel_crtc_state *crtc_state,
> - int wm0_lines, int latency)
> -{
> - const struct drm_display_mode *adjusted_mode =
> - &crtc_state->hw.adjusted_mode;
> -
> - return crtc_state->framestart_delay +
> - intel_usecs_to_scanlines(adjusted_mode, latency) +
> - scaler_prefill_latency(crtc_state) +
> - dsc_prefill_latency(crtc_state) +
> - wm0_lines >
> - adjusted_mode->crtc_vtotal - adjusted_mode->crtc_vblank_start;
> -}
> -
> unsigned int skl_wm0_prefill_lines_worst(const struct intel_crtc_state
> *crtc_state) {
> struct intel_display *display = to_intel_display(crtc_state); @@ -2325,9
> +2227,10 @@ unsigned int skl_wm0_prefill_lines(const struct intel_crtc_state
> *crtc_state)
> * the scenario to apply Wa_16025596647.
> */
> static int skl_max_wm_level_for_vblank(struct intel_crtc_state *crtc_state,
> - int wm0_lines)
> + const struct intel_prefill_ctx *ctx)
> {
> struct intel_display *display = to_intel_display(crtc_state);
> + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> int level;
>
> for (level = display->wm.num_levels - 1; level >= 0; level--) { @@ -
> 2342,25 +2245,36 @@ static int skl_max_wm_level_for_vblank(struct
> intel_crtc_state *crtc_state,
> if (level == 0)
> latency = 0;
>
> - if (!skl_is_vblank_too_short(crtc_state, wm0_lines, latency))
> + if (!intel_prefill_vblank_too_short(ctx, crtc_state, latency))
> return level;
> }
>
> + drm_dbg_kms(display->drm, "[CRTC:%d:%s] Not enough time in vblank
> for prefill\n",
> + crtc->base.base.id, crtc->base.name);
> +
> return -EINVAL;
> }
>
> -static int skl_wm_check_vblank(struct intel_crtc_state *crtc_state)
> +static int skl_wm_check_vblank(struct intel_atomic_state *state,
> + struct intel_crtc *crtc)
> {
> - struct intel_display *display = to_intel_display(crtc_state);
> - struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> - int wm0_lines, level;
> + struct intel_display *display = to_intel_display(state);
> + struct intel_crtc_state *crtc_state =
> + intel_atomic_get_new_crtc_state(state, crtc);
> + struct intel_cdclk_state *cdclk_state;
> + struct intel_prefill_ctx ctx;
> + int level;
>
> if (!crtc_state->hw.active)
> return 0;
>
> - wm0_lines = skl_max_wm0_lines(crtc_state);
> + cdclk_state = intel_atomic_get_cdclk_state(state);
> + if (IS_ERR(cdclk_state))
> + return PTR_ERR(cdclk_state);
>
> - level = skl_max_wm_level_for_vblank(crtc_state, wm0_lines);
> + intel_prefill_init(&ctx, crtc_state, cdclk_state);
> +
> + level = skl_max_wm_level_for_vblank(crtc_state, &ctx);
> if (level < 0)
> return level;
>
> @@ -2370,6 +2284,13 @@ static int skl_wm_check_vblank(struct
> intel_crtc_state *crtc_state)
> */
> crtc_state->wm_level_disabled = level < display->wm.num_levels - 1;
>
> + /*
> + * TODO: assert that we are in fact using the maximum guardband
> + * if we end up disabling any WM levels here. Otherwise we clearly
> + * failed in using a realistic worst case prefill estimate during
> + * when determining the guardband size.
> + */
> +
> for (level++; level < display->wm.num_levels; level++) {
> enum plane_id plane_id;
>
> @@ -2388,8 +2309,8 @@ static int skl_wm_check_vblank(struct intel_crtc_state
> *crtc_state)
>
> if (DISPLAY_VER(display) >= 12 &&
> display->sagv.block_time_us &&
> - skl_is_vblank_too_short(crtc_state, wm0_lines,
> - display->sagv.block_time_us)) {
> + intel_prefill_vblank_too_short(&ctx, crtc_state,
> + display->sagv.block_time_us)) {
> enum plane_id plane_id;
>
> for_each_plane_id_on_crtc(crtc, plane_id) { @@ -3052,7 +2973,7
> @@ skl_compute_wm_late(struct intel_atomic_state *state)
> * intel_modeset_calc_cdclk() has been done. Scalers are still
> * completely broken wrt. skl_wm_check_vblank().
> */
> - ret = skl_wm_check_vblank(new_crtc_state);
> + ret = skl_wm_check_vblank(state, crtc);
> if (ret)
> return ret;
>
> --
> 2.49.1
^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH 10/11] drm/i915/prefill: Print the prefill details
2025-10-08 18:25 [RFC][PATCH 00/11] drm/i915/prefill: Introduce helpers for prefill latency calculations Ville Syrjala
` (8 preceding siblings ...)
2025-10-08 18:25 ` [RFC][PATCH 09/11] drm/i915/wm: Use intel_prefill Ville Syrjala
@ 2025-10-08 18:25 ` Ville Syrjala
2025-10-13 18:45 ` Shankar, Uma
2025-10-08 18:25 ` [RFC][PATCH 11/11] drm/i915/prefill: Also print out the worst case estimates Ville Syrjala
10 siblings, 1 reply; 26+ messages in thread
From: Ville Syrjala @ 2025-10-08 18:25 UTC (permalink / raw)
To: intel-gfx; +Cc: intel-xe
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Print the prefill details to aid in debugging.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_prefill.c | 33 ++++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_prefill.c b/drivers/gpu/drm/i915/display/intel_prefill.c
index 8b9c14e5c505..16ee72d1fc8a 100644
--- a/drivers/gpu/drm/i915/display/intel_prefill.c
+++ b/drivers/gpu/drm/i915/display/intel_prefill.c
@@ -15,6 +15,26 @@
#include "skl_scaler.h"
#include "skl_watermark.h"
+#define FP_FMT "%u.%06u"
+#define FP_ARG(val) (val) >> 16, (((val) & 0xffff) * 15625) >> 10
+
+static void intel_prefill_dump(struct intel_prefill_ctx *ctx,
+ const struct intel_crtc_state *crtc_state)
+{
+ struct intel_display *display = to_intel_display(crtc_state);
+
+ drm_dbg_kms(display->drm, "prefill prefill.fixed: " FP_FMT "\n", FP_ARG(ctx->prefill.fixed));
+ drm_dbg_kms(display->drm, "prefill prefill.wm0: " FP_FMT "\n", FP_ARG(ctx->prefill.wm0));
+ drm_dbg_kms(display->drm, "prefill prefill.scaler_1st: " FP_FMT "\n", FP_ARG(ctx->prefill.scaler_1st));
+ drm_dbg_kms(display->drm, "prefill prefill.scaler_2nd: " FP_FMT "\n", FP_ARG(ctx->prefill.scaler_2nd));
+ drm_dbg_kms(display->drm, "prefill prefill.dsc: " FP_FMT "\n", FP_ARG(ctx->prefill.dsc));
+ drm_dbg_kms(display->drm, "prefill prefill.full: " FP_FMT "\n", FP_ARG(ctx->prefill.full));
+
+ drm_dbg_kms(display->drm, "prefill adj.cdclk: " FP_FMT "\n", FP_ARG(ctx->adj.cdclk));
+ drm_dbg_kms(display->drm, "prefill adj.scaler_1st: " FP_FMT "\n", FP_ARG(ctx->adj.scaler_1st));
+ drm_dbg_kms(display->drm, "prefill adj.scaler_2nd: " FP_FMT "\n", FP_ARG(ctx->adj.scaler_2nd));
+}
+
static unsigned int prefill_usecs_to_lines(const struct intel_crtc_state *crtc_state, unsigned int usecs)
{
const struct drm_display_mode *pipe_mode = &crtc_state->hw.pipe_mode;
@@ -101,6 +121,8 @@ void intel_prefill_init_worst(struct intel_prefill_ctx *ctx,
ctx->adj.cdclk = intel_cdclk_prefill_adjustment_worst(crtc_state);
ctx->prefill.full = prefill_lines_full(ctx);
+
+ intel_prefill_dump(ctx, crtc_state);
}
void intel_prefill_init(struct intel_prefill_ctx *ctx,
@@ -112,6 +134,8 @@ void intel_prefill_init(struct intel_prefill_ctx *ctx,
ctx->adj.cdclk = intel_cdclk_prefill_adjustment(crtc_state, cdclk_state);
ctx->prefill.full = prefill_lines_full(ctx);
+
+ intel_prefill_dump(ctx, crtc_state);
}
static unsigned int prefill_lines_with_latency(const struct intel_prefill_ctx *ctx,
@@ -149,9 +173,18 @@ bool intel_prefill_vblank_too_short(const struct intel_prefill_ctx *ctx,
const struct intel_crtc_state *crtc_state,
unsigned int latency_us)
{
+ struct intel_display *display = to_intel_display(crtc_state);
unsigned int guardband = intel_prefill_guardband(crtc_state);
unsigned int prefill = prefill_lines_with_latency(ctx, crtc_state, latency_us);
+ drm_dbg_kms(display->drm, " prefill (%d): " FP_FMT "\n", latency_us, FP_ARG(prefill));
+ drm_dbg_kms(display->drm, "guardband (%d): " FP_FMT "\n", latency_us, FP_ARG(guardband));
+
+ drm_dbg_kms(display->drm, "min guardband (%d): %d lines\n", latency_us,
+ intel_prefill_min_guardband(ctx, crtc_state, latency_us));
+ drm_dbg_kms(display->drm, "min cdclk (%d): %d khz\n", latency_us,
+ intel_prefill_min_cdclk(ctx, crtc_state));
+
return guardband < prefill;
}
--
2.49.1
^ permalink raw reply related [flat|nested] 26+ messages in thread* RE: [RFC][PATCH 10/11] drm/i915/prefill: Print the prefill details
2025-10-08 18:25 ` [RFC][PATCH 10/11] drm/i915/prefill: Print the prefill details Ville Syrjala
@ 2025-10-13 18:45 ` Shankar, Uma
0 siblings, 0 replies; 26+ messages in thread
From: Shankar, Uma @ 2025-10-13 18:45 UTC (permalink / raw)
To: Ville Syrjala, intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> Syrjala
> Sent: Wednesday, October 8, 2025 11:56 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org
> Subject: [RFC][PATCH 10/11] drm/i915/prefill: Print the prefill details
>
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Print the prefill details to aid in debugging.
Looks Good to me.
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/display/intel_prefill.c | 33 ++++++++++++++++++++
> 1 file changed, 33 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_prefill.c
> b/drivers/gpu/drm/i915/display/intel_prefill.c
> index 8b9c14e5c505..16ee72d1fc8a 100644
> --- a/drivers/gpu/drm/i915/display/intel_prefill.c
> +++ b/drivers/gpu/drm/i915/display/intel_prefill.c
> @@ -15,6 +15,26 @@
> #include "skl_scaler.h"
> #include "skl_watermark.h"
>
> +#define FP_FMT "%u.%06u"
> +#define FP_ARG(val) (val) >> 16, (((val) & 0xffff) * 15625) >> 10
> +
> +static void intel_prefill_dump(struct intel_prefill_ctx *ctx,
> + const struct intel_crtc_state *crtc_state) {
> + struct intel_display *display = to_intel_display(crtc_state);
> +
> + drm_dbg_kms(display->drm, "prefill prefill.fixed: " FP_FMT "\n",
> FP_ARG(ctx->prefill.fixed));
> + drm_dbg_kms(display->drm, "prefill prefill.wm0: " FP_FMT "\n",
> FP_ARG(ctx->prefill.wm0));
> + drm_dbg_kms(display->drm, "prefill prefill.scaler_1st: " FP_FMT "\n",
> FP_ARG(ctx->prefill.scaler_1st));
> + drm_dbg_kms(display->drm, "prefill prefill.scaler_2nd: " FP_FMT "\n",
> FP_ARG(ctx->prefill.scaler_2nd));
> + drm_dbg_kms(display->drm, "prefill prefill.dsc: " FP_FMT "\n",
> FP_ARG(ctx->prefill.dsc));
> + drm_dbg_kms(display->drm, "prefill prefill.full: " FP_FMT "\n",
> FP_ARG(ctx->prefill.full));
> +
> + drm_dbg_kms(display->drm, "prefill adj.cdclk: " FP_FMT "\n",
> FP_ARG(ctx->adj.cdclk));
> + drm_dbg_kms(display->drm, "prefill adj.scaler_1st: " FP_FMT "\n",
> FP_ARG(ctx->adj.scaler_1st));
> + drm_dbg_kms(display->drm, "prefill adj.scaler_2nd: " FP_FMT "\n",
> FP_ARG(ctx->adj.scaler_2nd));
> +}
> +
> static unsigned int prefill_usecs_to_lines(const struct intel_crtc_state *crtc_state,
> unsigned int usecs) {
> const struct drm_display_mode *pipe_mode = &crtc_state-
> >hw.pipe_mode; @@ -101,6 +121,8 @@ void intel_prefill_init_worst(struct
> intel_prefill_ctx *ctx,
> ctx->adj.cdclk = intel_cdclk_prefill_adjustment_worst(crtc_state);
>
> ctx->prefill.full = prefill_lines_full(ctx);
> +
> + intel_prefill_dump(ctx, crtc_state);
> }
>
> void intel_prefill_init(struct intel_prefill_ctx *ctx, @@ -112,6 +134,8 @@ void
> intel_prefill_init(struct intel_prefill_ctx *ctx,
> ctx->adj.cdclk = intel_cdclk_prefill_adjustment(crtc_state, cdclk_state);
>
> ctx->prefill.full = prefill_lines_full(ctx);
> +
> + intel_prefill_dump(ctx, crtc_state);
> }
>
> static unsigned int prefill_lines_with_latency(const struct intel_prefill_ctx *ctx,
> @@ -149,9 +173,18 @@ bool intel_prefill_vblank_too_short(const struct
> intel_prefill_ctx *ctx,
> const struct intel_crtc_state *crtc_state,
> unsigned int latency_us)
> {
> + struct intel_display *display = to_intel_display(crtc_state);
> unsigned int guardband = intel_prefill_guardband(crtc_state);
> unsigned int prefill = prefill_lines_with_latency(ctx, crtc_state,
> latency_us);
>
> + drm_dbg_kms(display->drm, " prefill (%d): " FP_FMT "\n", latency_us,
> FP_ARG(prefill));
> + drm_dbg_kms(display->drm, "guardband (%d): " FP_FMT "\n", latency_us,
> +FP_ARG(guardband));
> +
> + drm_dbg_kms(display->drm, "min guardband (%d): %d lines\n",
> latency_us,
> + intel_prefill_min_guardband(ctx, crtc_state, latency_us));
> + drm_dbg_kms(display->drm, "min cdclk (%d): %d khz\n", latency_us,
> + intel_prefill_min_cdclk(ctx, crtc_state));
> +
> return guardband < prefill;
> }
>
> --
> 2.49.1
^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH 11/11] drm/i915/prefill: Also print out the worst case estimates
2025-10-08 18:25 [RFC][PATCH 00/11] drm/i915/prefill: Introduce helpers for prefill latency calculations Ville Syrjala
` (9 preceding siblings ...)
2025-10-08 18:25 ` [RFC][PATCH 10/11] drm/i915/prefill: Print the prefill details Ville Syrjala
@ 2025-10-08 18:25 ` Ville Syrjala
2025-10-13 18:47 ` Shankar, Uma
10 siblings, 1 reply; 26+ messages in thread
From: Ville Syrjala @ 2025-10-08 18:25 UTC (permalink / raw)
To: intel-gfx; +Cc: intel-xe
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
We don't use the worst case prefill estimates yet for anything.
Print them out alongside the actual prefill details to help
conmfirm they are at leat getting computed somewhat sanely.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/display/skl_watermark.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/skl_watermark.c b/drivers/gpu/drm/i915/display/skl_watermark.c
index b3e9e2a0dab3..7bfd61cb41ca 100644
--- a/drivers/gpu/drm/i915/display/skl_watermark.c
+++ b/drivers/gpu/drm/i915/display/skl_watermark.c
@@ -2322,6 +2322,17 @@ static int skl_wm_check_vblank(struct intel_atomic_state *state,
}
}
+ /* hack to dump the worst case as well */
+ memset(&ctx, 0, sizeof(ctx));
+ intel_prefill_init_worst(&ctx, crtc_state);
+
+ level = skl_max_wm_level_for_vblank(crtc_state, &ctx);
+
+ if (DISPLAY_VER(display) >= 12 &&
+ display->sagv.block_time_us)
+ intel_prefill_vblank_too_short(&ctx, crtc_state,
+ display->sagv.block_time_us);
+
return 0;
}
--
2.49.1
^ permalink raw reply related [flat|nested] 26+ messages in thread* RE: [RFC][PATCH 11/11] drm/i915/prefill: Also print out the worst case estimates
2025-10-08 18:25 ` [RFC][PATCH 11/11] drm/i915/prefill: Also print out the worst case estimates Ville Syrjala
@ 2025-10-13 18:47 ` Shankar, Uma
0 siblings, 0 replies; 26+ messages in thread
From: Shankar, Uma @ 2025-10-13 18:47 UTC (permalink / raw)
To: Ville Syrjala, intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> Syrjala
> Sent: Wednesday, October 8, 2025 11:56 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org
> Subject: [RFC][PATCH 11/11] drm/i915/prefill: Also print out the worst case
> estimates
>
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> We don't use the worst case prefill estimates yet for anything.
> Print them out alongside the actual prefill details to help conmfirm they are at leat
> getting computed somewhat sanely.
Looks Good to me.
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/display/skl_watermark.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/display/skl_watermark.c
> b/drivers/gpu/drm/i915/display/skl_watermark.c
> index b3e9e2a0dab3..7bfd61cb41ca 100644
> --- a/drivers/gpu/drm/i915/display/skl_watermark.c
> +++ b/drivers/gpu/drm/i915/display/skl_watermark.c
> @@ -2322,6 +2322,17 @@ static int skl_wm_check_vblank(struct
> intel_atomic_state *state,
> }
> }
>
> + /* hack to dump the worst case as well */
> + memset(&ctx, 0, sizeof(ctx));
> + intel_prefill_init_worst(&ctx, crtc_state);
> +
> + level = skl_max_wm_level_for_vblank(crtc_state, &ctx);
> +
> + if (DISPLAY_VER(display) >= 12 &&
> + display->sagv.block_time_us)
> + intel_prefill_vblank_too_short(&ctx, crtc_state,
> + display->sagv.block_time_us);
> +
> return 0;
> }
>
> --
> 2.49.1
^ permalink raw reply [flat|nested] 26+ messages in thread