* [PATCH 0/5] Haswell watermarks
From: Paulo Zanoni @ 2013-05-24 14:59 UTC
To: intel-gfx; +Cc: Paulo Zanoni
From: Paulo Zanoni <paulo.r.zanoni@intel.com>
Hi
This series is a new version of "drm/i915: replace snb_update_wm with
haswell_update_wm on HSW". Ville asked me to split the series into smaller
patches, so here they are. I also implemented the other suggestions made by
Ville.
After this series, the only thing missing for correctness of the Haswell
watermark register values will be to use the correct mode clock when calculating
linetime watermarks. I had a patch for this, but Daniel suggested waiting until
we merge "drm/i915: store adjusted dotclock in adjusted_mode->clock". With this
series I can already see us reaching the PC7 state on eDP 1920x1080 with a
138.78MHz pixel clock.
Thanks,
Paulo
Paulo Zanoni (5):
drm/i915: add "enable" argument to intel_update_sprite_watermarks
drm/i915: add haswell_update_sprite_wm
drm/i915: properly set HSW WM_PIPE registers
drm/i915: properly set HSW WM_LP watermarks
drm/i915: add support for 5/6 data buffer partitioning on Haswell
drivers/gpu/drm/i915/i915_drv.h | 3 +-
drivers/gpu/drm/i915/i915_reg.h | 7 +
drivers/gpu/drm/i915/intel_drv.h | 14 +-
drivers/gpu/drm/i915/intel_pm.c | 588 ++++++++++++++++++++++++++++++++++--
drivers/gpu/drm/i915/intel_sprite.c | 8 +-
5 files changed, 593 insertions(+), 27 deletions(-)
--
1.8.1.2
* [PATCH 1/5] drm/i915: add "enable" argument to intel_update_sprite_watermarks
From: Paulo Zanoni @ 2013-05-24 14:59 UTC
To: intel-gfx; +Cc: Paulo Zanoni
From: Paulo Zanoni <paulo.r.zanoni@intel.com>
We want to call it from the "sprite disable" paths, since on Haswell we
need to update the sprite watermarks when we disable sprites.
For now, all this patch does is add the "enable" argument and call
intel_update_sprite_watermarks from inside ivb_disable_plane. This
shouldn't change how the code behaves, because in
sandybridge_update_sprite_wm we just ignore the "!enable" case. The
patches that implement Haswell watermarks will make use of the changes
introduced by this patch.
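As a rough illustration of the pattern described above (not the driver code itself), a per-platform hook can take the extra "enable" flag while an implementation that does not care about the disable path simply returns early. The names display_funcs, snb_like_update_sprite_wm, sprite_wm_writes and update_sprite_watermarks are hypothetical stand-ins invented for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-platform hook table. */
struct display_funcs {
	void (*update_sprite_wm)(int pipe, uint32_t sprite_width,
				 int pixel_size, bool enable);
};

/* Counts how often the SNB-like hook would have programmed hardware. */
static int sprite_wm_writes;

/* SNB-like hook: the "!enable" case is ignored, as the commit describes. */
static void snb_like_update_sprite_wm(int pipe, uint32_t sprite_width,
				      int pixel_size, bool enable)
{
	(void)pipe;
	(void)sprite_width;
	(void)pixel_size;

	if (!enable)
		return;

	sprite_wm_writes++;
}

/* Wrapper: forward to the hook only when the platform provides one. */
static void update_sprite_watermarks(const struct display_funcs *funcs,
				     int pipe, uint32_t sprite_width,
				     int pixel_size, bool enable)
{
	if (funcs->update_sprite_wm)
		funcs->update_sprite_wm(pipe, sprite_width, pixel_size, enable);
}
```

Calling the wrapper with enable set to false leaves the counter untouched, which is why adding the call to the disable path is expected to be a no-op on platforms that use the SNB-like hook.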
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 3 ++-
drivers/gpu/drm/i915/intel_drv.h | 2 +-
drivers/gpu/drm/i915/intel_pm.c | 11 ++++++++---
drivers/gpu/drm/i915/intel_sprite.c | 8 +++++---
4 files changed, 16 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7772bb6..e38f8d3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -315,7 +315,8 @@ struct drm_i915_display_funcs {
int (*get_fifo_size)(struct drm_device *dev, int plane);
void (*update_wm)(struct drm_device *dev);
void (*update_sprite_wm)(struct drm_device *dev, int pipe,
- uint32_t sprite_width, int pixel_size);
+ uint32_t sprite_width, int pixel_size,
+ bool enable);
void (*modeset_global_resources)(struct drm_device *dev);
/* Returns the active state of the crtc, and if the crtc is active,
* fills out the pipe-config with the hw state. */
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 75a7f22..21427aa 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -731,7 +731,7 @@ extern void intel_ddi_init(struct drm_device *dev, enum port port);
extern void intel_update_watermarks(struct drm_device *dev);
extern void intel_update_sprite_watermarks(struct drm_device *dev, int pipe,
uint32_t sprite_width,
- int pixel_size);
+ int pixel_size, bool enable);
extern unsigned long intel_gen4_compute_page_offset(int *x, int *y,
unsigned int tiling_mode,
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index e198f38..3ebb8e9 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2195,7 +2195,8 @@ sandybridge_compute_sprite_srwm(struct drm_device *dev, int plane,
}
static void sandybridge_update_sprite_wm(struct drm_device *dev, int pipe,
- uint32_t sprite_width, int pixel_size)
+ uint32_t sprite_width, int pixel_size,
+ bool enable)
{
struct drm_i915_private *dev_priv = dev->dev_private;
int latency = SNB_READ_WM0_LATENCY() * 100; /* In unit 0.1us */
@@ -2203,6 +2204,9 @@ static void sandybridge_update_sprite_wm(struct drm_device *dev, int pipe,
int sprite_wm, reg;
int ret;
+ if (!enable)
+ return;
+
switch (pipe) {
case 0:
reg = WM0_PIPEA_ILK;
@@ -2314,13 +2318,14 @@ void intel_update_watermarks(struct drm_device *dev)
}
void intel_update_sprite_watermarks(struct drm_device *dev, int pipe,
- uint32_t sprite_width, int pixel_size)
+ uint32_t sprite_width, int pixel_size,
+ bool enable)
{
struct drm_i915_private *dev_priv = dev->dev_private;
if (dev_priv->display.update_sprite_wm)
dev_priv->display.update_sprite_wm(dev, pipe, sprite_width,
- pixel_size);
+ pixel_size, enable);
}
static struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 19b9cb9..04d38d4 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -114,7 +114,7 @@ vlv_update_plane(struct drm_plane *dplane, struct drm_framebuffer *fb,
crtc_w--;
crtc_h--;
- intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size);
+ intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size, true);
I915_WRITE(SPSTRIDE(pipe, plane), fb->pitches[0]);
I915_WRITE(SPPOS(pipe, plane), (crtc_y << 16) | crtc_x);
@@ -268,7 +268,7 @@ ivb_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
crtc_w--;
crtc_h--;
- intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size);
+ intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size, true);
/*
* IVB workaround: must disable low power watermarks for at least
@@ -335,6 +335,8 @@ ivb_disable_plane(struct drm_plane *plane)
dev_priv->sprite_scaling_enabled &= ~(1 << pipe);
+ intel_update_sprite_watermarks(dev, pipe, 0, 0, false);
+
/* potentially re-enable LP watermarks */
if (scaling_was_enabled && !dev_priv->sprite_scaling_enabled)
intel_update_watermarks(dev);
@@ -453,7 +455,7 @@ ilk_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
crtc_w--;
crtc_h--;
- intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size);
+ intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size, true);
dvsscale = 0;
if (IS_GEN5(dev) || crtc_w != src_w || crtc_h != src_h)
--
1.8.1.2
* [PATCH 2/5] drm/i915: add haswell_update_sprite_wm
From: Paulo Zanoni @ 2013-05-24 14:59 UTC
To: intel-gfx; +Cc: Paulo Zanoni
From: Paulo Zanoni <paulo.r.zanoni@intel.com>
On Haswell, whenever we change the sprites we need to completely
recalculate all the watermarks, because the sprites are one of the
parameters to the LP watermarks, so a change to the sprites may
change which LP levels are enabled.
So in this commit we store all the parameters needed for proper
recalculation of the Haswell WMs and then call haswell_update_wm.
Note that for now haswell_update_wm does not actually use the
parameters we're storing; the next commits will.
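The store-then-recalculate flow described above can be sketched in standalone C as follows. This is only an illustration under simplifying assumptions: sketch_plane, recompute_all_watermarks and full_recalcs are hypothetical names, and one sprite plane per pipe is assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified plane; one sprite per pipe is assumed. */
struct sketch_plane {
	int pipe;
	struct {
		bool enable;
		uint8_t bytes_per_pixel;
		uint32_t horiz_pixels;
	} wm;
};

/* Counts how many times the full recalculation ran. */
static int full_recalcs;

static void recompute_all_watermarks(void)
{
	full_recalcs++;
}

static void sketch_update_sprite_wm(struct sketch_plane *planes, size_t n,
				    int pipe, uint32_t sprite_width,
				    int pixel_size, bool enable)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (planes[i].pipe != pipe)
			continue;

		planes[i].wm.enable = enable;
		/* Callers pass width - 1, so add the pixel back. */
		planes[i].wm.horiz_pixels = sprite_width + 1;
		planes[i].wm.bytes_per_pixel = (uint8_t)pixel_size;
		break;
	}

	/* Any sprite change triggers a full recalculation. */
	recompute_all_watermarks();
}
```

Keeping the parameters in a dedicated sub-struct means the recalculation does not have to trust the rest of the plane state, which may not yet reflect the change being made.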
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/intel_drv.h | 12 ++++++++++++
drivers/gpu/drm/i915/intel_pm.c | 23 ++++++++++++++++++++++-
2 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 21427aa..57de0c1 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -326,6 +326,18 @@ struct intel_plane {
unsigned int crtc_w, crtc_h;
uint32_t src_x, src_y;
uint32_t src_w, src_h;
+
+ /* Since we need to change the watermarks before/after
+ * enabling/disabling the planes, we need to store the parameters here
+ * as the other pieces of the struct may not reflect the values we want
+ * for the watermark calculations. Currently only Haswell uses this.
+ */
+ struct {
+ bool enable;
+ uint8_t bytes_per_pixel;
+ uint32_t horiz_pixels;
+ } wm;
+
void (*update_plane)(struct drm_plane *plane,
struct drm_framebuffer *fb,
struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 3ebb8e9..0b61a0e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2118,6 +2118,26 @@ static void haswell_update_wm(struct drm_device *dev)
sandybridge_update_wm(dev);
}
+static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
+ uint32_t sprite_width, int pixel_size,
+ bool enable)
+{
+ struct drm_plane *plane;
+
+ list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
+ struct intel_plane *intel_plane = to_intel_plane(plane);
+
+ if (intel_plane->pipe == pipe) {
+ intel_plane->wm.enable = enable;
+ intel_plane->wm.horiz_pixels = sprite_width + 1;
+ intel_plane->wm.bytes_per_pixel = pixel_size;
+ break;
+ }
+ }
+
+ haswell_update_wm(dev);
+}
+
static bool
sandybridge_compute_sprite_wm(struct drm_device *dev, int plane,
uint32_t sprite_width, int pixel_size,
@@ -4635,7 +4655,8 @@ void intel_init_pm(struct drm_device *dev)
} else if (IS_HASWELL(dev)) {
if (I915_READ64(MCH_SSKPD)) {
dev_priv->display.update_wm = haswell_update_wm;
- dev_priv->display.update_sprite_wm = sandybridge_update_sprite_wm;
+ dev_priv->display.update_sprite_wm =
+ haswell_update_sprite_wm;
} else {
DRM_DEBUG_KMS("Failed to read display plane latency. "
"Disable CxSR\n");
--
1.8.1.2
* [PATCH 3/5] drm/i915: properly set HSW WM_PIPE registers
From: Paulo Zanoni @ 2013-05-24 14:59 UTC
To: intel-gfx; +Cc: Paulo Zanoni
From: Paulo Zanoni <paulo.r.zanoni@intel.com>
We were previously calling sandybridge_update_wm on HSW, but the SNB
function didn't really match the HSW specification, so we were just
writing the wrong values.
With this patch, the haswell_update_wm function will set the correct
values for the WM_PIPE registers, but it will still keep all the LP
watermarks disabled.
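For readers who want to try the arithmetic, here is a standalone sketch of the two watermark formulas this patch introduces (hsw_wm_method1 and hsw_wm_method2), written with explicit 64-bit intermediates. The sketch_ names and the div_round_up_u64 helper are made up for this example, and it assumes the pixel rate is in kHz and the latency in 0.1us units. Both pixel_rate and pipe_htotal must be non-zero.

```c
#include <stdint.h>

/* Round-up integer division for 64-bit operands; d must be non-zero. */
static uint64_t div_round_up_u64(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/* Method 1: data fetched during the latency, in 64-byte units, plus 2. */
static uint32_t sketch_wm_method1(uint32_t pixel_rate, uint8_t bytes_per_pixel,
				  uint32_t latency)
{
	uint64_t tmp = (uint64_t)pixel_rate * bytes_per_pixel * latency;

	return (uint32_t)div_round_up_u64(tmp, 64 * 10000) + 2;
}

/* Method 2: whole lines covering the latency, in 64-byte units, plus 2. */
static uint32_t sketch_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
				  uint32_t horiz_pixels,
				  uint8_t bytes_per_pixel, uint32_t latency)
{
	/* Time to scan out one line, rounded up. */
	uint64_t line_time = div_round_up_u64((uint64_t)pipe_htotal * 1000,
					      pixel_rate);
	uint64_t bytes = (latency / (line_time * 10) + 1) *
			 horiz_pixels * bytes_per_pixel;

	return (uint32_t)div_round_up_u64(bytes, 64) + 2;
}
```

With a 138780 kHz pixel rate, 4 bytes per pixel and a latency value of 20, method 1 gives 20. Adding a 1920-pixel plane on a 2080-pixel htotal, method 2 gives 122. The htotal and latency numbers here are hypothetical and only serve to exercise the formulas.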
The patch may look a little over-complicated for now, but that is
because it already puts in place much of the infrastructure for
setting the LP watermarks, so the patch that sets them won't cause
too much code churn.
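Part of that infrastructure is decoding all five latency levels from the 64-bit MCH_SSKPD value, even though only the first is used for WM_PIPE. A standalone sketch of that decoding, using the same field positions as hsw_compute_wm_parameters in this patch, might look like this (sketch_decode_sskpd is a hypothetical name):

```c
#include <stdint.h>

/*
 * Decode five latency values from a raw 64-bit SSKPD value, using the
 * field positions from the patch. Entries 1..4 are scaled by 5.
 */
static void sketch_decode_sskpd(uint64_t sskpd, uint32_t wm[5])
{
	uint32_t new_wm0 = (uint32_t)((sskpd >> 56) & 0xFF);

	/* Prefer the field in the top byte; fall back to the low nibble. */
	wm[0] = new_wm0 ? new_wm0 : (uint32_t)(sskpd & 0xF);
	wm[1] = (uint32_t)((sskpd >> 4) & 0xFF) * 5;
	wm[2] = (uint32_t)((sskpd >> 12) & 0xFF) * 5;
	wm[3] = (uint32_t)((sskpd >> 20) & 0x1FF) * 5;
	wm[4] = (uint32_t)((sskpd >> 32) & 0x1FF) * 5;
}
```

Taking the raw value as a parameter instead of reading the register inside the function keeps the decoding easy to test on its own.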
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 3 +
drivers/gpu/drm/i915/intel_pm.c | 340 +++++++++++++++++++++++++++++++++++++---
2 files changed, 325 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 55caedb..e86606c 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -4938,6 +4938,9 @@
#define SFUSE_STRAP_DDIC_DETECTED (1<<1)
#define SFUSE_STRAP_DDID_DETECTED (1<<0)
+#define WM_MISC 0x45260
+#define WM_MISC_DATA_PARTITION_5_6 (1 << 0)
+
#define WM_DBG 0x45280
#define WM_DBG_DISALLOW_MULTIPLE_LP (1<<0)
#define WM_DBG_DISALLOW_MAXFIFO (1<<1)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0b61a0e..2ee1d01 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2072,19 +2072,173 @@ static void ivybridge_update_wm(struct drm_device *dev)
cursor_wm);
}
-static void
-haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
+static uint32_t hsw_wm_get_pixel_rate(struct drm_device *dev,
+ struct drm_crtc *crtc)
+{
+ struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+ uint32_t pixel_rate, pfit_size;
+
+ if (intel_crtc->config.pixel_target_clock)
+ pixel_rate = intel_crtc->config.pixel_target_clock;
+ else
+ pixel_rate = intel_crtc->config.adjusted_mode.clock;
+
+ /* We only use IF-ID interlacing. If we ever use PF-ID we'll need to
+ * adjust the pixel_rate here. */
+
+ pfit_size = intel_crtc->config.pch_pfit.size;
+ if (pfit_size) {
+ uint64_t x, y, crtc_x, crtc_y, hscale, vscale, totscale;
+
+ x = (pfit_size >> 16) & 0xFFFF;
+ y = pfit_size & 0xFFFF;
+ crtc_x = intel_crtc->config.adjusted_mode.hdisplay;
+ crtc_y = intel_crtc->config.adjusted_mode.vdisplay;
+
+ hscale = crtc_x << 16;
+ vscale = crtc_y << 16;
+ do_div(hscale, x);
+ do_div(vscale, y);
+ hscale = (hscale < (1 << 16)) ? (1 << 16) : hscale;
+ vscale = (vscale < (1 << 16)) ? (1 << 16) : vscale;
+ totscale = (hscale * vscale) >> 16;
+ pixel_rate = (pixel_rate * totscale) >> 16;
+ }
+
+ return pixel_rate;
+}
+
+static uint32_t hsw_wm_method1(uint32_t pixel_rate, uint8_t bytes_per_pixel,
+ uint32_t latency)
+{
+ uint64_t tmp;
+ uint32_t ret;
+
+ tmp = (uint64_t)pixel_rate * bytes_per_pixel * latency;
+ ret = DIV_ROUND_UP_ULL(tmp, 64 * 10000) + 2;
+
+ return ret;
+}
+
+static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
+ uint32_t horiz_pixels, uint8_t bytes_per_pixel,
+ uint32_t latency)
+{
+ uint32_t ret;
+
+ ret = DIV_ROUND_UP(pipe_htotal * 1000, pixel_rate);
+ ret = ((latency / (ret * 10)) + 1) * horiz_pixels * bytes_per_pixel;
+ ret = DIV_ROUND_UP(ret, 64) + 2;
+ return ret;
+}
+
+struct hsw_pipe_wm_parameters {
+ bool active;
+ bool sprite_enabled;
+ uint8_t pri_bytes_per_pixel;
+ uint8_t spr_bytes_per_pixel;
+ uint8_t cur_bytes_per_pixel;
+ uint32_t pri_horiz_pixels;
+ uint32_t spr_horiz_pixels;
+ uint32_t cur_horiz_pixels;
+ uint32_t pipe_htotal;
+ uint32_t pixel_rate;
+};
+
+struct hsw_wm_values {
+ uint32_t wm_pipe[3];
+ uint32_t wm_lp[3];
+ uint32_t wm_lp_spr[3];
+ uint32_t wm_linetime[3];
+};
+
+enum hsw_data_buf_partitioning {
+ HSW_DATA_BUF_PART_1_2,
+ HSW_DATA_BUF_PART_5_6,
+};
+
+/* Only for WM_PIPE. */
+static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value)
+{
+ /* TODO: for now, assume the primary plane is always enabled. */
+ if (!params->active)
+ return 0;
+
+ return hsw_wm_method1(params->pixel_rate,
+ params->pri_bytes_per_pixel,
+ mem_value);
+}
+
+/* For both WM_PIPE and WM_LP. */
+static uint32_t hsw_compute_spr_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value)
+{
+ uint32_t method1, method2;
+
+ if (!params->active || !params->sprite_enabled)
+ return 0;
+
+ method1 = hsw_wm_method1(params->pixel_rate,
+ params->spr_bytes_per_pixel,
+ mem_value);
+ method2 = hsw_wm_method2(params->pixel_rate,
+ params->pipe_htotal,
+ params->spr_horiz_pixels,
+ params->spr_bytes_per_pixel,
+ mem_value);
+ return min(method1, method2);
+}
+
+/* For both WM_PIPE and WM_LP. */
+static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value)
+{
+ if (!params->active)
+ return 0;
+
+ return hsw_wm_method2(params->pixel_rate,
+ params->pipe_htotal,
+ params->cur_horiz_pixels,
+ params->cur_bytes_per_pixel,
+ mem_value);
+}
+
+static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
+ uint32_t mem_value, enum pipe pipe,
+ struct hsw_pipe_wm_parameters *params)
+{
+ uint32_t pri_val, cur_val, spr_val;
+
+ pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
+ spr_val = hsw_compute_spr_wm(params, mem_value);
+ cur_val = hsw_compute_cur_wm(params, mem_value);
+
+ WARN(pri_val > 127,
+ "Primary WM error, mode not supported for pipe %c\n",
+ pipe_name(pipe));
+ WARN(spr_val > 127,
+ "Sprite WM error, mode not supported for pipe %c\n",
+ pipe_name(pipe));
+ WARN(cur_val > 63,
+ "Cursor WM error, mode not supported for pipe %c\n",
+ pipe_name(pipe));
+
+ return (pri_val << WM0_PIPE_PLANE_SHIFT) |
+ (spr_val << WM0_PIPE_SPRITE_SHIFT) |
+ cur_val;
+}
+
+static uint32_t
+hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
- enum pipe pipe = intel_crtc->pipe;
struct drm_display_mode *mode = &intel_crtc->config.adjusted_mode;
u32 linetime, ips_linetime;
- if (!intel_crtc_active(crtc)) {
- I915_WRITE(PIPE_WM_LINETIME(pipe), 0);
- return;
- }
+ if (!intel_crtc_active(crtc))
+ return 0;
/* The WM are computed with base on how long it takes to fill a single
* row at the given clock rate, multiplied by 8.
@@ -2093,29 +2247,179 @@ haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
ips_linetime = DIV_ROUND_CLOSEST(mode->htotal * 1000 * 8,
intel_ddi_get_cdclk_freq(dev_priv));
- I915_WRITE(PIPE_WM_LINETIME(pipe),
- PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
- PIPE_WM_LINETIME_TIME(linetime));
+ return PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
+ PIPE_WM_LINETIME_TIME(linetime);
}
-static void haswell_update_wm(struct drm_device *dev)
+static void hsw_compute_wm_parameters(struct drm_device *dev,
+ struct hsw_pipe_wm_parameters *params,
+ uint32_t *wm)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
+ struct drm_plane *plane;
+ uint64_t sskpd = I915_READ64(MCH_SSKPD);
enum pipe pipe;
- /* Disable the LP WMs before changine the linetime registers. This is
- * just a temporary code that will be replaced soon. */
- I915_WRITE(WM3_LP_ILK, 0);
- I915_WRITE(WM2_LP_ILK, 0);
- I915_WRITE(WM1_LP_ILK, 0);
+ if ((sskpd >> 56) & 0xFF)
+ wm[0] = (sskpd >> 56) & 0xFF;
+ else
+ wm[0] = sskpd & 0xF;
+ wm[1] = ((sskpd >> 4) & 0xFF) * 5;
+ wm[2] = ((sskpd >> 12) & 0xFF) * 5;
+ wm[3] = ((sskpd >> 20) & 0x1FF) * 5;
+ wm[4] = ((sskpd >> 32) & 0x1FF) * 5;
+
+ list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+ struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+ struct hsw_pipe_wm_parameters *p;
+
+ pipe = intel_crtc->pipe;
+ p = &params[pipe];
+
+ p->active = intel_crtc_active(crtc);
+ if (!p->active)
+ continue;
+
+ p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
+ p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
+ p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
+ p->cur_bytes_per_pixel = 4;
+ p->pri_horiz_pixels = intel_crtc->config.adjusted_mode.hdisplay;
+ p->cur_horiz_pixels = 64;
+ }
+
+ list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
+ struct intel_plane *intel_plane = to_intel_plane(plane);
+ struct hsw_pipe_wm_parameters *p;
+
+ pipe = intel_plane->pipe;
+ p = &params[pipe];
+
+ p->sprite_enabled = intel_plane->wm.enable;
+ p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
+ p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
+ }
+}
+
+static void hsw_compute_wm_results(struct drm_device *dev,
+ struct hsw_pipe_wm_parameters *params,
+ uint32_t *wm,
+ struct hsw_wm_values *results)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_crtc *crtc;
+ enum pipe pipe;
+
+ /* No support for LP WMs yet. */
+ results->wm_lp[2] = 0;
+ results->wm_lp[1] = 0;
+ results->wm_lp[0] = 0;
+ results->wm_lp_spr[2] = 0;
+ results->wm_lp_spr[1] = 0;
+ results->wm_lp_spr[0] = 0;
+
+ for_each_pipe(pipe)
+ results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
+ pipe,
+ &params[pipe]);
for_each_pipe(pipe) {
crtc = dev_priv->pipe_to_crtc_mapping[pipe];
- haswell_update_linetime_wm(dev, crtc);
+ results->wm_linetime[pipe] = hsw_compute_linetime_wm(dev, crtc);
}
+}
+
+/*
+ * The spec says we shouldn't write when we don't need to, because every write
+ * causes WMs to be re-evaluated, expending some power.
+ */
+static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
+ struct hsw_wm_values *results,
+ enum hsw_data_buf_partitioning partitioning)
+{
+ struct hsw_wm_values previous;
+ uint32_t val;
+ enum hsw_data_buf_partitioning prev_partitioning;
+
+ previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
+ previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
+ previous.wm_pipe[2] = I915_READ(WM0_PIPEC_IVB);
+ previous.wm_lp[0] = I915_READ(WM1_LP_ILK);
+ previous.wm_lp[1] = I915_READ(WM2_LP_ILK);
+ previous.wm_lp[2] = I915_READ(WM3_LP_ILK);
+ previous.wm_lp_spr[0] = I915_READ(WM1S_LP_ILK);
+ previous.wm_lp_spr[1] = I915_READ(WM2S_LP_IVB);
+ previous.wm_lp_spr[2] = I915_READ(WM3S_LP_IVB);
+ previous.wm_linetime[0] = I915_READ(PIPE_WM_LINETIME(PIPE_A));
+ previous.wm_linetime[1] = I915_READ(PIPE_WM_LINETIME(PIPE_B));
+ previous.wm_linetime[2] = I915_READ(PIPE_WM_LINETIME(PIPE_C));
+
+ prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
+ HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
+
+ if (memcmp(results->wm_pipe, previous.wm_pipe, sizeof(previous.wm_pipe)) == 0 &&
+ memcmp(results->wm_lp, previous.wm_lp, sizeof(previous.wm_lp)) == 0 &&
+ memcmp(results->wm_lp_spr, previous.wm_lp_spr, sizeof(previous.wm_lp_spr)) == 0 &&
+ memcmp(results->wm_linetime, previous.wm_linetime, sizeof(previous.wm_linetime)) == 0 &&
+ partitioning == prev_partitioning)
+ return;
+
+ if (previous.wm_lp[2] != 0)
+ I915_WRITE(WM3_LP_ILK, 0);
+ if (previous.wm_lp[1] != 0)
+ I915_WRITE(WM2_LP_ILK, 0);
+ if (previous.wm_lp[0] != 0)
+ I915_WRITE(WM1_LP_ILK, 0);
+
+ if (previous.wm_pipe[0] != results->wm_pipe[0])
+ I915_WRITE(WM0_PIPEA_ILK, results->wm_pipe[0]);
+ if (previous.wm_pipe[1] != results->wm_pipe[1])
+ I915_WRITE(WM0_PIPEB_ILK, results->wm_pipe[1]);
+ if (previous.wm_pipe[2] != results->wm_pipe[2])
+ I915_WRITE(WM0_PIPEC_IVB, results->wm_pipe[2]);
+
+ if (previous.wm_linetime[0] != results->wm_linetime[0])
+ I915_WRITE(PIPE_WM_LINETIME(PIPE_A), results->wm_linetime[0]);
+ if (previous.wm_linetime[1] != results->wm_linetime[1])
+ I915_WRITE(PIPE_WM_LINETIME(PIPE_B), results->wm_linetime[1]);
+ if (previous.wm_linetime[2] != results->wm_linetime[2])
+ I915_WRITE(PIPE_WM_LINETIME(PIPE_C), results->wm_linetime[2]);
+
+ if (prev_partitioning != partitioning) {
+ val = I915_READ(WM_MISC);
+ if (partitioning == HSW_DATA_BUF_PART_1_2)
+ val &= ~WM_MISC_DATA_PARTITION_5_6;
+ else
+ val |= WM_MISC_DATA_PARTITION_5_6;
+ I915_WRITE(WM_MISC, val);
+ }
+
+ if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
+ I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
+ if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
+ I915_WRITE(WM2S_LP_IVB, results->wm_lp_spr[1]);
+ if (previous.wm_lp_spr[2] != results->wm_lp_spr[2])
+ I915_WRITE(WM3S_LP_IVB, results->wm_lp_spr[2]);
+
+ if (results->wm_lp[0] != 0)
+ I915_WRITE(WM1_LP_ILK, results->wm_lp[0]);
+ if (results->wm_lp[1] != 0)
+ I915_WRITE(WM2_LP_ILK, results->wm_lp[1]);
+ if (results->wm_lp[2] != 0)
+ I915_WRITE(WM3_LP_ILK, results->wm_lp[2]);
+}
+
+static void haswell_update_wm(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct hsw_pipe_wm_parameters params[3];
+ struct hsw_wm_values results;
+ uint32_t wm[5];
- sandybridge_update_wm(dev);
+ hsw_compute_wm_parameters(dev, params, wm);
+ hsw_compute_wm_results(dev, params, wm, &results);
+ hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
}
static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
--
1.8.1.2
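The "read back, compare, write only what changed" approach of hsw_write_wm_values can be illustrated with a small standalone sketch. The register file here is a plain array and all sketch_ names are hypothetical; real MMIO access is not modelled.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_NUM_REGS 3

/* Hypothetical register file standing in for MMIO. */
static uint32_t sketch_regs[SKETCH_NUM_REGS];
static int sketch_writes;

static uint32_t sketch_read(int reg)
{
	return sketch_regs[reg];
}

static void sketch_write(int reg, uint32_t val)
{
	sketch_regs[reg] = val;
	sketch_writes++;
}

/* Write only the registers whose value actually changes. */
static void sketch_write_if_changed(const uint32_t wanted[SKETCH_NUM_REGS])
{
	uint32_t previous[SKETCH_NUM_REGS];
	int i;

	for (i = 0; i < SKETCH_NUM_REGS; i++)
		previous[i] = sketch_read(i);

	/* memcmp takes a byte count, so pass the size of the whole array. */
	if (memcmp(wanted, previous, sizeof(previous)) == 0)
		return;

	for (i = 0; i < SKETCH_NUM_REGS; i++)
		if (previous[i] != wanted[i])
			sketch_write(i, wanted[i]);
}
```

Repeating a call with identical values performs no writes, and changing a single entry results in exactly one write.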
* [PATCH 4/5] drm/i915: properly set HSW WM_LP watermarks
From: Paulo Zanoni @ 2013-05-24 14:59 UTC
To: intel-gfx; +Cc: Paulo Zanoni
From: Paulo Zanoni <paulo.r.zanoni@intel.com>
We were previously only setting the WM_PIPE registers; now we also
set the LP watermark registers. This should allow deeper PC
states, resulting in power savings.
We're only using 1/2 data buffer partitioning for now.
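To make the per-level decision easier to follow, here is a standalone sketch of the 1/2 partitioning limits, the FBC watermark formula and the enable check, using the same numbers and comparisons as the diff below. The sketch_ names are hypothetical, and horiz_pixels * bytes_per_pixel must be non-zero.

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_wm_max {
	uint16_t pri, spr, cur, fbc;
};

struct sketch_lp_result {
	bool enable;
	bool fbc_enable;
	uint32_t pri_val, spr_val, cur_val, fbc_val;
};

/* Limits for 1/2 partitioning, copied from the values in the patch. */
static struct sketch_wm_max sketch_lp_max_1_2(int pipes_active,
					      int sprites_enabled)
{
	struct sketch_wm_max max;

	if (pipes_active > 1) {
		max.pri = sprites_enabled ? 128 : 256;
		max.spr = 128;
		max.cur = 64;
	} else {
		max.pri = sprites_enabled ? 384 : 768;
		max.spr = 384;
		max.cur = 255;
	}
	max.fbc = 15;

	return max;
}

/* FBC watermark: primary WM expressed in lines, rounded up, plus 2. */
static uint32_t sketch_wm_fbc(uint32_t pri_val, uint32_t horiz_pixels,
			      uint8_t bytes_per_pixel)
{
	uint32_t line_bytes = horiz_pixels * bytes_per_pixel;

	return (pri_val * 64 + line_bytes - 1) / line_bytes + 2;
}

/* A level is usable only if every plane value fits under its limit. */
static void sketch_check_lp_level(const struct sketch_wm_max *max,
				  struct sketch_lp_result *r)
{
	if (r->fbc_val > max->fbc) {
		/* Too large: drop the FBC WM but keep the level itself. */
		r->fbc_enable = false;
		r->fbc_val = 0;
	} else {
		r->fbc_enable = true;
	}

	r->enable = r->pri_val <= max->pri &&
		    r->spr_val <= max->spr &&
		    r->cur_val <= max->cur;
}
```

An oversized FBC value only disables the FBC watermark, while an oversized primary, sprite or cursor value disables the whole level.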
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 4 +
drivers/gpu/drm/i915/intel_pm.c | 194 +++++++++++++++++++++++++++++++++++++---
2 files changed, 187 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index e86606c..58230ea 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3057,6 +3057,10 @@
#define WM3S_LP_IVB 0x45128
#define WM1S_LP_EN (1<<31)
+#define HSW_WM_LP_VAL(lat, fbc, pri, cur) \
+ (WM3_LP_EN | ((lat) << WM1_LP_LATENCY_SHIFT) | \
+ ((fbc) << WM1_LP_FBC_SHIFT) | ((pri) << WM1_LP_SR_SHIFT) | (cur))
+
/* Memory latency timer register */
#define MLTR_ILK 0x11222
#define MLTR_WM1_SHIFT 0
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2ee1d01..9f9eb48 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2132,6 +2132,12 @@ static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
return ret;
}
+static uint32_t hsw_wm_fbc(uint32_t pri_val, uint32_t horiz_pixels,
+ uint8_t bytes_per_pixel)
+{
+ return DIV_ROUND_UP(pri_val * 64, horiz_pixels * bytes_per_pixel) + 2;
+}
+
struct hsw_pipe_wm_parameters {
bool active;
bool sprite_enabled;
@@ -2145,11 +2151,28 @@ struct hsw_pipe_wm_parameters {
uint32_t pixel_rate;
};
+struct hsw_wm_maximums {
+ uint16_t pri;
+ uint16_t spr;
+ uint16_t cur;
+ uint16_t fbc;
+};
+
+struct hsw_lp_wm_result {
+ bool enable;
+ bool fbc_enable;
+ uint32_t pri_val;
+ uint32_t spr_val;
+ uint32_t cur_val;
+ uint32_t fbc_val;
+};
+
struct hsw_wm_values {
uint32_t wm_pipe[3];
uint32_t wm_lp[3];
uint32_t wm_lp_spr[3];
uint32_t wm_linetime[3];
+ bool enable_fbc_wm;
};
enum hsw_data_buf_partitioning {
@@ -2170,6 +2193,27 @@ static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
mem_value);
}
+/* Only for WM_LP. */
+static uint32_t hsw_compute_pri_wm_lp(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value)
+{
+ uint32_t method1, method2;
+
+ /* TODO: for now, assume the primary plane is always enabled. */
+ if (!params->active)
+ return 0;
+
+ method1 = hsw_wm_method1(params->pixel_rate,
+ params->pri_bytes_per_pixel,
+ mem_value);
+ method2 = hsw_wm_method2(params->pixel_rate,
+ params->pipe_htotal,
+ params->pri_horiz_pixels,
+ params->pri_bytes_per_pixel,
+ mem_value);
+ return min(method1, method2);
+}
+
/* For both WM_PIPE and WM_LP. */
static uint32_t hsw_compute_spr_wm(struct hsw_pipe_wm_parameters *params,
uint32_t mem_value)
@@ -2204,6 +2248,52 @@ static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
mem_value);
}
+/* Only for WM_LP. */
+static uint32_t hsw_compute_fbc_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t pri_val,
+ uint32_t mem_value)
+{
+ if (!params->active)
+ return 0;
+
+ return hsw_wm_fbc(pri_val,
+ params->pri_horiz_pixels,
+ params->pri_bytes_per_pixel);
+}
+
+static void hsw_compute_lp_wm(uint32_t mem_value, struct hsw_wm_maximums *max,
+ struct hsw_pipe_wm_parameters *params,
+ struct hsw_lp_wm_result *result)
+{
+ enum pipe pipe;
+ uint32_t pri_val[3], spr_val[3], cur_val[3], fbc_val[3];
+
+ for (pipe = PIPE_A; pipe <= PIPE_C; pipe++) {
+ struct hsw_pipe_wm_parameters *p = &params[pipe];
+
+ pri_val[pipe] = hsw_compute_pri_wm_lp(p, mem_value);
+ spr_val[pipe] = hsw_compute_spr_wm(p, mem_value);
+ cur_val[pipe] = hsw_compute_cur_wm(p, mem_value);
+ fbc_val[pipe] = hsw_compute_fbc_wm(p, pri_val[pipe], mem_value);
+ }
+
+ result->pri_val = max3(pri_val[0], pri_val[1], pri_val[2]);
+ result->spr_val = max3(spr_val[0], spr_val[1], spr_val[2]);
+ result->cur_val = max3(cur_val[0], cur_val[1], cur_val[2]);
+ result->fbc_val = max3(fbc_val[0], fbc_val[1], fbc_val[2]);
+
+ if (result->fbc_val > max->fbc) {
+ result->fbc_enable = false;
+ result->fbc_val = 0;
+ } else {
+ result->fbc_enable = true;
+ }
+
+ result->enable = result->pri_val <= max->pri &&
+ result->spr_val <= max->spr &&
+ result->cur_val <= max->cur;
+}
+
static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
uint32_t mem_value, enum pipe pipe,
struct hsw_pipe_wm_parameters *params)
@@ -2253,13 +2343,15 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
static void hsw_compute_wm_parameters(struct drm_device *dev,
struct hsw_pipe_wm_parameters *params,
- uint32_t *wm)
+ uint32_t *wm,
+ struct hsw_wm_maximums *lp_max_1_2)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
struct drm_plane *plane;
uint64_t sskpd = I915_READ64(MCH_SSKPD);
enum pipe pipe;
+ int pipes_active = 0, sprites_enabled = 0;
if ((sskpd >> 56) & 0xFF)
wm[0] = (sskpd >> 56) & 0xFF;
@@ -2281,6 +2373,8 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
if (!p->active)
continue;
+ pipes_active++;
+
p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
@@ -2299,25 +2393,89 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
p->sprite_enabled = intel_plane->wm.enable;
p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
+
+ if (p->sprite_enabled)
+ sprites_enabled++;
+ }
+
+ if (pipes_active > 1) {
+ lp_max_1_2->pri = sprites_enabled ? 128 : 256;
+ lp_max_1_2->spr = 128;
+ lp_max_1_2->cur = 64;
+ } else {
+ lp_max_1_2->pri = sprites_enabled ? 384 : 768;
+ lp_max_1_2->spr = 384;
+ lp_max_1_2->cur = 255;
}
+ lp_max_1_2->fbc = 15;
}
static void hsw_compute_wm_results(struct drm_device *dev,
struct hsw_pipe_wm_parameters *params,
uint32_t *wm,
+ struct hsw_wm_maximums *lp_maximums,
struct hsw_wm_values *results)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
+ struct hsw_lp_wm_result lp_results[4];
enum pipe pipe;
+ int i;
+
+ hsw_compute_lp_wm(wm[1], lp_maximums, params, &lp_results[0]);
+ hsw_compute_lp_wm(wm[2], lp_maximums, params, &lp_results[1]);
+ hsw_compute_lp_wm(wm[3], lp_maximums, params, &lp_results[2]);
+ hsw_compute_lp_wm(wm[4], lp_maximums, params, &lp_results[3]);
+
+ /* The spec says it is preferred to disable FBC WMs instead of disabling
+ * a WM level. */
+ results->enable_fbc_wm = true;
+ for (i = 0; i < 4; i++) {
+ if (lp_results[i].enable && !lp_results[i].fbc_enable) {
+ results->enable_fbc_wm = false;
+ break;
+ }
+ }
+
+ if (lp_results[3].enable) {
+ results->wm_lp[2] = HSW_WM_LP_VAL(8, lp_results[3].fbc_val,
+ lp_results[3].pri_val,
+ lp_results[3].cur_val);
+ results->wm_lp_spr[2] = lp_results[3].spr_val;
+ } else if (lp_results[2].enable) {
+ results->wm_lp[2] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
+ lp_results[2].pri_val,
+ lp_results[2].cur_val);
+ results->wm_lp_spr[2] = lp_results[2].spr_val;
+ } else {
+ results->wm_lp[2] = 0;
+ results->wm_lp_spr[2] = 0;
+ }
+
+ if (lp_results[3].enable && lp_results[2].enable) {
+ results->wm_lp[1] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
+ lp_results[2].pri_val,
+ lp_results[2].cur_val);
+ results->wm_lp_spr[1] = lp_results[2].spr_val;
+ } else if (!lp_results[3].enable && lp_results[1].enable) {
+ results->wm_lp[1] = HSW_WM_LP_VAL(4, lp_results[1].fbc_val,
+ lp_results[1].pri_val,
+ lp_results[1].cur_val);
+ results->wm_lp_spr[1] = lp_results[1].spr_val;
+ } else {
+ results->wm_lp[1] = 0;
+ results->wm_lp_spr[1] = 0;
+ }
- /* No support for LP WMs yet. */
- results->wm_lp[2] = 0;
- results->wm_lp[1] = 0;
- results->wm_lp[0] = 0;
- results->wm_lp_spr[2] = 0;
- results->wm_lp_spr[1] = 0;
- results->wm_lp_spr[0] = 0;
+ if (lp_results[0].enable) {
+ results->wm_lp[0] = HSW_WM_LP_VAL(2, lp_results[0].fbc_val,
+ lp_results[0].pri_val,
+ lp_results[0].cur_val);
+ results->wm_lp_spr[0] = lp_results[0].spr_val;
+ } else {
+ results->wm_lp[0] = 0;
+ results->wm_lp_spr[0] = 0;
+ }
for_each_pipe(pipe)
results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
@@ -2341,6 +2499,7 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
struct hsw_wm_values previous;
uint32_t val;
enum hsw_data_buf_partitioning prev_partitioning;
+ bool prev_enable_fbc_wm;
previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
@@ -2358,11 +2517,14 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
+ prev_enable_fbc_wm = !(I915_READ(DISP_ARB_CTL) & DISP_FBC_WM_DIS);
+
if (memcmp(results->wm_pipe, previous.wm_pipe, 3) == 0 &&
memcmp(results->wm_lp, previous.wm_lp, 3) == 0 &&
memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) == 0 &&
memcmp(results->wm_linetime, previous.wm_linetime, 3) == 0 &&
- partitioning == prev_partitioning)
+ partitioning == prev_partitioning &&
+ results->enable_fbc_wm == prev_enable_fbc_wm)
return;
if (previous.wm_lp[2] != 0)
@@ -2395,6 +2557,15 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
I915_WRITE(WM_MISC, val);
}
+ if (prev_enable_fbc_wm != results->enable_fbc_wm) {
+ val = I915_READ(DISP_ARB_CTL);
+ if (results->enable_fbc_wm)
+ val &= ~DISP_FBC_WM_DIS;
+ else
+ val |= DISP_FBC_WM_DIS;
+ I915_WRITE(DISP_ARB_CTL, val);
+ }
+
if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
@@ -2413,12 +2584,13 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
static void haswell_update_wm(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
+ struct hsw_wm_maximums lp_max_1_2;
struct hsw_pipe_wm_parameters params[3];
struct hsw_wm_values results;
uint32_t wm[5];
- hsw_compute_wm_parameters(dev, params, wm);
- hsw_compute_wm_results(dev, params, wm, &results);
+ hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
+ hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
}
--
1.8.1.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 5/5] drm/i915: add support for 5/6 data buffer partitioning on Haswell
2013-05-24 14:59 [PATCH 0/5] Haswell watermarks Paulo Zanoni
` (3 preceding siblings ...)
2013-05-24 14:59 ` [PATCH 4/5] drm/i915: properly set HSW WM_LP watermarks Paulo Zanoni
@ 2013-05-24 14:59 ` Paulo Zanoni
2013-05-29 16:17 ` Ville Syrjälä
4 siblings, 1 reply; 29+ messages in thread
From: Paulo Zanoni @ 2013-05-24 14:59 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Paulo Zanoni <paulo.r.zanoni@intel.com>
Now we compute the results for both 1/2 and 5/6 partitioning and then
use hsw_find_best_result to choose which one to use.
With this patch, Haswell watermarks support should be in good shape.
The only improvement we're missing is the case where the primary plane
is disabled: we always assume it's enabled, so we take it into
consideration when calculating the watermarks.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/intel_pm.c | 64 ++++++++++++++++++++++++++++++++++-------
1 file changed, 53 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 9f9eb48..6fdfd1a 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2344,7 +2344,8 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
static void hsw_compute_wm_parameters(struct drm_device *dev,
struct hsw_pipe_wm_parameters *params,
uint32_t *wm,
- struct hsw_wm_maximums *lp_max_1_2)
+ struct hsw_wm_maximums *lp_max_1_2,
+ struct hsw_wm_maximums *lp_max_5_6)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
@@ -2399,15 +2400,17 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
}
if (pipes_active > 1) {
- lp_max_1_2->pri = sprites_enabled ? 128 : 256;
- lp_max_1_2->spr = 128;
- lp_max_1_2->cur = 64;
+ lp_max_1_2->pri = lp_max_5_6->pri = sprites_enabled ? 128 : 256;
+ lp_max_1_2->spr = lp_max_5_6->spr = 128;
+ lp_max_1_2->cur = lp_max_5_6->cur = 64;
} else {
lp_max_1_2->pri = sprites_enabled ? 384 : 768;
+ lp_max_5_6->pri = sprites_enabled ? 128 : 768;
lp_max_1_2->spr = 384;
- lp_max_1_2->cur = 255;
+ lp_max_5_6->spr = 640;
+ lp_max_1_2->cur = lp_max_5_6->cur = 255;
}
- lp_max_1_2->fbc = 15;
+ lp_max_1_2->fbc = lp_max_5_6->fbc = 15;
}
static void hsw_compute_wm_results(struct drm_device *dev,
@@ -2488,6 +2491,32 @@ static void hsw_compute_wm_results(struct drm_device *dev,
}
}
+/* Find the result with the highest level enabled. Check for enable_fbc_wm in
+ * case both are at the same level. Prefer r1 in case they're the same. */
+struct hsw_wm_values *hsw_find_best_result(struct hsw_wm_values *r1,
+ struct hsw_wm_values *r2)
+{
+ int i, val_r1 = 0, val_r2 = 0;
+
+ for (i = 0; i < 3; i++) {
+ if (r1->wm_lp[i] & WM3_LP_EN)
+ val_r1 |= (1 << i);
+ if (r2->wm_lp[i] & WM3_LP_EN)
+ val_r2 |= (1 << i);
+ }
+
+ if (val_r1 == val_r2) {
+ if (r2->enable_fbc_wm && !r1->enable_fbc_wm)
+ return r2;
+ else
+ return r1;
+ } else if (val_r1 > val_r2) {
+ return r1;
+ } else {
+ return r2;
+ }
+}
+
/*
* The spec says we shouldn't write when we don't need, because every write
* causes WMs to be re-evaluated, expending some power.
@@ -2584,14 +2613,27 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
static void haswell_update_wm(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
- struct hsw_wm_maximums lp_max_1_2;
+ struct hsw_wm_maximums lp_max_1_2, lp_max_5_6;
struct hsw_pipe_wm_parameters params[3];
- struct hsw_wm_values results;
+ struct hsw_wm_values results_1_2, results_5_6, *best_results;
uint32_t wm[5];
+ enum hsw_data_buf_partitioning partitioning;
+
+ hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2, &lp_max_5_6);
+
+ hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results_1_2);
+ if (lp_max_1_2.pri != lp_max_5_6.pri) {
+ hsw_compute_wm_results(dev, params, wm, &lp_max_5_6,
+ &results_5_6);
+ best_results = hsw_find_best_result(&results_1_2, &results_5_6);
+ } else {
+ best_results = &results_1_2;
+ }
+
+ partitioning = (best_results == &results_1_2) ?
+ HSW_DATA_BUF_PART_1_2 : HSW_DATA_BUF_PART_5_6;
- hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
- hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
- hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
+ hsw_write_wm_values(dev_priv, best_results, partitioning);
}
static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
--
1.8.1.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 3/5] drm/i915: properly set HSW WM_PIPE registers
2013-05-24 14:59 ` [PATCH 3/5] drm/i915: properly set HSW WM_PIPE registers Paulo Zanoni
@ 2013-05-24 16:07 ` Ville Syrjälä
2013-05-24 22:00 ` Paulo Zanoni
0 siblings, 1 reply; 29+ messages in thread
From: Ville Syrjälä @ 2013-05-24 16:07 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni
On Fri, May 24, 2013 at 11:59:19AM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> We were previously calling sandybridge_update_wm on HSW, but the SNB
> function didn't really match the HSW specification, so we were just
> writing the wrong values.
>
> With this patch, the haswell_update_wm function will set the correct
> values for the WM_PIPE registers, but it will still keep all the LP
> watermarks disabled.
>
> The patch may look a little bit over-complicated for now, but it's
> because much of the infrastructure for setting the LP watermarks is
> already in place, so we won't have too much code churn on the patch
> that sets the LP watermarks.
>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
> drivers/gpu/drm/i915/i915_reg.h | 3 +
> drivers/gpu/drm/i915/intel_pm.c | 340 +++++++++++++++++++++++++++++++++++++---
> 2 files changed, 325 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 55caedb..e86606c 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -4938,6 +4938,9 @@
> #define SFUSE_STRAP_DDIC_DETECTED (1<<1)
> #define SFUSE_STRAP_DDID_DETECTED (1<<0)
>
> +#define WM_MISC 0x45260
> +#define WM_MISC_DATA_PARTITION_5_6 (1 << 0)
> +
> #define WM_DBG 0x45280
> #define WM_DBG_DISALLOW_MULTIPLE_LP (1<<0)
> #define WM_DBG_DISALLOW_MAXFIFO (1<<1)
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 0b61a0e..2ee1d01 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2072,19 +2072,173 @@ static void ivybridge_update_wm(struct drm_device *dev)
> cursor_wm);
> }
>
> -static void
> -haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> +static uint32_t hsw_wm_get_pixel_rate(struct drm_device *dev,
> + struct drm_crtc *crtc)
> +{
> + struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + uint32_t pixel_rate, pfit_size;
> +
> + if (intel_crtc->config.pixel_target_clock)
> + pixel_rate = intel_crtc->config.pixel_target_clock;
> + else
> + pixel_rate = intel_crtc->config.adjusted_mode.clock;
> +
> + /* We only use IF-ID interlacing. If we ever use PF-ID we'll need to
> + * adjust the pixel_rate here. */
> +
> + pfit_size = intel_crtc->config.pch_pfit.size;
> + if (pfit_size) {
> + uint64_t x, y, crtc_x, crtc_y, hscale, vscale, totscale;
> +
> + x = (pfit_size >> 16) & 0xFFFF;
> + y = pfit_size & 0xFFFF;
> + crtc_x = intel_crtc->config.adjusted_mode.hdisplay;
> + crtc_y = intel_crtc->config.adjusted_mode.vdisplay;
> +
> + hscale = crtc_x << 16;
> + vscale = crtc_y << 16;
> + do_div(hscale, x);
> + do_div(vscale, y);
> + hscale = (hscale < (1 << 16)) ? (1 << 16) : hscale;
> + vscale = (vscale < (1 << 16)) ? (1 << 16) : vscale;
> + totscale = (hscale * vscale) >> 16;
> + pixel_rate = (pixel_rate * totscale) >> 16;
No need for fixed-point math if you go 64-bit, and as stated before
the scaling ratio is still being miscalculated due to the use of
adjusted_mode.
Something like this ought to do it:
in_w = req_mode.hdisplay;
in_h = req_mode.vdisplay;
out_w = (pfit_size >> 16) & 0xffff;
out_h = pfit_size & 0xffff;
if (in_w <= out_w)
in_w = out_w;
if (in_h <= out_h)
in_h = out_h;
pixel_rate = div_u64((uint64_t) pixel_rate * in_w * in_h, out_w * out_h);
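For reference, the same calculation as a self-contained userspace sketch. The helper name scale_pixel_rate and its signature are assumptions made for illustration, and plain 64-bit division stands in for the kernel's div_u64():

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: scale a pixel rate by the panel fitter's
 * downscaling ratio with a single 64-bit division.
 * in_w/in_h   - requested (source) mode size
 * out_w/out_h - panel fitter output size, must be non-zero
 */
static uint32_t scale_pixel_rate(uint32_t pixel_rate,
                                 uint32_t in_w, uint32_t in_h,
                                 uint32_t out_w, uint32_t out_h)
{
    assert(out_w != 0 && out_h != 0);

    /* Upscaling does not lower the fetch rate, so clamp each ratio to >= 1. */
    if (in_w < out_w)
        in_w = out_w;
    if (in_h < out_h)
        in_h = out_h;

    /* Widen before multiplying so the product is computed in 64 bits. */
    return (uint32_t)(((uint64_t)pixel_rate * in_w * in_h) /
                      ((uint64_t)out_w * out_h));
}
```

With equal input and output sizes the rate is returned unchanged, and a 2x downscale in both directions multiplies it by four.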
> + }
> +
> + return pixel_rate;
> +}
> +
> +static uint32_t hsw_wm_method1(uint32_t pixel_rate, uint8_t bytes_per_pixel,
> + uint32_t latency)
> +{
> + uint64_t tmp;
> + uint32_t ret;
> +
> + tmp = pixel_rate * bytes_per_pixel * latency;
Would need a cast to make the multiplications actually 64bit. 'ret' is
also pointless.
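A minimal userspace sketch of the difference, assuming a 32-bit unsigned int. The function names are made up for illustration, and div_round_up_u64 is a stand-in for DIV_ROUND_UP_ULL:

```c
#include <assert.h>
#include <stdint.h>

/* Round-up division on 64-bit operands; d must be non-zero. */
static uint64_t div_round_up_u64(uint64_t n, uint64_t d)
{
    assert(d != 0);
    return (n + d - 1) / d;
}

/* As posted: the product is evaluated in 32 bits and wraps before it is
 * widened by the assignment to tmp. */
static uint32_t wm_method1_wrapping(uint32_t pixel_rate,
                                    uint8_t bytes_per_pixel, uint32_t latency)
{
    uint64_t tmp = pixel_rate * bytes_per_pixel * latency;

    return (uint32_t)(div_round_up_u64(tmp, 64 * 10000) + 2);
}

/* With the cast: the first operand is 64-bit, so the whole product is. */
static uint32_t wm_method1(uint32_t pixel_rate,
                           uint8_t bytes_per_pixel, uint32_t latency)
{
    uint64_t tmp = (uint64_t)pixel_rate * bytes_per_pixel * latency;

    return (uint32_t)(div_round_up_u64(tmp, 64 * 10000) + 2);
}
```

The two agree for small inputs and diverge once the product exceeds 32 bits, for example with a pixel rate of 600000, 4 bytes per pixel and a latency of 2000 (product 4.8e9).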
> + ret = DIV_ROUND_UP_ULL(tmp, 64 * 10000) + 2;
> +
> + return ret;
> +}
> +
> +static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
> + uint32_t horiz_pixels, uint8_t bytes_per_pixel,
> + uint32_t latency)
> +{
> + uint32_t ret;
> +
> + ret = DIV_ROUND_UP(pipe_htotal * 1000, pixel_rate);
> + ret = ((latency / (ret * 10)) + 1) * horiz_pixels * bytes_per_pixel;
w/ 64bit maths this could be:
tmp = (uint64_t) latency * pixel_rate * 100;
ret = (div_u64(tmp, pipe_htotal) + 1) * horiz_pixels * bytes_per_pixel
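A self-contained sketch of method 2 in 64-bit arithmetic. It assumes pixel_rate is in kHz and latency is in 0.1 us units, which is what the posted 32-bit code implies (line time in us = pipe_htotal * 1000 / pixel_rate, then latency / (line time * 10)). Under those assumed units the divisor works out to pipe_htotal * 10000, so the scale constant in any 64-bit rewrite is worth double-checking against the units actually in use:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative userspace version; names mirror the patch but this is a
 * sketch under the unit assumptions stated above, not the driver code. */
static uint32_t wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
                           uint32_t horiz_pixels, uint8_t bytes_per_pixel,
                           uint32_t latency)
{
    uint64_t lines, bytes;

    assert(pipe_htotal != 0);

    /* Whole lines scanned out during the latency period (rounded down). */
    lines = ((uint64_t)latency * pixel_rate) /
            ((uint64_t)pipe_htotal * 10000);

    /* One extra line of margin, converted to bytes for this plane. */
    bytes = (lines + 1) * horiz_pixels * bytes_per_pixel;

    /* Round up to 64-byte units and add 2, as in the posted code. */
    return (uint32_t)((bytes + 63) / 64 + 2);
}
```

For a 1920-pixel-wide, 4-byte-per-pixel plane with htotal 2200 at 148500 kHz, a latency of 200 covers one full line and a latency of 20 covers none.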
> + ret = DIV_ROUND_UP(ret, 64) + 2;
> + return ret;
> +}
> +
> +struct hsw_pipe_wm_parameters {
> + bool active;
> + bool sprite_enabled;
> + uint8_t pri_bytes_per_pixel;
> + uint8_t spr_bytes_per_pixel;
> + uint8_t cur_bytes_per_pixel;
> + uint32_t pri_horiz_pixels;
> + uint32_t spr_horiz_pixels;
> + uint32_t cur_horiz_pixels;
> + uint32_t pipe_htotal;
> + uint32_t pixel_rate;
> +};
> +
> +struct hsw_wm_values {
> + uint32_t wm_pipe[3];
> + uint32_t wm_lp[3];
> + uint32_t wm_lp_spr[3];
> + uint32_t wm_linetime[3];
> +};
> +
> +enum hsw_data_buf_partitioning {
> + HSW_DATA_BUF_PART_1_2,
> + HSW_DATA_BUF_PART_5_6,
> +};
> +
> +/* Only for WM_PIPE. */
> +static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
> + uint32_t mem_value)
> +{
> + /* TODO: for now, assume the primary plane is always enabled. */
> + if (!params->active)
> + return 0;
> +
> + return hsw_wm_method1(params->pixel_rate,
> + params->pri_bytes_per_pixel,
> + mem_value);
> +}
> +
> +/* For both WM_PIPE and WM_LP. */
> +static uint32_t hsw_compute_spr_wm(struct hsw_pipe_wm_parameters *params,
> + uint32_t mem_value)
> +{
> + uint32_t method1, method2;
> +
> + if (!params->active || !params->sprite_enabled)
> + return 0;
> +
> + method1 = hsw_wm_method1(params->pixel_rate,
> + params->spr_bytes_per_pixel,
> + mem_value);
> + method2 = hsw_wm_method2(params->pixel_rate,
> + params->pipe_htotal,
> + params->spr_horiz_pixels,
> + params->spr_bytes_per_pixel,
> + mem_value);
> + return min(method1, method2);
> +}
> +
> +/* For both WM_PIPE and WM_LP. */
> +static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
> + uint32_t mem_value)
> +{
> + if (!params->active)
> + return 0;
> +
> + return hsw_wm_method2(params->pixel_rate,
> + params->pipe_htotal,
> + params->cur_horiz_pixels,
> + params->cur_bytes_per_pixel,
> + mem_value);
> +}
> +
> +static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
> + uint32_t mem_value, enum pipe pipe,
> + struct hsw_pipe_wm_parameters *params)
> +{
> + uint32_t pri_val, cur_val, spr_val;
> +
> + pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
> + spr_val = hsw_compute_spr_wm(params, mem_value);
> + cur_val = hsw_compute_cur_wm(params, mem_value);
> +
> + WARN(pri_val > 127,
> + "Primary WM error, mode not supported for pipe %c\n",
> + pipe_name(pipe));
> + WARN(spr_val > 127,
> + "Sprite WM error, mode not supported for pipe %c\n",
> + pipe_name(pipe));
> + WARN(cur_val > 63,
> + "Cursor WM error, mode not supported for pipe %c\n",
> + pipe_name(pipe));
> +
> + return (pri_val << WM0_PIPE_PLANE_SHIFT) |
> + (spr_val << WM0_PIPE_SPRITE_SHIFT) |
> + cur_val;
> +}
> +
> +static uint32_t
> +hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> - enum pipe pipe = intel_crtc->pipe;
> struct drm_display_mode *mode = &intel_crtc->config.adjusted_mode;
> u32 linetime, ips_linetime;
>
> - if (!intel_crtc_active(crtc)) {
> - I915_WRITE(PIPE_WM_LINETIME(pipe), 0);
> - return;
> - }
> + if (!intel_crtc_active(crtc))
> + return 0;
>
> /* The WM are computed with base on how long it takes to fill a single
> * row at the given clock rate, multiplied by 8.
> @@ -2093,29 +2247,179 @@ haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> ips_linetime = DIV_ROUND_CLOSEST(mode->htotal * 1000 * 8,
> intel_ddi_get_cdclk_freq(dev_priv));
>
> - I915_WRITE(PIPE_WM_LINETIME(pipe),
> - PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
> - PIPE_WM_LINETIME_TIME(linetime));
> + return PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
> + PIPE_WM_LINETIME_TIME(linetime);
> }
>
> -static void haswell_update_wm(struct drm_device *dev)
> +static void hsw_compute_wm_parameters(struct drm_device *dev,
> + struct hsw_pipe_wm_parameters *params,
> + uint32_t *wm)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_crtc *crtc;
> + struct drm_plane *plane;
> + uint64_t sskpd = I915_READ64(MCH_SSKPD);
> enum pipe pipe;
>
> - /* Disable the LP WMs before changine the linetime registers. This is
> - * just a temporary code that will be replaced soon. */
> - I915_WRITE(WM3_LP_ILK, 0);
> - I915_WRITE(WM2_LP_ILK, 0);
> - I915_WRITE(WM1_LP_ILK, 0);
> + if ((sskpd >> 56) & 0xFF)
> + wm[0] = (sskpd >> 56) & 0xFF;
> + else
> + wm[0] = sskpd & 0xF;
> + wm[1] = ((sskpd >> 4) & 0xFF) * 5;
> + wm[2] = ((sskpd >> 12) & 0xFF) * 5;
> + wm[3] = ((sskpd >> 20) & 0x1FF) * 5;
> + wm[4] = ((sskpd >> 32) & 0x1FF) * 5;
> +
> + list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
> + struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + struct hsw_pipe_wm_parameters *p;
> +
> + pipe = intel_crtc->pipe;
> + p = &params[pipe];

> +
> + p->active = intel_crtc_active(crtc);
> + if (!p->active)
> + continue;
> +
> + p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
> + p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
> + p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
> + p->cur_bytes_per_pixel = 4;
> + p->pri_horiz_pixels = intel_crtc->config.adjusted_mode.hdisplay;
> + p->cur_horiz_pixels = 64;
> + }
> +
> + list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
> + struct intel_plane *intel_plane = to_intel_plane(plane);
> + struct hsw_pipe_wm_parameters *p;
> +
> + pipe = intel_plane->pipe;
> + p = &params[pipe];
> +
> + p->sprite_enabled = intel_plane->wm.enable;
> + p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
> + p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
> + }
> +}
> +
> +static void hsw_compute_wm_results(struct drm_device *dev,
> + struct hsw_pipe_wm_parameters *params,
> + uint32_t *wm,
> + struct hsw_wm_values *results)
> +{
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_crtc *crtc;
> + enum pipe pipe;
> +
> + /* No support for LP WMs yet. */
> + results->wm_lp[2] = 0;
> + results->wm_lp[1] = 0;
> + results->wm_lp[0] = 0;
> + results->wm_lp_spr[2] = 0;
> + results->wm_lp_spr[1] = 0;
> + results->wm_lp_spr[0] = 0;
> +
> + for_each_pipe(pipe)
> + results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
> + pipe,
> + &params[pipe]);
>
> for_each_pipe(pipe) {
> crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> - haswell_update_linetime_wm(dev, crtc);
> + results->wm_linetime[pipe] = hsw_compute_linetime_wm(dev, crtc);
> }
> +}
> +
> +/*
> + * The spec says we shouldn't write when we don't need, because every write
> + * causes WMs to be re-evaluated, expending some power.
> + */
> +static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> + struct hsw_wm_values *results,
> + enum hsw_data_buf_partitioning partitioning)
> +{
> + struct hsw_wm_values previous;
> + uint32_t val;
> + enum hsw_data_buf_partitioning prev_partitioning;
> +
> + previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
> + previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
> + previous.wm_pipe[2] = I915_READ(WM0_PIPEC_IVB);
> + previous.wm_lp[0] = I915_READ(WM1_LP_ILK);
> + previous.wm_lp[1] = I915_READ(WM2_LP_ILK);
> + previous.wm_lp[2] = I915_READ(WM3_LP_ILK);
> + previous.wm_lp_spr[0] = I915_READ(WM1S_LP_ILK);
> + previous.wm_lp_spr[1] = I915_READ(WM2S_LP_IVB);
> + previous.wm_lp_spr[2] = I915_READ(WM3S_LP_IVB);
> + previous.wm_linetime[0] = I915_READ(PIPE_WM_LINETIME(PIPE_A));
> + previous.wm_linetime[1] = I915_READ(PIPE_WM_LINETIME(PIPE_B));
> + previous.wm_linetime[2] = I915_READ(PIPE_WM_LINETIME(PIPE_C));
> +
> + prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
> + HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
> +
> + if (memcmp(results->wm_pipe, previous.wm_pipe, 3) == 0 &&
> + memcmp(results->wm_lp, previous.wm_lp, 3) == 0 &&
> + memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) == 0 &&
> + memcmp(results->wm_linetime, previous.wm_linetime, 3) == 0 &&
> + partitioning == prev_partitioning)
> + return;
> +
> + if (previous.wm_lp[2] != 0)
> + I915_WRITE(WM3_LP_ILK, 0);
> + if (previous.wm_lp[1] != 0)
> + I915_WRITE(WM2_LP_ILK, 0);
> + if (previous.wm_lp[0] != 0)
> + I915_WRITE(WM1_LP_ILK, 0);
I don't know if this conditional writing makes sense at such a fine
granularity. We're going to write some of the registers anyway, so
maybe it's better to just go ahead and write all of them. It would
at least make the code look a bit better.
In any case you'd at least need to make sure that you disable/re-enable
the LP1+ watermarks if linetime WMs or DDB partitioning changes,
regardless of whether the LP1+ watermarks themselves changed.
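The condition described above could be sketched as a pair of predicates. struct wm_state and both function names are hypothetical, chosen only for this illustration. Since memcmp() takes a size in bytes, the sketch compares whole arrays with sizeof:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical snapshot of the watermark registers (not the driver's struct). */
struct wm_state {
    uint32_t wm_pipe[3];
    uint32_t wm_lp[3];
    uint32_t wm_lp_spr[3];
    uint32_t wm_linetime[3];
    bool partitioning_5_6;
};

/* True when the LP1+ watermarks have to be disabled before the update and
 * re-enabled afterwards: their own values, the linetime values, or the
 * data buffer partitioning differ. */
static bool wm_lp_needs_toggle(const struct wm_state *prev,
                               const struct wm_state *next)
{
    assert(prev && next);

    return memcmp(prev->wm_lp, next->wm_lp, sizeof(prev->wm_lp)) != 0 ||
           memcmp(prev->wm_lp_spr, next->wm_lp_spr,
                  sizeof(prev->wm_lp_spr)) != 0 ||
           memcmp(prev->wm_linetime, next->wm_linetime,
                  sizeof(prev->wm_linetime)) != 0 ||
           prev->partitioning_5_6 != next->partitioning_5_6;
}

/* True when any value differs at all, i.e. a write pass is needed. */
static bool wm_needs_write(const struct wm_state *prev,
                           const struct wm_state *next)
{
    return wm_lp_needs_toggle(prev, next) ||
           memcmp(prev->wm_pipe, next->wm_pipe, sizeof(prev->wm_pipe)) != 0;
}
```

A caller would read the current register values into prev, return early when wm_needs_write() is false, and otherwise zero the LP registers first only if wm_lp_needs_toggle() is true.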
> +
> + if (previous.wm_pipe[0] != results->wm_pipe[0])
> + I915_WRITE(WM0_PIPEA_ILK, results->wm_pipe[0]);
> + if (previous.wm_pipe[1] != results->wm_pipe[1])
> + I915_WRITE(WM0_PIPEB_ILK, results->wm_pipe[1]);
> + if (previous.wm_pipe[2] != results->wm_pipe[2])
> + I915_WRITE(WM0_PIPEC_IVB, results->wm_pipe[2]);
> +
> + if (previous.wm_linetime[0] != results->wm_linetime[0])
> + I915_WRITE(PIPE_WM_LINETIME(PIPE_A), results->wm_linetime[0]);
> + if (previous.wm_linetime[1] != results->wm_linetime[1])
> + I915_WRITE(PIPE_WM_LINETIME(PIPE_B), results->wm_linetime[1]);
> + if (previous.wm_linetime[2] != results->wm_linetime[2])
> + I915_WRITE(PIPE_WM_LINETIME(PIPE_C), results->wm_linetime[2]);
> +
> + if (prev_partitioning != partitioning) {
> + val = I915_READ(WM_MISC);
> + if (partitioning == HSW_DATA_BUF_PART_1_2)
> + val &= ~WM_MISC_DATA_PARTITION_5_6;
> + else
> + val |= WM_MISC_DATA_PARTITION_5_6;
> + I915_WRITE(WM_MISC, val);
> + }
> +
> + if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
> + I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
> + if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
> + I915_WRITE(WM2S_LP_IVB, results->wm_lp_spr[1]);
> + if (previous.wm_lp_spr[2] != results->wm_lp_spr[2])
> + I915_WRITE(WM3S_LP_IVB, results->wm_lp_spr[2]);
> +
> + if (results->wm_lp[0] != 0)
> + I915_WRITE(WM1_LP_ILK, results->wm_lp[0]);
> + if (results->wm_lp[1] != 0)
> + I915_WRITE(WM2_LP_ILK, results->wm_lp[1]);
> + if (results->wm_lp[2] != 0)
> + I915_WRITE(WM3_LP_ILK, results->wm_lp[2]);
> +}
> +
> +static void haswell_update_wm(struct drm_device *dev)
> +{
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + struct hsw_pipe_wm_parameters params[3];
> + struct hsw_wm_values results;
> + uint32_t wm[5];
>
> - sandybridge_update_wm(dev);
> + hsw_compute_wm_parameters(dev, params, wm);
> + hsw_compute_wm_results(dev, params, wm, &results);
> + hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
> }
>
> static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
> --
> 1.8.1.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Ville Syrjälä
Intel OTC
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/5] drm/i915: properly set HSW WM_LP watermarks
2013-05-24 14:59 ` [PATCH 4/5] drm/i915: properly set HSW WM_LP watermarks Paulo Zanoni
@ 2013-05-24 16:11 ` Ville Syrjälä
2013-05-24 22:05 ` Paulo Zanoni
0 siblings, 1 reply; 29+ messages in thread
From: Ville Syrjälä @ 2013-05-24 16:11 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni
On Fri, May 24, 2013 at 11:59:20AM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> We were previously only setting the WM_PIPE registers, now we are
> setting the LP watermark registers. This should allow deeper PC
> states, resulting in power savings.
>
> We're only using 1/2 data buffer partitioning for now.
>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
> drivers/gpu/drm/i915/i915_reg.h | 4 +
> drivers/gpu/drm/i915/intel_pm.c | 194 +++++++++++++++++++++++++++++++++++++---
> 2 files changed, 187 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index e86606c..58230ea 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -3057,6 +3057,10 @@
> #define WM3S_LP_IVB 0x45128
> #define WM1S_LP_EN (1<<31)
>
> +#define HSW_WM_LP_VAL(lat, fbc, pri, cur) \
> + (WM3_LP_EN | ((lat) << WM1_LP_LATENCY_SHIFT) | \
> + ((fbc) << WM1_LP_FBC_SHIFT) | ((pri) << WM1_LP_SR_SHIFT) | (cur))
> +
> /* Memory latency timer register */
> #define MLTR_ILK 0x11222
> #define MLTR_WM1_SHIFT 0
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 2ee1d01..9f9eb48 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2132,6 +2132,12 @@ static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
> return ret;
> }
>
> +static uint32_t hsw_wm_fbc(uint32_t pri_val, uint32_t horiz_pixels,
> + uint8_t bytes_per_pixel)
> +{
> + return DIV_ROUND_UP(pri_val * 64, horiz_pixels * bytes_per_pixel) + 2;
> +}
> +
> struct hsw_pipe_wm_parameters {
> bool active;
> bool sprite_enabled;
> @@ -2145,11 +2151,28 @@ struct hsw_pipe_wm_parameters {
> uint32_t pixel_rate;
> };
>
> +struct hsw_wm_maximums {
> + uint16_t pri;
> + uint16_t spr;
> + uint16_t cur;
> + uint16_t fbc;
> +};
> +
> +struct hsw_lp_wm_result {
> + bool enable;
> + bool fbc_enable;
> + uint32_t pri_val;
> + uint32_t spr_val;
> + uint32_t cur_val;
> + uint32_t fbc_val;
> +};
> +
> struct hsw_wm_values {
> uint32_t wm_pipe[3];
> uint32_t wm_lp[3];
> uint32_t wm_lp_spr[3];
> uint32_t wm_linetime[3];
> + bool enable_fbc_wm;
> };
>
> enum hsw_data_buf_partitioning {
> @@ -2170,6 +2193,27 @@ static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
> mem_value);
> }
>
> +/* Only for WM_LP. */
> +static uint32_t hsw_compute_pri_wm_lp(struct hsw_pipe_wm_parameters *params,
> + uint32_t mem_value)
> +{
> + uint32_t method1, method2;
> +
> + /* TODO: for now, assume the primary plane is always enabled. */
> + if (!params->active)
> + return 0;
> +
> + method1 = hsw_wm_method1(params->pixel_rate,
> + params->pri_bytes_per_pixel,
> + mem_value);
> + method2 = hsw_wm_method2(params->pixel_rate,
> + params->pipe_htotal,
> + params->pri_horiz_pixels,
> + params->pri_bytes_per_pixel,
> + mem_value);
> + return min(method1, method2);
You could pass the level as parameter to hsw_compute_pri_wm and choose
the correct method inside that function. But that's a minor complaint
and we can improve things later.
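A rough sketch of what that could look like, as standalone userspace code. struct pipe_params, compute_pri_wm and the level convention (0 for WM_PIPE, 1 and up for WM_LP) are assumptions for illustration, and the two method helpers assume pixel_rate in kHz and latency in 0.1 us units:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down per-pipe parameters. */
struct pipe_params {
    bool active;
    uint32_t pixel_rate;
    uint32_t pipe_htotal;
    uint32_t pri_horiz_pixels;
    uint8_t pri_bytes_per_pixel;
};

/* Method 1: bytes fetched during the latency, in 64-byte units, plus 2. */
static uint32_t wm_method1(uint32_t pixel_rate, uint8_t bpp, uint32_t latency)
{
    uint64_t tmp = (uint64_t)pixel_rate * bpp * latency;

    return (uint32_t)((tmp + 64 * 10000 - 1) / (64 * 10000) + 2);
}

/* Method 2: whole lines covered by the latency plus one, in 64-byte units,
 * plus 2. */
static uint32_t wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
                           uint32_t horiz_pixels, uint8_t bpp,
                           uint32_t latency)
{
    uint64_t lines, bytes;

    assert(pipe_htotal != 0);
    lines = ((uint64_t)latency * pixel_rate) /
            ((uint64_t)pipe_htotal * 10000);
    bytes = (lines + 1) * horiz_pixels * bpp;
    return (uint32_t)((bytes + 63) / 64 + 2);
}

/* Level 0 (WM_PIPE) uses method 1 only; LP levels take the smaller of the
 * two methods, matching the two separate functions in the patch. */
static uint32_t compute_pri_wm(const struct pipe_params *p,
                               unsigned int level, uint32_t latency)
{
    uint32_t method1, method2;

    if (!p->active)
        return 0;

    method1 = wm_method1(p->pixel_rate, p->pri_bytes_per_pixel, latency);
    if (level == 0)
        return method1;

    method2 = wm_method2(p->pixel_rate, p->pipe_htotal, p->pri_horiz_pixels,
                         p->pri_bytes_per_pixel, latency);
    return method1 < method2 ? method1 : method2;
}
```

This keeps a single entry point for the primary plane, so callers only pass the level they are computing instead of picking between two functions.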
> +}
> +
> /* For both WM_PIPE and WM_LP. */
> static uint32_t hsw_compute_spr_wm(struct hsw_pipe_wm_parameters *params,
> uint32_t mem_value)
> @@ -2204,6 +2248,52 @@ static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
> mem_value);
> }
>
> +/* Only for WM_LP. */
> +static uint32_t hsw_compute_fbc_wm(struct hsw_pipe_wm_parameters *params,
> + uint32_t pri_val,
> + uint32_t mem_value)
> +{
> + if (!params->active)
> + return 0;
> +
> + return hsw_wm_fbc(pri_val,
> + params->pri_horiz_pixels,
> + params->pri_bytes_per_pixel);
> +}
> +
> +static void hsw_compute_lp_wm(uint32_t mem_value, struct hsw_wm_maximums *max,
> + struct hsw_pipe_wm_parameters *params,
> + struct hsw_lp_wm_result *result)
> +{
> + enum pipe pipe;
> + uint32_t pri_val[3], spr_val[3], cur_val[3], fbc_val[3];
> +
> + for (pipe = PIPE_A; pipe <= PIPE_C; pipe++) {
> + struct hsw_pipe_wm_parameters *p = &params[pipe];
> +
> + pri_val[pipe] = hsw_compute_pri_wm_lp(p, mem_value);
> + spr_val[pipe] = hsw_compute_spr_wm(p, mem_value);
> + cur_val[pipe] = hsw_compute_cur_wm(p, mem_value);
> + fbc_val[pipe] = hsw_compute_fbc_wm(p, pri_val[pipe], mem_value);
> + }
> +
> + result->pri_val = max3(pri_val[0], pri_val[1], pri_val[2]);
> + result->spr_val = max3(spr_val[0], spr_val[1], spr_val[2]);
> + result->cur_val = max3(cur_val[0], cur_val[1], cur_val[2]);
> + result->fbc_val = max3(fbc_val[0], fbc_val[1], fbc_val[2]);
> +
> + if (result->fbc_val > max->fbc) {
> + result->fbc_enable = false;
> + result->fbc_val = 0;
> + } else {
> + result->fbc_enable = true;
> + }
> +
> + result->enable = result->pri_val <= max->pri &&
> + result->spr_val <= max->spr &&
> + result->cur_val <= max->cur;
> +}
> +
> static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
> uint32_t mem_value, enum pipe pipe,
> struct hsw_pipe_wm_parameters *params)
> @@ -2253,13 +2343,15 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
>
> static void hsw_compute_wm_parameters(struct drm_device *dev,
> struct hsw_pipe_wm_parameters *params,
> - uint32_t *wm)
> + uint32_t *wm,
> + struct hsw_wm_maximums *lp_max_1_2)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_crtc *crtc;
> struct drm_plane *plane;
> uint64_t sskpd = I915_READ64(MCH_SSKPD);
> enum pipe pipe;
> + int pipes_active = 0, sprites_enabled = 0;
>
> if ((sskpd >> 56) & 0xFF)
> wm[0] = (sskpd >> 56) & 0xFF;
> @@ -2281,6 +2373,8 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
> if (!p->active)
> continue;
>
> + pipes_active++;
> +
> p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
> p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
> p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
> @@ -2299,25 +2393,89 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
> p->sprite_enabled = intel_plane->wm.enable;
> p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
> p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
> +
> + if (p->sprite_enabled)
> + sprites_enabled++;
> + }
> +
> + if (pipes_active > 1) {
> + lp_max_1_2->pri = sprites_enabled ? 128 : 256;
> + lp_max_1_2->spr = 128;
> + lp_max_1_2->cur = 64;
> + } else {
> + lp_max_1_2->pri = sprites_enabled ? 384 : 768;
> + lp_max_1_2->spr = 384;
> + lp_max_1_2->cur = 255;
> }
> + lp_max_1_2->fbc = 15;
> }
>
> static void hsw_compute_wm_results(struct drm_device *dev,
> struct hsw_pipe_wm_parameters *params,
> uint32_t *wm,
> + struct hsw_wm_maximums *lp_maximums,
> struct hsw_wm_values *results)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_crtc *crtc;
> + struct hsw_lp_wm_result lp_results[4];
> enum pipe pipe;
> + int i;
> +
> + hsw_compute_lp_wm(wm[1], lp_maximums, params, &lp_results[0]);
> + hsw_compute_lp_wm(wm[2], lp_maximums, params, &lp_results[1]);
> + hsw_compute_lp_wm(wm[3], lp_maximums, params, &lp_results[2]);
> + hsw_compute_lp_wm(wm[4], lp_maximums, params, &lp_results[3]);
> +
> + /* The spec says it is preferred to disable FBC WMs instead of disabling
> + * a WM level. */
> + results->enable_fbc_wm = true;
> + for (i = 0; i < 4; i++) {
> + if (lp_results[i].enable && !lp_results[i].fbc_enable) {
> + results->enable_fbc_wm = false;
> + break;
> + }
> + }
> +
> + if (lp_results[3].enable) {
> + results->wm_lp[2] = HSW_WM_LP_VAL(8, lp_results[3].fbc_val,
> + lp_results[3].pri_val,
> + lp_results[3].cur_val);
> + results->wm_lp_spr[2] = lp_results[3].spr_val;
> + } else if (lp_results[2].enable) {
> + results->wm_lp[2] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
> + lp_results[2].pri_val,
> + lp_results[2].cur_val);
> + results->wm_lp_spr[2] = lp_results[2].spr_val;
> + } else {
> + results->wm_lp[2] = 0;
> + results->wm_lp_spr[2] = 0;
> + }
> +
> + if (lp_results[3].enable && lp_results[2].enable) {
> + results->wm_lp[1] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
> + lp_results[2].pri_val,
> + lp_results[2].cur_val);
> + results->wm_lp_spr[1] = lp_results[2].spr_val;
> + } else if (!lp_results[3].enable && lp_results[1].enable) {
> + results->wm_lp[1] = HSW_WM_LP_VAL(4, lp_results[1].fbc_val,
> + lp_results[1].pri_val,
> + lp_results[1].cur_val);
> + results->wm_lp_spr[1] = lp_results[1].spr_val;
> + } else {
> + results->wm_lp[1] = 0;
> + results->wm_lp_spr[1] = 0;
> + }
>
> - /* No support for LP WMs yet. */
> - results->wm_lp[2] = 0;
> - results->wm_lp[1] = 0;
> - results->wm_lp[0] = 0;
> - results->wm_lp_spr[2] = 0;
> - results->wm_lp_spr[1] = 0;
> - results->wm_lp_spr[0] = 0;
> + if (lp_results[0].enable) {
> + results->wm_lp[0] = HSW_WM_LP_VAL(2, lp_results[0].fbc_val,
> + lp_results[0].pri_val,
> + lp_results[0].cur_val);
> + results->wm_lp_spr[0] = lp_results[0].spr_val;
> + } else {
> + results->wm_lp[0] = 0;
> + results->wm_lp_spr[0] = 0;
> + }
>
> for_each_pipe(pipe)
> results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
> @@ -2341,6 +2499,7 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> struct hsw_wm_values previous;
> uint32_t val;
> enum hsw_data_buf_partitioning prev_partitioning;
> + bool prev_enable_fbc_wm;
>
> previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
> previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
> @@ -2358,11 +2517,14 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
> HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
>
> + prev_enable_fbc_wm = !(I915_READ(DISP_ARB_CTL) & DISP_FBC_WM_DIS);
> +
> if (memcmp(results->wm_pipe, previous.wm_pipe, 3) == 0 &&
> memcmp(results->wm_lp, previous.wm_lp, 3) == 0 &&
> memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) == 0 &&
> memcmp(results->wm_linetime, previous.wm_linetime, 3) == 0 &&
> - partitioning == prev_partitioning)
> + partitioning == prev_partitioning &&
> + results->enable_fbc_wm == prev_enable_fbc_wm)
> return;
>
> if (previous.wm_lp[2] != 0)
> @@ -2395,6 +2557,15 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> I915_WRITE(WM_MISC, val);
> }
>
> + if (prev_enable_fbc_wm != results->enable_fbc_wm) {
> + val = I915_READ(DISP_ARB_CTL);
> + if (results->enable_fbc_wm)
> + val &= ~DISP_FBC_WM_DIS;
> + else
> + val |= DISP_FBC_WM_DIS;
> + I915_WRITE(DISP_ARB_CTL, val);
> + }
> +
> if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
> I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
> if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
> @@ -2413,12 +2584,13 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> static void haswell_update_wm(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> + struct hsw_wm_maximums lp_max_1_2;
> struct hsw_pipe_wm_parameters params[3];
> struct hsw_wm_values results;
> uint32_t wm[5];
>
> - hsw_compute_wm_parameters(dev, params, wm);
> - hsw_compute_wm_results(dev, params, wm, &results);
> + hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
> + hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
> hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
> }
>
> --
> 1.8.1.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Ville Syrjälä
Intel OTC
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/5] drm/i915: add "enable" argument to intel_update_sprite_watermarks
2013-05-24 14:59 ` [PATCH 1/5] drm/i915: add "enable" argument to intel_update_sprite_watermarks Paulo Zanoni
@ 2013-05-24 16:22 ` Ville Syrjälä
0 siblings, 0 replies; 29+ messages in thread
From: Ville Syrjälä @ 2013-05-24 16:22 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni
On Fri, May 24, 2013 at 11:59:17AM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> Because we want to call it from the "sprite disable" paths, since on
> Haswell we need to update the sprite watermarks when we disable
> sprites.
>
> For now, all this patch does is to add the "enable" argument and call
> intel_update_sprite_watermarks from inside ivb_disable_plane. This
> shouldn't change how the code behaves because on
> sandybridge_update_sprite_wm we just ignore the "!enable" case. The
> patches that implement Haswell watermarks will make use of the changes
> introduced by this patch.
>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Looks all right.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 3 ++-
> drivers/gpu/drm/i915/intel_drv.h | 2 +-
> drivers/gpu/drm/i915/intel_pm.c | 11 ++++++++---
> drivers/gpu/drm/i915/intel_sprite.c | 8 +++++---
> 4 files changed, 16 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 7772bb6..e38f8d3 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -315,7 +315,8 @@ struct drm_i915_display_funcs {
> int (*get_fifo_size)(struct drm_device *dev, int plane);
> void (*update_wm)(struct drm_device *dev);
> void (*update_sprite_wm)(struct drm_device *dev, int pipe,
> - uint32_t sprite_width, int pixel_size);
> + uint32_t sprite_width, int pixel_size,
> + bool enable);
> void (*modeset_global_resources)(struct drm_device *dev);
> /* Returns the active state of the crtc, and if the crtc is active,
> * fills out the pipe-config with the hw state. */
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 75a7f22..21427aa 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -731,7 +731,7 @@ extern void intel_ddi_init(struct drm_device *dev, enum port port);
> extern void intel_update_watermarks(struct drm_device *dev);
> extern void intel_update_sprite_watermarks(struct drm_device *dev, int pipe,
> uint32_t sprite_width,
> - int pixel_size);
> + int pixel_size, bool enable);
>
> extern unsigned long intel_gen4_compute_page_offset(int *x, int *y,
> unsigned int tiling_mode,
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index e198f38..3ebb8e9 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2195,7 +2195,8 @@ sandybridge_compute_sprite_srwm(struct drm_device *dev, int plane,
> }
>
> static void sandybridge_update_sprite_wm(struct drm_device *dev, int pipe,
> - uint32_t sprite_width, int pixel_size)
> + uint32_t sprite_width, int pixel_size,
> + bool enable)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> int latency = SNB_READ_WM0_LATENCY() * 100; /* In unit 0.1us */
> @@ -2203,6 +2204,9 @@ static void sandybridge_update_sprite_wm(struct drm_device *dev, int pipe,
> int sprite_wm, reg;
> int ret;
>
> + if (!enable)
> + return;
> +
> switch (pipe) {
> case 0:
> reg = WM0_PIPEA_ILK;
> @@ -2314,13 +2318,14 @@ void intel_update_watermarks(struct drm_device *dev)
> }
>
> void intel_update_sprite_watermarks(struct drm_device *dev, int pipe,
> - uint32_t sprite_width, int pixel_size)
> + uint32_t sprite_width, int pixel_size,
> + bool enable)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
>
> if (dev_priv->display.update_sprite_wm)
> dev_priv->display.update_sprite_wm(dev, pipe, sprite_width,
> - pixel_size);
> + pixel_size, enable);
> }
>
> static struct drm_i915_gem_object *
> diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> index 19b9cb9..04d38d4 100644
> --- a/drivers/gpu/drm/i915/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> @@ -114,7 +114,7 @@ vlv_update_plane(struct drm_plane *dplane, struct drm_framebuffer *fb,
> crtc_w--;
> crtc_h--;
>
> - intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size);
> + intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size, true);
>
> I915_WRITE(SPSTRIDE(pipe, plane), fb->pitches[0]);
> I915_WRITE(SPPOS(pipe, plane), (crtc_y << 16) | crtc_x);
> @@ -268,7 +268,7 @@ ivb_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
> crtc_w--;
> crtc_h--;
>
> - intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size);
> + intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size, true);
>
> /*
> * IVB workaround: must disable low power watermarks for at least
> @@ -335,6 +335,8 @@ ivb_disable_plane(struct drm_plane *plane)
>
> dev_priv->sprite_scaling_enabled &= ~(1 << pipe);
>
> + intel_update_sprite_watermarks(dev, pipe, 0, 0, false);
> +
> /* potentially re-enable LP watermarks */
> if (scaling_was_enabled && !dev_priv->sprite_scaling_enabled)
> intel_update_watermarks(dev);
> @@ -453,7 +455,7 @@ ilk_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
> crtc_w--;
> crtc_h--;
>
> - intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size);
> + intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size, true);
>
> dvsscale = 0;
> if (IS_GEN5(dev) || crtc_w != src_w || crtc_h != src_h)
> --
> 1.8.1.2
>
--
Ville Syrjälä
Intel OTC
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/5] drm/i915: add haswell_update_sprite_wm
2013-05-24 14:59 ` [PATCH 2/5] drm/i915: add haswell_update_sprite_wm Paulo Zanoni
@ 2013-05-24 17:00 ` Ville Syrjälä
2013-05-24 19:35 ` Daniel Vetter
0 siblings, 1 reply; 29+ messages in thread
From: Ville Syrjälä @ 2013-05-24 17:00 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni
On Fri, May 24, 2013 at 11:59:18AM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> On Haswell, whenever we change the sprites we need to completely
> recalculate all the watermarks, because the sprites are one of the
> parameters to the LP watermarks, so a change on the sprites may
> trigger a change on which LP levels are enabled.
>
> So on this commit we store all the parameters we need to store for
> proper recalculation of the Haswell WMs and then call
> haswell_update_wm.
>
> Notice that for now our haswell_update_wm function is not really using
> these parameters we're storing, but on the next commits we'll use
> these parameters.
>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/intel_drv.h | 12 ++++++++++++
> drivers/gpu/drm/i915/intel_pm.c | 23 ++++++++++++++++++++++-
> 2 files changed, 34 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 21427aa..57de0c1 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -326,6 +326,18 @@ struct intel_plane {
> unsigned int crtc_w, crtc_h;
> uint32_t src_x, src_y;
> uint32_t src_w, src_h;
> +
> + /* Since we need to change the watermarks before/after
> + * enabling/disabling the planes, we need to store the parameters here
> + * as the other pieces of the struct may not reflect the values we want
> + * for the watermark calculations. Currently only Haswell uses this.
> + */
> + struct {
> + bool enable;
> + uint8_t bytes_per_pixel;
> + uint32_t horiz_pixels;
> + } wm;
> +
> void (*update_plane)(struct drm_plane *plane,
> struct drm_framebuffer *fb,
> struct drm_i915_gem_object *obj,
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 3ebb8e9..0b61a0e 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2118,6 +2118,26 @@ static void haswell_update_wm(struct drm_device *dev)
> sandybridge_update_wm(dev);
> }
>
> +static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
> + uint32_t sprite_width, int pixel_size,
> + bool enable)
> +{
> + struct drm_plane *plane;
> +
> + list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
> + struct intel_plane *intel_plane = to_intel_plane(plane);
> +
> + if (intel_plane->pipe == pipe) {
> + intel_plane->wm.enable = enable;
> + intel_plane->wm.horiz_pixels = sprite_width + 1;
> + intel_plane->wm.bytes_per_pixel = pixel_size;
> + break;
> + }
> + }
> +
> + haswell_update_wm(dev);
> +}
> +
> static bool
> sandybridge_compute_sprite_wm(struct drm_device *dev, int plane,
> uint32_t sprite_width, int pixel_size,
> @@ -4635,7 +4655,8 @@ void intel_init_pm(struct drm_device *dev)
> } else if (IS_HASWELL(dev)) {
> if (I915_READ64(MCH_SSKPD)) {
> dev_priv->display.update_wm = haswell_update_wm;
> - dev_priv->display.update_sprite_wm = sandybridge_update_sprite_wm;
> + dev_priv->display.update_sprite_wm =
> + haswell_update_sprite_wm;
> } else {
> DRM_DEBUG_KMS("Failed to read display plane latency. "
> "Disable CxSR\n");
> --
> 1.8.1.2
>
--
Ville Syrjälä
Intel OTC
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/5] drm/i915: add haswell_update_sprite_wm
2013-05-24 17:00 ` Ville Syrjälä
@ 2013-05-24 19:35 ` Daniel Vetter
0 siblings, 0 replies; 29+ messages in thread
From: Daniel Vetter @ 2013-05-24 19:35 UTC (permalink / raw)
To: Ville Syrjälä; +Cc: intel-gfx, Paulo Zanoni
On Fri, May 24, 2013 at 08:00:26PM +0300, Ville Syrjälä wrote:
> On Fri, May 24, 2013 at 11:59:18AM -0300, Paulo Zanoni wrote:
> > From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >
> > On Haswell, whenever we change the sprites we need to completely
> > recalculate all the watermarks, because the sprites are one of the
> > parameters to the LP watermarks, so a change on the sprites may
> > trigger a change on which LP levels are enabled.
> >
> > So on this commit we store all the parameters we need to store for
> > proper recalculation of the Haswell WMs and then call
> > haswell_update_wm.
> >
> > Notice that for now our haswell_update_wm function is not really using
> > these parameters we're storing, but on the next commits we'll use
> > these parameters.
> >
> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
First 2 patches merged to dinq, thanks. Although I do need to whine a bit
about how the state tracking in our sprite code seems to be bong-hits here
... ;-) But I guess Ville will tackle this with his plane config rework.
Cheers, Daniel
>
> > ---
> > drivers/gpu/drm/i915/intel_drv.h | 12 ++++++++++++
> > drivers/gpu/drm/i915/intel_pm.c | 23 ++++++++++++++++++++++-
> > 2 files changed, 34 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> > index 21427aa..57de0c1 100644
> > --- a/drivers/gpu/drm/i915/intel_drv.h
> > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > @@ -326,6 +326,18 @@ struct intel_plane {
> > unsigned int crtc_w, crtc_h;
> > uint32_t src_x, src_y;
> > uint32_t src_w, src_h;
> > +
> > + /* Since we need to change the watermarks before/after
> > + * enabling/disabling the planes, we need to store the parameters here
> > + * as the other pieces of the struct may not reflect the values we want
> > + * for the watermark calculations. Currently only Haswell uses this.
> > + */
> > + struct {
> > + bool enable;
> > + uint8_t bytes_per_pixel;
> > + uint32_t horiz_pixels;
> > + } wm;
> > +
> > void (*update_plane)(struct drm_plane *plane,
> > struct drm_framebuffer *fb,
> > struct drm_i915_gem_object *obj,
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 3ebb8e9..0b61a0e 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -2118,6 +2118,26 @@ static void haswell_update_wm(struct drm_device *dev)
> > sandybridge_update_wm(dev);
> > }
> >
> > +static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
> > + uint32_t sprite_width, int pixel_size,
> > + bool enable)
> > +{
> > + struct drm_plane *plane;
> > +
> > + list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
> > + struct intel_plane *intel_plane = to_intel_plane(plane);
> > +
> > + if (intel_plane->pipe == pipe) {
> > + intel_plane->wm.enable = enable;
> > + intel_plane->wm.horiz_pixels = sprite_width + 1;
> > + intel_plane->wm.bytes_per_pixel = pixel_size;
> > + break;
> > + }
> > + }
> > +
> > + haswell_update_wm(dev);
> > +}
> > +
> > static bool
> > sandybridge_compute_sprite_wm(struct drm_device *dev, int plane,
> > uint32_t sprite_width, int pixel_size,
> > @@ -4635,7 +4655,8 @@ void intel_init_pm(struct drm_device *dev)
> > } else if (IS_HASWELL(dev)) {
> > if (I915_READ64(MCH_SSKPD)) {
> > dev_priv->display.update_wm = haswell_update_wm;
> > - dev_priv->display.update_sprite_wm = sandybridge_update_sprite_wm;
> > + dev_priv->display.update_sprite_wm =
> > + haswell_update_sprite_wm;
> > } else {
> > DRM_DEBUG_KMS("Failed to read display plane latency. "
> > "Disable CxSR\n");
> > --
> > 1.8.1.2
> >
>
> --
> Ville Syrjälä
> Intel OTC
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 3/5] drm/i915: properly set HSW WM_PIPE registers
2013-05-24 16:07 ` Ville Syrjälä
@ 2013-05-24 22:00 ` Paulo Zanoni
2013-05-24 22:02 ` Paulo Zanoni
2013-05-27 11:07 ` Ville Syrjälä
0 siblings, 2 replies; 29+ messages in thread
From: Paulo Zanoni @ 2013-05-24 22:00 UTC (permalink / raw)
To: Ville Syrjälä; +Cc: intel-gfx, Paulo Zanoni
2013/5/24 Ville Syrjälä <ville.syrjala@linux.intel.com>:
> On Fri, May 24, 2013 at 11:59:19AM -0300, Paulo Zanoni wrote:
>> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>
>> We were previously calling sandybridge_update_wm on HSW, but the SNB
>> function didn't really match the HSW specification, so we were just
>> writing the wrong values.
>>
>> With this patch, the haswell_update_wm function will set the correct
>> values for the WM_PIPE registers, but it will still keep all the LP
>> watermarks disabled.
>>
>> The patch may look a little bit over-complicated for now, but it's
>> because much of the infrastructure for setting the LP watermarks is
>> already in place, so we won't have too much code churn on the patch
>> that sets the LP watermarks.
>>
>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_reg.h | 3 +
>> drivers/gpu/drm/i915/intel_pm.c | 340 +++++++++++++++++++++++++++++++++++++---
>> 2 files changed, 325 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index 55caedb..e86606c 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -4938,6 +4938,9 @@
>> #define SFUSE_STRAP_DDIC_DETECTED (1<<1)
>> #define SFUSE_STRAP_DDID_DETECTED (1<<0)
>>
>> +#define WM_MISC 0x45260
>> +#define WM_MISC_DATA_PARTITION_5_6 (1 << 0)
>> +
>> #define WM_DBG 0x45280
>> #define WM_DBG_DISALLOW_MULTIPLE_LP (1<<0)
>> #define WM_DBG_DISALLOW_MAXFIFO (1<<1)
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 0b61a0e..2ee1d01 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -2072,19 +2072,173 @@ static void ivybridge_update_wm(struct drm_device *dev)
>> cursor_wm);
>> }
>>
>> -static void
>> -haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
>> +static uint32_t hsw_wm_get_pixel_rate(struct drm_device *dev,
>> + struct drm_crtc *crtc)
>> +{
>> + struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>> + uint32_t pixel_rate, pfit_size;
>> +
>> + if (intel_crtc->config.pixel_target_clock)
>> + pixel_rate = intel_crtc->config.pixel_target_clock;
>> + else
>> + pixel_rate = intel_crtc->config.adjusted_mode.clock;
>> +
>> + /* We only use IF-ID interlacing. If we ever use PF-ID we'll need to
>> + * adjust the pixel_rate here. */
>> +
>> + pfit_size = intel_crtc->config.pch_pfit.size;
>> + if (pfit_size) {
>> + uint64_t x, y, crtc_x, crtc_y, hscale, vscale, totscale;
>> +
>> + x = (pfit_size >> 16) & 0xFFFF;
>> + y = pfit_size & 0xFFFF;
>> + crtc_x = intel_crtc->config.adjusted_mode.hdisplay;
>> + crtc_y = intel_crtc->config.adjusted_mode.vdisplay;
>> +
>> + hscale = crtc_x << 16;
>> + vscale = crtc_y << 16;
>> + do_div(hscale, x);
>> + do_div(vscale, y);
>> + hscale = (hscale < (1 << 16)) ? (1 << 16) : hscale;
>> + vscale = (vscale < (1 << 16)) ? (1 << 16) : vscale;
>> + totscale = (hscale * vscale) >> 16;
>> + pixel_rate = (pixel_rate * totscale) >> 16;
>
> No need for fixed point math if you go 64bits, and as stated before
> the scaling ratio is still being miscalculated due to the use of
> adjusted_mode.
>
> Something like this ought to do it:
>
> in_w = req_mode.hdisplay;
> in_h = req_mode.vdisplay;
> out_w = (pfit_size >> 16) & 0xffff;
> out_h = pfit_size & 0xffff;
> if (in_w <= out_w)
> in_w = out_w;
> if (in_h <= out_h)
> in_h = out_h;
>
> pixel_rate = div_u64((uint64_t) pixel_rate * in_w * in_h, out_w * out_h);
Ok, I re-checked and you were right. Fixed. Sorry for insisting :(
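For reference, the scaling suggested above can be tried outside the kernel. The following is a minimal sketch, not the driver code: scaled_pixel_rate and its parameters are hypothetical names, plain 64-bit division stands in for div_u64, and the requested mode size is passed in directly instead of being read from a mode structure.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical standalone helper: scale a pixel rate by the panel fitter
 * downscale ratio. in_w/in_h are the requested mode size; pfit_size packs
 * the fitter output size as (width << 16) | height.
 */
static uint32_t scaled_pixel_rate(uint32_t pixel_rate,
                                  uint32_t in_w, uint32_t in_h,
                                  uint32_t pfit_size)
{
    uint32_t out_w = (pfit_size >> 16) & 0xffff;
    uint32_t out_h = pfit_size & 0xffff;

    assert(out_w != 0 && out_h != 0);

    /* Clamp so the ratio never drops below 1, as in the suggestion. */
    if (in_w < out_w)
        in_w = out_w;
    if (in_h < out_h)
        in_h = out_h;

    /* 64-bit intermediates: rate * w * h does not fit in 32 bits. */
    return (uint32_t)(((uint64_t)pixel_rate * in_w * in_h) /
                      ((uint64_t)out_w * out_h));
}
```

With a 3840x2160 source fitted into 1920x1080 the rate is multiplied by four, while a 1280x720 source fitted into 1920x1080 leaves it unchanged.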
>
>> + }
>> +
>> + return pixel_rate;
>> +}
>> +
>> +static uint32_t hsw_wm_method1(uint32_t pixel_rate, uint8_t bytes_per_pixel,
>> + uint32_t latency)
>> +{
>> + uint64_t tmp;
>> + uint32_t ret;
>> +
>> + tmp = pixel_rate * bytes_per_pixel * latency;
>
> Would need a cast to make the multiplications actually 64bit. 'ret' is
> also pointless.
Oops... Fixed.
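To illustrate the overflow, here is a small user-space sketch. method1_wrapped and method1_wide are hypothetical names, div_round_up_u64 stands in for DIV_ROUND_UP_ULL, and the sketch assumes a 32-bit unsigned int. The input values in the note below are picked only to exceed 32 bits and are not meant to be realistic.

```c
#include <assert.h>
#include <stdint.h>

/* Round-up division, standing in for the kernel's DIV_ROUND_UP_ULL. */
static uint64_t div_round_up_u64(uint64_t n, uint64_t d)
{
    assert(d != 0);
    return (n + d - 1) / d;
}

/* Product evaluated in 32 bits first: the high bits are lost. */
static uint32_t method1_wrapped(uint32_t pixel_rate, uint8_t bytes_per_pixel,
                                uint32_t latency)
{
    uint32_t prod = pixel_rate * bytes_per_pixel * latency;

    return (uint32_t)div_round_up_u64(prod, 64 * 10000) + 2;
}

/* One operand is cast first, so the whole product is 64-bit. */
static uint32_t method1_wide(uint32_t pixel_rate, uint8_t bytes_per_pixel,
                             uint32_t latency)
{
    uint64_t prod = (uint64_t)pixel_rate * bytes_per_pixel * latency;

    return (uint32_t)div_round_up_u64(prod, 64 * 10000) + 2;
}
```

With pixel_rate=138780, bytes_per_pixel=4 and latency=10000 the product is 5551200000, which does not fit in 32 bits: method1_wide returns 8676, while method1_wrapped returns a smaller, wrong value.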
>
>> + ret = DIV_ROUND_UP_ULL(tmp, 64 * 10000) + 2;
>> +
>> + return ret;
>> +}
>> +
>> +static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
>> + uint32_t horiz_pixels, uint8_t bytes_per_pixel,
>> + uint32_t latency)
>> +{
>> + uint32_t ret;
>> +
>> + ret = DIV_ROUND_UP(pipe_htotal * 1000, pixel_rate);
>> + ret = ((latency / (ret * 10)) + 1) * horiz_pixels * bytes_per_pixel;
>
> w/ 64bit maths this could be:
>
> tmp = (uint64_t) latency * pixel_rate * 100;
> ret = (div_u64(tmp, pipe_htotal) + 1) * horiz_pixels * bytes_per_pixel
I did the math on paper and your formula doesn't look correct. For
latency=10 rate=120000 pipe_htotal=2000 horiz_pixels=1500 bpp=4 the
correct value should be 96, but your formula gives me a really huge
value. Besides, I like having the formula match BSpec exactly. And I
can't see how the current code would give us overflows, that's why I
kept it using uint32_t.
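The numbers above can be reproduced with a short user-space sketch. method2_patch mirrors the arithmetic of hsw_wm_method2 from the patch. method2_suggested transcribes the two suggested lines and assumes the final "divide by 64, round up, add 2" step stays the same, which the suggestion did not spell out. Both names, and div_round_up, are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t div_round_up(uint32_t n, uint32_t d)
{
    assert(d != 0);
    return (n + d - 1) / d;
}

/* Same steps as hsw_wm_method2 in the quoted patch. */
static uint32_t method2_patch(uint32_t pixel_rate, uint32_t pipe_htotal,
                              uint32_t horiz_pixels, uint8_t bytes_per_pixel,
                              uint32_t latency)
{
    uint32_t ret;

    assert(pixel_rate != 0 && pipe_htotal != 0);

    ret = div_round_up(pipe_htotal * 1000, pixel_rate);
    ret = ((latency / (ret * 10)) + 1) * horiz_pixels * bytes_per_pixel;
    return div_round_up(ret, 64) + 2;
}

/* The 64-bit variant from the review, transcribed as written. */
static uint64_t method2_suggested(uint32_t pixel_rate, uint32_t pipe_htotal,
                                  uint32_t horiz_pixels,
                                  uint8_t bytes_per_pixel, uint32_t latency)
{
    uint64_t tmp, ret;

    assert(pipe_htotal != 0);

    tmp = (uint64_t)latency * pixel_rate * 100;
    ret = (tmp / pipe_htotal + 1) * horiz_pixels * bytes_per_pixel;
    return (ret + 63) / 64 + 2;
}
```

For latency=10, rate=120000, pipe_htotal=2000, horiz_pixels=1500 and bpp=4, method2_patch computes a line time of 17, then (10 / 170 + 1) * 1500 * 4 = 6000, and finally DIV_ROUND_UP(6000, 64) + 2 = 96. method2_suggested yields a value in the millions for the same inputs.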
>
>> + ret = DIV_ROUND_UP(ret, 64) + 2;
>> + return ret;
>> +}
>> +
>> +struct hsw_pipe_wm_parameters {
>> + bool active;
>> + bool sprite_enabled;
>> + uint8_t pri_bytes_per_pixel;
>> + uint8_t spr_bytes_per_pixel;
>> + uint8_t cur_bytes_per_pixel;
>> + uint32_t pri_horiz_pixels;
>> + uint32_t spr_horiz_pixels;
>> + uint32_t cur_horiz_pixels;
>> + uint32_t pipe_htotal;
>> + uint32_t pixel_rate;
>> +};
>> +
>> +struct hsw_wm_values {
>> + uint32_t wm_pipe[3];
>> + uint32_t wm_lp[3];
>> + uint32_t wm_lp_spr[3];
>> + uint32_t wm_linetime[3];
>> +};
>> +
>> +enum hsw_data_buf_partitioning {
>> + HSW_DATA_BUF_PART_1_2,
>> + HSW_DATA_BUF_PART_5_6,
>> +};
>> +
>> +/* Only for WM_PIPE. */
>> +static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
>> + uint32_t mem_value)
>> +{
>> + /* TODO: for now, assume the primary plane is always enabled. */
>> + if (!params->active)
>> + return 0;
>> +
>> + return hsw_wm_method1(params->pixel_rate,
>> + params->pri_bytes_per_pixel,
>> + mem_value);
>> +}
>> +
>> +/* For both WM_PIPE and WM_LP. */
>> +static uint32_t hsw_compute_spr_wm(struct hsw_pipe_wm_parameters *params,
>> + uint32_t mem_value)
>> +{
>> + uint32_t method1, method2;
>> +
>> + if (!params->active || !params->sprite_enabled)
>> + return 0;
>> +
>> + method1 = hsw_wm_method1(params->pixel_rate,
>> + params->spr_bytes_per_pixel,
>> + mem_value);
>> + method2 = hsw_wm_method2(params->pixel_rate,
>> + params->pipe_htotal,
>> + params->spr_horiz_pixels,
>> + params->spr_bytes_per_pixel,
>> + mem_value);
>> + return min(method1, method2);
>> +}
>> +
>> +/* For both WM_PIPE and WM_LP. */
>> +static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
>> + uint32_t mem_value)
>> +{
>> + if (!params->active)
>> + return 0;
>> +
>> + return hsw_wm_method2(params->pixel_rate,
>> + params->pipe_htotal,
>> + params->cur_horiz_pixels,
>> + params->cur_bytes_per_pixel,
>> + mem_value);
>> +}
>> +
>> +static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
>> + uint32_t mem_value, enum pipe pipe,
>> + struct hsw_pipe_wm_parameters *params)
>> +{
>> + uint32_t pri_val, cur_val, spr_val;
>> +
>> + pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
>> + spr_val = hsw_compute_spr_wm(params, mem_value);
>> + cur_val = hsw_compute_cur_wm(params, mem_value);
>> +
>> + WARN(pri_val > 127,
>> + "Primary WM error, mode not supported for pipe %c\n",
>> + pipe_name(pipe));
>> + WARN(spr_val > 127,
>> + "Sprite WM error, mode not supported for pipe %c\n",
>> + pipe_name(pipe));
>> + WARN(cur_val > 63,
>> + "Cursor WM error, mode not supported for pipe %c\n",
>> + pipe_name(pipe));
>> +
>> + return (pri_val << WM0_PIPE_PLANE_SHIFT) |
>> + (spr_val << WM0_PIPE_SPRITE_SHIFT) |
>> + cur_val;
>> +}
>> +
>> +static uint32_t
>> +hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
>> {
>> struct drm_i915_private *dev_priv = dev->dev_private;
>> struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>> - enum pipe pipe = intel_crtc->pipe;
>> struct drm_display_mode *mode = &intel_crtc->config.adjusted_mode;
>> u32 linetime, ips_linetime;
>>
>> - if (!intel_crtc_active(crtc)) {
>> - I915_WRITE(PIPE_WM_LINETIME(pipe), 0);
>> - return;
>> - }
>> + if (!intel_crtc_active(crtc))
>> + return 0;
>>
>> /* The WM are computed with base on how long it takes to fill a single
>> * row at the given clock rate, multiplied by 8.
>> @@ -2093,29 +2247,179 @@ haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
>> ips_linetime = DIV_ROUND_CLOSEST(mode->htotal * 1000 * 8,
>> intel_ddi_get_cdclk_freq(dev_priv));
>>
>> - I915_WRITE(PIPE_WM_LINETIME(pipe),
>> - PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
>> - PIPE_WM_LINETIME_TIME(linetime));
>> + return PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
>> + PIPE_WM_LINETIME_TIME(linetime);
>> }
>>
>> -static void haswell_update_wm(struct drm_device *dev)
>> +static void hsw_compute_wm_parameters(struct drm_device *dev,
>> + struct hsw_pipe_wm_parameters *params,
>> + uint32_t *wm)
>> {
>> struct drm_i915_private *dev_priv = dev->dev_private;
>> struct drm_crtc *crtc;
>> + struct drm_plane *plane;
>> + uint64_t sskpd = I915_READ64(MCH_SSKPD);
>> enum pipe pipe;
>>
>> - /* Disable the LP WMs before changine the linetime registers. This is
>> - * just a temporary code that will be replaced soon. */
>> - I915_WRITE(WM3_LP_ILK, 0);
>> - I915_WRITE(WM2_LP_ILK, 0);
>> - I915_WRITE(WM1_LP_ILK, 0);
>> + if ((sskpd >> 56) & 0xFF)
>> + wm[0] = (sskpd >> 56) & 0xFF;
>> + else
>> + wm[0] = sskpd & 0xF;
>> + wm[1] = ((sskpd >> 4) & 0xFF) * 5;
>> + wm[2] = ((sskpd >> 12) & 0xFF) * 5;
>> + wm[3] = ((sskpd >> 20) & 0x1FF) * 5;
>> + wm[4] = ((sskpd >> 32) & 0x1FF) * 5;
>> +
>> + list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
>> + struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>> + struct hsw_pipe_wm_parameters *p;
>> +
>> + pipe = intel_crtc->pipe;
>> + p = &params[pipe];
>> +
>> + p->active = intel_crtc_active(crtc);
>> + if (!p->active)
>> + continue;
>> +
>> + p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
>> + p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
>> + p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
>> + p->cur_bytes_per_pixel = 4;
>> + p->pri_horiz_pixels = intel_crtc->config.adjusted_mode.hdisplay;
>> + p->cur_horiz_pixels = 64;
>> + }
>> +
>> + list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
>> + struct intel_plane *intel_plane = to_intel_plane(plane);
>> + struct hsw_pipe_wm_parameters *p;
>> +
>> + pipe = intel_plane->pipe;
>> + p = &params[pipe];
>> +
>> + p->sprite_enabled = intel_plane->wm.enable;
>> + p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
>> + p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
>> + }
>> +}
>> +
>> +static void hsw_compute_wm_results(struct drm_device *dev,
>> + struct hsw_pipe_wm_parameters *params,
>> + uint32_t *wm,
>> + struct hsw_wm_values *results)
>> +{
>> + struct drm_i915_private *dev_priv = dev->dev_private;
>> + struct drm_crtc *crtc;
>> + enum pipe pipe;
>> +
>> + /* No support for LP WMs yet. */
>> + results->wm_lp[2] = 0;
>> + results->wm_lp[1] = 0;
>> + results->wm_lp[0] = 0;
>> + results->wm_lp_spr[2] = 0;
>> + results->wm_lp_spr[1] = 0;
>> + results->wm_lp_spr[0] = 0;
>> +
>> + for_each_pipe(pipe)
>> + results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
>> + pipe,
>> + &params[pipe]);
>>
>> for_each_pipe(pipe) {
>> crtc = dev_priv->pipe_to_crtc_mapping[pipe];
>> - haswell_update_linetime_wm(dev, crtc);
>> + results->wm_linetime[pipe] = hsw_compute_linetime_wm(dev, crtc);
>> }
>> +}
>> +
>> +/*
>> + * The spec says we shouldn't write when we don't need, because every write
>> + * causes WMs to be re-evaluated, expending some power.
>> + */
>> +static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
>> + struct hsw_wm_values *results,
>> + enum hsw_data_buf_partitioning partitioning)
>> +{
>> + struct hsw_wm_values previous;
>> + uint32_t val;
>> + enum hsw_data_buf_partitioning prev_partitioning;
>> +
>> + previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
>> + previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
>> + previous.wm_pipe[2] = I915_READ(WM0_PIPEC_IVB);
>> + previous.wm_lp[0] = I915_READ(WM1_LP_ILK);
>> + previous.wm_lp[1] = I915_READ(WM2_LP_ILK);
>> + previous.wm_lp[2] = I915_READ(WM3_LP_ILK);
>> + previous.wm_lp_spr[0] = I915_READ(WM1S_LP_ILK);
>> + previous.wm_lp_spr[1] = I915_READ(WM2S_LP_IVB);
>> + previous.wm_lp_spr[2] = I915_READ(WM3S_LP_IVB);
>> + previous.wm_linetime[0] = I915_READ(PIPE_WM_LINETIME(PIPE_A));
>> + previous.wm_linetime[1] = I915_READ(PIPE_WM_LINETIME(PIPE_B));
>> + previous.wm_linetime[2] = I915_READ(PIPE_WM_LINETIME(PIPE_C));
>> +
>> + prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
>> + HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
>> +
>> + if (memcmp(results->wm_pipe, previous.wm_pipe, 3) == 0 &&
>> + memcmp(results->wm_lp, previous.wm_lp, 3) == 0 &&
>> + memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) == 0 &&
>> + memcmp(results->wm_linetime, previous.wm_linetime, 3) == 0 &&
>> + partitioning == prev_partitioning)
>> + return;
>> +
>> + if (previous.wm_lp[2] != 0)
>> + I915_WRITE(WM3_LP_ILK, 0);
>> + if (previous.wm_lp[1] != 0)
>> + I915_WRITE(WM2_LP_ILK, 0);
>> + if (previous.wm_lp[0] != 0)
>> + I915_WRITE(WM1_LP_ILK, 0);
>
> I don't know if this conditional writing makes sense in such a fine
> granularity. We're anyway going to write some of the registers, so
> maybe it's better to just go ahead and write all of them. It would
> at least make the code look a bit better.
The documentation says "Do not write the watermark registers when
there is no need to change a value, as every write will cause the
watermarks to be re-evaluated, expending some power.". I do recognize
the function looks a little bit ugly, but I think it's worth the cost,
especially since I imagine we're not going to change it too much in
the future.
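The write-only-when-changed idea can be shown in isolation. This is a minimal sketch under stated assumptions: struct fake_regs, reg_write and write_changed are hypothetical names, and a plain array that counts writes stands in for the MMIO registers and I915_WRITE.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 4

/* Hypothetical register file standing in for MMIO; it counts writes. */
struct fake_regs {
    uint32_t val[NUM_REGS];
    unsigned int writes;
};

static void reg_write(struct fake_regs *r, unsigned int idx, uint32_t v)
{
    assert(idx < NUM_REGS);
    r->val[idx] = v;
    r->writes++;
}

/* Write only registers whose value differs; return how many were written. */
static unsigned int write_changed(struct fake_regs *r, const uint32_t *wanted)
{
    unsigned int before = r->writes;
    unsigned int i;

    for (i = 0; i < NUM_REGS; i++)
        if (r->val[i] != wanted[i])
            reg_write(r, i, wanted[i]);

    return r->writes - before;
}
```

Calling write_changed twice with the same wanted values performs no writes on the second call, which is the property the documentation asks for.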
>
> In any case you'd at least need to make sure that you disable/re-enable
> the LP1+ watermarks if linetime WMs or DDB partitioning changes,
> regardless of whether the LP1+ watermarks themselves changed.
We already do this.
Thanks again for the review,
Paulo
>
>> +
>> + if (previous.wm_pipe[0] != results->wm_pipe[0])
>> + I915_WRITE(WM0_PIPEA_ILK, results->wm_pipe[0]);
>> + if (previous.wm_pipe[1] != results->wm_pipe[1])
>> + I915_WRITE(WM0_PIPEB_ILK, results->wm_pipe[1]);
>> + if (previous.wm_pipe[2] != results->wm_pipe[2])
>> + I915_WRITE(WM0_PIPEC_IVB, results->wm_pipe[2]);
>> +
>> + if (previous.wm_linetime[0] != results->wm_linetime[0])
>> + I915_WRITE(PIPE_WM_LINETIME(PIPE_A), results->wm_linetime[0]);
>> + if (previous.wm_linetime[1] != results->wm_linetime[1])
>> + I915_WRITE(PIPE_WM_LINETIME(PIPE_B), results->wm_linetime[1]);
>> + if (previous.wm_linetime[2] != results->wm_linetime[2])
>> + I915_WRITE(PIPE_WM_LINETIME(PIPE_C), results->wm_linetime[2]);
>> +
>> + if (prev_partitioning != partitioning) {
>> + val = I915_READ(WM_MISC);
>> + if (partitioning == HSW_DATA_BUF_PART_1_2)
>> + val &= ~WM_MISC_DATA_PARTITION_5_6;
>> + else
>> + val |= WM_MISC_DATA_PARTITION_5_6;
>> + I915_WRITE(WM_MISC, val);
>> + }
>> +
>> + if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
>> + I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
>> + if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
>> + I915_WRITE(WM2S_LP_IVB, results->wm_lp_spr[1]);
>> + if (previous.wm_lp_spr[2] != results->wm_lp_spr[2])
>> + I915_WRITE(WM3S_LP_IVB, results->wm_lp_spr[2]);
>> +
>> + if (results->wm_lp[0] != 0)
>> + I915_WRITE(WM1_LP_ILK, results->wm_lp[0]);
>> + if (results->wm_lp[1] != 0)
>> + I915_WRITE(WM2_LP_ILK, results->wm_lp[1]);
>> + if (results->wm_lp[2] != 0)
>> + I915_WRITE(WM3_LP_ILK, results->wm_lp[2]);
>> +}
>> +
>> +static void haswell_update_wm(struct drm_device *dev)
>> +{
>> + struct drm_i915_private *dev_priv = dev->dev_private;
>> + struct hsw_pipe_wm_parameters params[3];
>> + struct hsw_wm_values results;
>> + uint32_t wm[5];
>>
>> - sandybridge_update_wm(dev);
>> + hsw_compute_wm_parameters(dev, params, wm);
>> + hsw_compute_wm_results(dev, params, wm, &results);
>> + hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
>> }
>>
>> static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
>> --
>> 1.8.1.2
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Ville Syrjälä
> Intel OTC
--
Paulo Zanoni
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH 3/5] drm/i915: properly set HSW WM_PIPE registers
2013-05-24 22:00 ` Paulo Zanoni
@ 2013-05-24 22:02 ` Paulo Zanoni
2013-05-27 11:07 ` Ville Syrjälä
1 sibling, 0 replies; 29+ messages in thread
From: Paulo Zanoni @ 2013-05-24 22:02 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Paulo Zanoni <paulo.r.zanoni@intel.com>
We were previously calling sandybridge_update_wm on HSW, but the SNB
function didn't really match the HSW specification, so we were just
writing the wrong values.
With this patch, the haswell_update_wm function will set the correct
values for the WM_PIPE registers, but it will still keep all the LP
watermarks disabled.
The patch may look a little bit over-complicated for now, but it's
because much of the infrastructure for setting the LP watermarks is
already in place, so we won't have too much code churn on the patch
that sets the LP watermarks.
v2: - Fix pixel_rate on panel fitter case (Ville)
- Try to not overflow (Ville)
- Remove useless variable (Ville)
- Fix p->pri_horiz_pixels (Paulo)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 3 +
drivers/gpu/drm/i915/intel_pm.c | 338 +++++++++++++++++++++++++++++++++++++---
2 files changed, 323 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 55caedb..e86606c 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -4938,6 +4938,9 @@
#define SFUSE_STRAP_DDIC_DETECTED (1<<1)
#define SFUSE_STRAP_DDID_DETECTED (1<<0)
+#define WM_MISC 0x45260
+#define WM_MISC_DATA_PARTITION_5_6 (1 << 0)
+
#define WM_DBG 0x45280
#define WM_DBG_DISALLOW_MULTIPLE_LP (1<<0)
#define WM_DBG_DISALLOW_MAXFIFO (1<<1)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0b61a0e..ef58a1a 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2072,19 +2072,170 @@ static void ivybridge_update_wm(struct drm_device *dev)
cursor_wm);
}
-static void
-haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
+static uint32_t hsw_wm_get_pixel_rate(struct drm_device *dev,
+ struct drm_crtc *crtc)
+{
+ struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+ uint32_t pixel_rate, pfit_size;
+
+ if (intel_crtc->config.pixel_target_clock)
+ pixel_rate = intel_crtc->config.pixel_target_clock;
+ else
+ pixel_rate = intel_crtc->config.adjusted_mode.clock;
+
+ /* We only use IF-ID interlacing. If we ever use PF-ID we'll need to
+ * adjust the pixel_rate here. */
+
+ pfit_size = intel_crtc->config.pch_pfit.size;
+ if (pfit_size) {
+ uint64_t pipe_w, pipe_h, pfit_w, pfit_h;
+
+ pipe_w = intel_crtc->config.requested_mode.hdisplay;
+ pipe_h = intel_crtc->config.requested_mode.vdisplay;
+ pfit_w = (pfit_size >> 16) & 0xFFFF;
+ pfit_h = pfit_size & 0xFFFF;
+ if (pipe_w < pfit_w)
+ pipe_w = pfit_w;
+ if (pipe_h < pfit_h)
+ pipe_h = pfit_h;
+
+ pixel_rate = div_u64((uint64_t) pixel_rate * pipe_w * pipe_h,
+ pfit_w * pfit_h);
+ }
+
+ return pixel_rate;
+}
+
+static uint32_t hsw_wm_method1(uint32_t pixel_rate, uint8_t bytes_per_pixel,
+ uint32_t latency)
+{
+ uint64_t ret;
+
+ ret = (uint64_t) pixel_rate * bytes_per_pixel * latency;
+ ret = DIV_ROUND_UP_ULL(ret, 64 * 10000) + 2;
+
+ return ret;
+}
+
+static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
+ uint32_t horiz_pixels, uint8_t bytes_per_pixel,
+ uint32_t latency)
+{
+ uint32_t ret;
+
+ ret = DIV_ROUND_UP(pipe_htotal * 1000, pixel_rate);
+ ret = ((latency / (ret * 10)) + 1) * horiz_pixels * bytes_per_pixel;
+ ret = DIV_ROUND_UP(ret, 64) + 2;
+ return ret;
+}
+
+struct hsw_pipe_wm_parameters {
+ bool active;
+ bool sprite_enabled;
+ uint8_t pri_bytes_per_pixel;
+ uint8_t spr_bytes_per_pixel;
+ uint8_t cur_bytes_per_pixel;
+ uint32_t pri_horiz_pixels;
+ uint32_t spr_horiz_pixels;
+ uint32_t cur_horiz_pixels;
+ uint32_t pipe_htotal;
+ uint32_t pixel_rate;
+};
+
+struct hsw_wm_values {
+ uint32_t wm_pipe[3];
+ uint32_t wm_lp[3];
+ uint32_t wm_lp_spr[3];
+ uint32_t wm_linetime[3];
+};
+
+enum hsw_data_buf_partitioning {
+ HSW_DATA_BUF_PART_1_2,
+ HSW_DATA_BUF_PART_5_6,
+};
+
+/* Only for WM_PIPE. */
+static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value)
+{
+ /* TODO: for now, assume the primary plane is always enabled. */
+ if (!params->active)
+ return 0;
+
+ return hsw_wm_method1(params->pixel_rate,
+ params->pri_bytes_per_pixel,
+ mem_value);
+}
+
+/* For both WM_PIPE and WM_LP. */
+static uint32_t hsw_compute_spr_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value)
+{
+ uint32_t method1, method2;
+
+ if (!params->active || !params->sprite_enabled)
+ return 0;
+
+ method1 = hsw_wm_method1(params->pixel_rate,
+ params->spr_bytes_per_pixel,
+ mem_value);
+ method2 = hsw_wm_method2(params->pixel_rate,
+ params->pipe_htotal,
+ params->spr_horiz_pixels,
+ params->spr_bytes_per_pixel,
+ mem_value);
+ return min(method1, method2);
+}
+
+/* For both WM_PIPE and WM_LP. */
+static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value)
+{
+ if (!params->active)
+ return 0;
+
+ return hsw_wm_method2(params->pixel_rate,
+ params->pipe_htotal,
+ params->cur_horiz_pixels,
+ params->cur_bytes_per_pixel,
+ mem_value);
+}
+
+static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
+ uint32_t mem_value, enum pipe pipe,
+ struct hsw_pipe_wm_parameters *params)
+{
+ uint32_t pri_val, cur_val, spr_val;
+
+ pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
+ spr_val = hsw_compute_spr_wm(params, mem_value);
+ cur_val = hsw_compute_cur_wm(params, mem_value);
+
+ WARN(pri_val > 127,
+ "Primary WM error, mode not supported for pipe %c\n",
+ pipe_name(pipe));
+ WARN(spr_val > 127,
+ "Sprite WM error, mode not supported for pipe %c\n",
+ pipe_name(pipe));
+ WARN(cur_val > 63,
+ "Cursor WM error, mode not supported for pipe %c\n",
+ pipe_name(pipe));
+
+ return (pri_val << WM0_PIPE_PLANE_SHIFT) |
+ (spr_val << WM0_PIPE_SPRITE_SHIFT) |
+ cur_val;
+}
+
+static uint32_t
+hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
- enum pipe pipe = intel_crtc->pipe;
struct drm_display_mode *mode = &intel_crtc->config.adjusted_mode;
u32 linetime, ips_linetime;
- if (!intel_crtc_active(crtc)) {
- I915_WRITE(PIPE_WM_LINETIME(pipe), 0);
- return;
- }
+ if (!intel_crtc_active(crtc))
+ return 0;
/* The WM are computed with base on how long it takes to fill a single
* row at the given clock rate, multiplied by 8.
@@ -2093,29 +2244,180 @@ haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
ips_linetime = DIV_ROUND_CLOSEST(mode->htotal * 1000 * 8,
intel_ddi_get_cdclk_freq(dev_priv));
- I915_WRITE(PIPE_WM_LINETIME(pipe),
- PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
- PIPE_WM_LINETIME_TIME(linetime));
+ return PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
+ PIPE_WM_LINETIME_TIME(linetime);
}
-static void haswell_update_wm(struct drm_device *dev)
+static void hsw_compute_wm_parameters(struct drm_device *dev,
+ struct hsw_pipe_wm_parameters *params,
+ uint32_t *wm)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
+ struct drm_plane *plane;
+ uint64_t sskpd = I915_READ64(MCH_SSKPD);
enum pipe pipe;
- /* Disable the LP WMs before changine the linetime registers. This is
- * just a temporary code that will be replaced soon. */
- I915_WRITE(WM3_LP_ILK, 0);
- I915_WRITE(WM2_LP_ILK, 0);
- I915_WRITE(WM1_LP_ILK, 0);
+ if ((sskpd >> 56) & 0xFF)
+ wm[0] = (sskpd >> 56) & 0xFF;
+ else
+ wm[0] = sskpd & 0xF;
+ wm[1] = ((sskpd >> 4) & 0xFF) * 5;
+ wm[2] = ((sskpd >> 12) & 0xFF) * 5;
+ wm[3] = ((sskpd >> 20) & 0x1FF) * 5;
+ wm[4] = ((sskpd >> 32) & 0x1FF) * 5;
+
+ list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+ struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+ struct hsw_pipe_wm_parameters *p;
+
+ pipe = intel_crtc->pipe;
+ p = &params[pipe];
+
+ p->active = intel_crtc_active(crtc);
+ if (!p->active)
+ continue;
+
+ p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
+ p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
+ p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
+ p->cur_bytes_per_pixel = 4;
+ p->pri_horiz_pixels =
+ intel_crtc->config.requested_mode.hdisplay;
+ p->cur_horiz_pixels = 64;
+ }
+
+ list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
+ struct intel_plane *intel_plane = to_intel_plane(plane);
+ struct hsw_pipe_wm_parameters *p;
+
+ pipe = intel_plane->pipe;
+ p = &params[pipe];
+
+ p->sprite_enabled = intel_plane->wm.enable;
+ p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
+ p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
+ }
+}
+
+static void hsw_compute_wm_results(struct drm_device *dev,
+ struct hsw_pipe_wm_parameters *params,
+ uint32_t *wm,
+ struct hsw_wm_values *results)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_crtc *crtc;
+ enum pipe pipe;
+
+ /* No support for LP WMs yet. */
+ results->wm_lp[2] = 0;
+ results->wm_lp[1] = 0;
+ results->wm_lp[0] = 0;
+ results->wm_lp_spr[2] = 0;
+ results->wm_lp_spr[1] = 0;
+ results->wm_lp_spr[0] = 0;
+
+ for_each_pipe(pipe)
+ results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
+ pipe,
+ &params[pipe]);
for_each_pipe(pipe) {
crtc = dev_priv->pipe_to_crtc_mapping[pipe];
- haswell_update_linetime_wm(dev, crtc);
+ results->wm_linetime[pipe] = hsw_compute_linetime_wm(dev, crtc);
}
+}
+
+/*
+ * The spec says we shouldn't write when we don't need, because every write
+ * causes WMs to be re-evaluated, expending some power.
+ */
+static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
+ struct hsw_wm_values *results,
+ enum hsw_data_buf_partitioning partitioning)
+{
+ struct hsw_wm_values previous;
+ uint32_t val;
+ enum hsw_data_buf_partitioning prev_partitioning;
+
+ previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
+ previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
+ previous.wm_pipe[2] = I915_READ(WM0_PIPEC_IVB);
+ previous.wm_lp[0] = I915_READ(WM1_LP_ILK);
+ previous.wm_lp[1] = I915_READ(WM2_LP_ILK);
+ previous.wm_lp[2] = I915_READ(WM3_LP_ILK);
+ previous.wm_lp_spr[0] = I915_READ(WM1S_LP_ILK);
+ previous.wm_lp_spr[1] = I915_READ(WM2S_LP_IVB);
+ previous.wm_lp_spr[2] = I915_READ(WM3S_LP_IVB);
+ previous.wm_linetime[0] = I915_READ(PIPE_WM_LINETIME(PIPE_A));
+ previous.wm_linetime[1] = I915_READ(PIPE_WM_LINETIME(PIPE_B));
+ previous.wm_linetime[2] = I915_READ(PIPE_WM_LINETIME(PIPE_C));
+
+ prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
+ HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
+
+ if (memcmp(results->wm_pipe, previous.wm_pipe, 3) == 0 &&
+ memcmp(results->wm_lp, previous.wm_lp, 3) == 0 &&
+ memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) == 0 &&
+ memcmp(results->wm_linetime, previous.wm_linetime, 3) == 0 &&
+ partitioning == prev_partitioning)
+ return;
+
+ if (previous.wm_lp[2] != 0)
+ I915_WRITE(WM3_LP_ILK, 0);
+ if (previous.wm_lp[1] != 0)
+ I915_WRITE(WM2_LP_ILK, 0);
+ if (previous.wm_lp[0] != 0)
+ I915_WRITE(WM1_LP_ILK, 0);
+
+ if (previous.wm_pipe[0] != results->wm_pipe[0])
+ I915_WRITE(WM0_PIPEA_ILK, results->wm_pipe[0]);
+ if (previous.wm_pipe[1] != results->wm_pipe[1])
+ I915_WRITE(WM0_PIPEB_ILK, results->wm_pipe[1]);
+ if (previous.wm_pipe[2] != results->wm_pipe[2])
+ I915_WRITE(WM0_PIPEC_IVB, results->wm_pipe[2]);
+
+ if (previous.wm_linetime[0] != results->wm_linetime[0])
+ I915_WRITE(PIPE_WM_LINETIME(PIPE_A), results->wm_linetime[0]);
+ if (previous.wm_linetime[1] != results->wm_linetime[1])
+ I915_WRITE(PIPE_WM_LINETIME(PIPE_B), results->wm_linetime[1]);
+ if (previous.wm_linetime[2] != results->wm_linetime[2])
+ I915_WRITE(PIPE_WM_LINETIME(PIPE_C), results->wm_linetime[2]);
+
+ if (prev_partitioning != partitioning) {
+ val = I915_READ(WM_MISC);
+ if (partitioning == HSW_DATA_BUF_PART_1_2)
+ val &= ~WM_MISC_DATA_PARTITION_5_6;
+ else
+ val |= WM_MISC_DATA_PARTITION_5_6;
+ I915_WRITE(WM_MISC, val);
+ }
+
+ if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
+ I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
+ if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
+ I915_WRITE(WM2S_LP_IVB, results->wm_lp_spr[1]);
+ if (previous.wm_lp_spr[2] != results->wm_lp_spr[2])
+ I915_WRITE(WM3S_LP_IVB, results->wm_lp_spr[2]);
+
+ if (results->wm_lp[0] != 0)
+ I915_WRITE(WM1_LP_ILK, results->wm_lp[0]);
+ if (results->wm_lp[1] != 0)
+ I915_WRITE(WM2_LP_ILK, results->wm_lp[1]);
+ if (results->wm_lp[2] != 0)
+ I915_WRITE(WM3_LP_ILK, results->wm_lp[2]);
+}
+
+static void haswell_update_wm(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct hsw_pipe_wm_parameters params[3];
+ struct hsw_wm_values results;
+ uint32_t wm[5];
- sandybridge_update_wm(dev);
+ hsw_compute_wm_parameters(dev, params, wm);
+ hsw_compute_wm_results(dev, params, wm, &results);
+ hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
}
static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
--
1.8.1.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 4/5] drm/i915: properly set HSW WM_LP watermarks
2013-05-24 16:11 ` Ville Syrjälä
@ 2013-05-24 22:05 ` Paulo Zanoni
2013-05-29 16:06 ` Ville Syrjälä
0 siblings, 1 reply; 29+ messages in thread
From: Paulo Zanoni @ 2013-05-24 22:05 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Paulo Zanoni <paulo.r.zanoni@intel.com>
We were previously only setting the WM_PIPE registers, now we are
setting the LP watermark registers. This should allow deeper PC
states, resulting in power savings.
We're only using 1/2 data buffer partitioning for now.
v2: Merge both hsw_compute_pri_wm_* functions (Ville)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 4 +
drivers/gpu/drm/i915/intel_pm.c | 201 ++++++++++++++++++++++++++++++++++++----
2 files changed, 187 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index e86606c..58230ea 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3057,6 +3057,10 @@
#define WM3S_LP_IVB 0x45128
#define WM1S_LP_EN (1<<31)
+#define HSW_WM_LP_VAL(lat, fbc, pri, cur) \
+ (WM3_LP_EN | ((lat) << WM1_LP_LATENCY_SHIFT) | \
+ ((fbc) << WM1_LP_FBC_SHIFT) | ((pri) << WM1_LP_SR_SHIFT) | (cur))
+
/* Memory latency timer register */
#define MLTR_ILK 0x11222
#define MLTR_WM1_SHIFT 0
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index ef58a1a..872e2a8 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2129,6 +2129,12 @@ static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
return ret;
}
+static uint32_t hsw_wm_fbc(uint32_t pri_val, uint32_t horiz_pixels,
+ uint8_t bytes_per_pixel)
+{
+ return DIV_ROUND_UP(pri_val * 64, horiz_pixels * bytes_per_pixel) + 2;
+}
+
struct hsw_pipe_wm_parameters {
bool active;
bool sprite_enabled;
@@ -2142,11 +2148,28 @@ struct hsw_pipe_wm_parameters {
uint32_t pixel_rate;
};
+struct hsw_wm_maximums {
+ uint16_t pri;
+ uint16_t spr;
+ uint16_t cur;
+ uint16_t fbc;
+};
+
+struct hsw_lp_wm_result {
+ bool enable;
+ bool fbc_enable;
+ uint32_t pri_val;
+ uint32_t spr_val;
+ uint32_t cur_val;
+ uint32_t fbc_val;
+};
+
struct hsw_wm_values {
uint32_t wm_pipe[3];
uint32_t wm_lp[3];
uint32_t wm_lp_spr[3];
uint32_t wm_linetime[3];
+ bool enable_fbc_wm;
};
enum hsw_data_buf_partitioning {
@@ -2154,17 +2177,31 @@ enum hsw_data_buf_partitioning {
HSW_DATA_BUF_PART_5_6,
};
-/* Only for WM_PIPE. */
-static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
- uint32_t mem_value)
+/* For both WM_PIPE and WM_LP. */
+static uint32_t hsw_compute_pri_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value,
+ bool is_lp)
{
+ uint32_t method1, method2;
+
/* TODO: for now, assume the primary plane is always enabled. */
if (!params->active)
return 0;
- return hsw_wm_method1(params->pixel_rate,
- params->pri_bytes_per_pixel,
- mem_value);
+ method1 = hsw_wm_method1(params->pixel_rate,
+ params->pri_bytes_per_pixel,
+ mem_value);
+
+ if (!is_lp)
+ return method1;
+
+ method2 = hsw_wm_method2(params->pixel_rate,
+ params->pipe_htotal,
+ params->pri_horiz_pixels,
+ params->pri_bytes_per_pixel,
+ mem_value);
+
+ return min(method1, method2);
}
/* For both WM_PIPE and WM_LP. */
@@ -2201,13 +2238,59 @@ static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
mem_value);
}
+/* Only for WM_LP. */
+static uint32_t hsw_compute_fbc_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t pri_val,
+ uint32_t mem_value)
+{
+ if (!params->active)
+ return 0;
+
+ return hsw_wm_fbc(pri_val,
+ params->pri_horiz_pixels,
+ params->pri_bytes_per_pixel);
+}
+
+static void hsw_compute_lp_wm(uint32_t mem_value, struct hsw_wm_maximums *max,
+ struct hsw_pipe_wm_parameters *params,
+ struct hsw_lp_wm_result *result)
+{
+ enum pipe pipe;
+ uint32_t pri_val[3], spr_val[3], cur_val[3], fbc_val[3];
+
+ for (pipe = PIPE_A; pipe <= PIPE_C; pipe++) {
+ struct hsw_pipe_wm_parameters *p = &params[pipe];
+
+ pri_val[pipe] = hsw_compute_pri_wm(p, mem_value, true);
+ spr_val[pipe] = hsw_compute_spr_wm(p, mem_value);
+ cur_val[pipe] = hsw_compute_cur_wm(p, mem_value);
+ fbc_val[pipe] = hsw_compute_fbc_wm(p, pri_val[pipe], mem_value);
+ }
+
+ result->pri_val = max3(pri_val[0], pri_val[1], pri_val[2]);
+ result->spr_val = max3(spr_val[0], spr_val[1], spr_val[2]);
+ result->cur_val = max3(cur_val[0], cur_val[1], cur_val[2]);
+ result->fbc_val = max3(fbc_val[0], fbc_val[1], fbc_val[2]);
+
+ if (result->fbc_val > max->fbc) {
+ result->fbc_enable = false;
+ result->fbc_val = 0;
+ } else {
+ result->fbc_enable = true;
+ }
+
+ result->enable = result->pri_val <= max->pri &&
+ result->spr_val <= max->spr &&
+ result->cur_val <= max->cur;
+}
+
static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
uint32_t mem_value, enum pipe pipe,
struct hsw_pipe_wm_parameters *params)
{
uint32_t pri_val, cur_val, spr_val;
- pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
+ pri_val = hsw_compute_pri_wm(params, mem_value, false);
spr_val = hsw_compute_spr_wm(params, mem_value);
cur_val = hsw_compute_cur_wm(params, mem_value);
@@ -2250,13 +2333,15 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
static void hsw_compute_wm_parameters(struct drm_device *dev,
struct hsw_pipe_wm_parameters *params,
- uint32_t *wm)
+ uint32_t *wm,
+ struct hsw_wm_maximums *lp_max_1_2)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
struct drm_plane *plane;
uint64_t sskpd = I915_READ64(MCH_SSKPD);
enum pipe pipe;
+ int pipes_active = 0, sprites_enabled = 0;
if ((sskpd >> 56) & 0xFF)
wm[0] = (sskpd >> 56) & 0xFF;
@@ -2278,6 +2363,8 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
if (!p->active)
continue;
+ pipes_active++;
+
p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
@@ -2297,25 +2384,89 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
p->sprite_enabled = intel_plane->wm.enable;
p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
+
+ if (p->sprite_enabled)
+ sprites_enabled++;
+ }
+
+ if (pipes_active > 1) {
+ lp_max_1_2->pri = sprites_enabled ? 128 : 256;
+ lp_max_1_2->spr = 128;
+ lp_max_1_2->cur = 64;
+ } else {
+ lp_max_1_2->pri = sprites_enabled ? 384 : 768;
+ lp_max_1_2->spr = 384;
+ lp_max_1_2->cur = 255;
}
+ lp_max_1_2->fbc = 15;
}
static void hsw_compute_wm_results(struct drm_device *dev,
struct hsw_pipe_wm_parameters *params,
uint32_t *wm,
+ struct hsw_wm_maximums *lp_maximums,
struct hsw_wm_values *results)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
+ struct hsw_lp_wm_result lp_results[4];
enum pipe pipe;
+ int i;
- /* No support for LP WMs yet. */
- results->wm_lp[2] = 0;
- results->wm_lp[1] = 0;
- results->wm_lp[0] = 0;
- results->wm_lp_spr[2] = 0;
- results->wm_lp_spr[1] = 0;
- results->wm_lp_spr[0] = 0;
+ hsw_compute_lp_wm(wm[1], lp_maximums, params, &lp_results[0]);
+ hsw_compute_lp_wm(wm[2], lp_maximums, params, &lp_results[1]);
+ hsw_compute_lp_wm(wm[3], lp_maximums, params, &lp_results[2]);
+ hsw_compute_lp_wm(wm[4], lp_maximums, params, &lp_results[3]);
+
+ /* The spec says it is preferred to disable FBC WMs instead of disabling
+ * a WM level. */
+ results->enable_fbc_wm = true;
+ for (i = 0; i < 4; i++) {
+ if (lp_results[i].enable && !lp_results[i].fbc_enable) {
+ results->enable_fbc_wm = false;
+ break;
+ }
+ }
+
+ if (lp_results[3].enable) {
+ results->wm_lp[2] = HSW_WM_LP_VAL(8, lp_results[3].fbc_val,
+ lp_results[3].pri_val,
+ lp_results[3].cur_val);
+ results->wm_lp_spr[2] = lp_results[3].spr_val;
+ } else if (lp_results[2].enable) {
+ results->wm_lp[2] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
+ lp_results[2].pri_val,
+ lp_results[2].cur_val);
+ results->wm_lp_spr[2] = lp_results[2].spr_val;
+ } else {
+ results->wm_lp[2] = 0;
+ results->wm_lp_spr[2] = 0;
+ }
+
+ if (lp_results[3].enable && lp_results[2].enable) {
+ results->wm_lp[1] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
+ lp_results[2].pri_val,
+ lp_results[2].cur_val);
+ results->wm_lp_spr[1] = lp_results[2].spr_val;
+ } else if (!lp_results[3].enable && lp_results[1].enable) {
+ results->wm_lp[1] = HSW_WM_LP_VAL(4, lp_results[1].fbc_val,
+ lp_results[1].pri_val,
+ lp_results[1].cur_val);
+ results->wm_lp_spr[1] = lp_results[1].spr_val;
+ } else {
+ results->wm_lp[1] = 0;
+ results->wm_lp_spr[1] = 0;
+ }
+
+ if (lp_results[0].enable) {
+ results->wm_lp[0] = HSW_WM_LP_VAL(2, lp_results[0].fbc_val,
+ lp_results[0].pri_val,
+ lp_results[0].cur_val);
+ results->wm_lp_spr[0] = lp_results[0].spr_val;
+ } else {
+ results->wm_lp[0] = 0;
+ results->wm_lp_spr[0] = 0;
+ }
for_each_pipe(pipe)
results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
@@ -2339,6 +2490,7 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
struct hsw_wm_values previous;
uint32_t val;
enum hsw_data_buf_partitioning prev_partitioning;
+ bool prev_enable_fbc_wm;
previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
@@ -2356,11 +2508,14 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
+ prev_enable_fbc_wm = !(I915_READ(DISP_ARB_CTL) & DISP_FBC_WM_DIS);
+
if (memcmp(results->wm_pipe, previous.wm_pipe, 3) == 0 &&
memcmp(results->wm_lp, previous.wm_lp, 3) == 0 &&
memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) == 0 &&
memcmp(results->wm_linetime, previous.wm_linetime, 3) == 0 &&
- partitioning == prev_partitioning)
+ partitioning == prev_partitioning &&
+ results->enable_fbc_wm == prev_enable_fbc_wm)
return;
if (previous.wm_lp[2] != 0)
@@ -2393,6 +2548,15 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
I915_WRITE(WM_MISC, val);
}
+ if (prev_enable_fbc_wm != results->enable_fbc_wm) {
+ val = I915_READ(DISP_ARB_CTL);
+ if (results->enable_fbc_wm)
+ val &= ~DISP_FBC_WM_DIS;
+ else
+ val |= DISP_FBC_WM_DIS;
+ I915_WRITE(DISP_ARB_CTL, val);
+ }
+
if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
@@ -2411,12 +2575,13 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
static void haswell_update_wm(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
+ struct hsw_wm_maximums lp_max_1_2;
struct hsw_pipe_wm_parameters params[3];
struct hsw_wm_values results;
uint32_t wm[5];
- hsw_compute_wm_parameters(dev, params, wm);
- hsw_compute_wm_results(dev, params, wm, &results);
+ hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
+ hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
}
--
1.8.1.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 3/5] drm/i915: properly set HSW WM_PIPE registers
2013-05-24 22:00 ` Paulo Zanoni
2013-05-24 22:02 ` Paulo Zanoni
@ 2013-05-27 11:07 ` Ville Syrjälä
2013-05-27 19:21 ` Paulo Zanoni
1 sibling, 1 reply; 29+ messages in thread
From: Ville Syrjälä @ 2013-05-27 11:07 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni
On Fri, May 24, 2013 at 07:00:42PM -0300, Paulo Zanoni wrote:
> 2013/5/24 Ville Syrjälä <ville.syrjala@linux.intel.com>:
> > On Fri, May 24, 2013 at 11:59:19AM -0300, Paulo Zanoni wrote:
> >> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >>
> >> We were previously calling sandybridge_update_wm on HSW, but the SNB
> >> function didn't really match the HSW specification, so we were just
> >> writing the wrong values.
> >>
> >> With this patch, the haswell_update_wm function will set the correct
> >> values for the WM_PIPE registers, but it will still keep all the LP
> >> watermarks disabled.
> >>
> >> The patch may look a little bit over-complicated for now, but it's
> >> because much of the infrastructure for setting the LP watermarks is
> >> already in place, so we won't have too much code churn on the patch
> >> that sets the LP watermarks.
> >>
> >> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >> ---
> >> drivers/gpu/drm/i915/i915_reg.h | 3 +
> >> drivers/gpu/drm/i915/intel_pm.c | 340 +++++++++++++++++++++++++++++++++++++---
> >> 2 files changed, 325 insertions(+), 18 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> >> index 55caedb..e86606c 100644
> >> --- a/drivers/gpu/drm/i915/i915_reg.h
> >> +++ b/drivers/gpu/drm/i915/i915_reg.h
> >> @@ -4938,6 +4938,9 @@
> >> #define SFUSE_STRAP_DDIC_DETECTED (1<<1)
> >> #define SFUSE_STRAP_DDID_DETECTED (1<<0)
> >>
> >> +#define WM_MISC 0x45260
> >> +#define WM_MISC_DATA_PARTITION_5_6 (1 << 0)
> >> +
> >> #define WM_DBG 0x45280
> >> #define WM_DBG_DISALLOW_MULTIPLE_LP (1<<0)
> >> #define WM_DBG_DISALLOW_MAXFIFO (1<<1)
> >> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> >> index 0b61a0e..2ee1d01 100644
> >> --- a/drivers/gpu/drm/i915/intel_pm.c
> >> +++ b/drivers/gpu/drm/i915/intel_pm.c
> >> @@ -2072,19 +2072,173 @@ static void ivybridge_update_wm(struct drm_device *dev)
> >> cursor_wm);
> >> }
> >>
> >> -static void
> >> -haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> >> +static uint32_t hsw_wm_get_pixel_rate(struct drm_device *dev,
> >> + struct drm_crtc *crtc)
> >> +{
> >> + struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> >> + uint32_t pixel_rate, pfit_size;
> >> +
> >> + if (intel_crtc->config.pixel_target_clock)
> >> + pixel_rate = intel_crtc->config.pixel_target_clock;
> >> + else
> >> + pixel_rate = intel_crtc->config.adjusted_mode.clock;
> >> +
> >> + /* We only use IF-ID interlacing. If we ever use PF-ID we'll need to
> >> + * adjust the pixel_rate here. */
> >> +
> >> + pfit_size = intel_crtc->config.pch_pfit.size;
> >> + if (pfit_size) {
> >> + uint64_t x, y, crtc_x, crtc_y, hscale, vscale, totscale;
> >> +
> >> + x = (pfit_size >> 16) & 0xFFFF;
> >> + y = pfit_size & 0xFFFF;
> >> + crtc_x = intel_crtc->config.adjusted_mode.hdisplay;
> >> + crtc_y = intel_crtc->config.adjusted_mode.vdisplay;
> >> +
> >> + hscale = crtc_x << 16;
> >> + vscale = crtc_y << 16;
> >> + do_div(hscale, x);
> >> + do_div(vscale, y);
> >> + hscale = (hscale < (1 << 16)) ? (1 << 16) : hscale;
> >> + vscale = (vscale < (1 << 16)) ? (1 << 16) : vscale;
> >> + totscale = (hscale * vscale) >> 16;
> >> + pixel_rate = (pixel_rate * totscale) >> 16;
> >
> > No need for fixed point math if you go 64bits, and as stated before
> > the scaling ratio is still being miscalculated due to the use of
> > adjusted_mode.
> >
> > Something like this ought to do it:
> >
> > in_w = req_mode.hdisplay;
> > in_h = req_mode.vdisplay;
> > out_w = (pfit_size >> 16) & 0xffff;
> > out_h = pfit_size & 0xffff;
> > if (in_w <= out_w)
> > in_w = out_w;
> > if (in_h <= out_h)
> > in_h = out_h;
> >
> > pixel_rate = div_u64((uint64_t) pixel_rate * in_w * in_h, out_w * out_h);
>
> Ok, I re-checked and you were right. Fixed. Sorry for insisting :(
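
A standalone sketch of the panel fitter scaling suggested above, for illustration only. The function name scaled_pixel_rate and the sample numbers are assumptions, and plain 64-bit division stands in for the kernel's div_u64(). The input size is clamped so that upscaling never lowers the rate, and the multiplication is done in 64 bits so the product does not overflow for realistic mode sizes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scale the pixel rate by the ratio of pipe source size (in_w x in_h)
 * to panel fitter window size (out_w x out_h).
 */
static uint32_t scaled_pixel_rate(uint32_t pixel_rate,
				  uint32_t in_w, uint32_t in_h,
				  uint32_t out_w, uint32_t out_h)
{
	assert(out_w != 0 && out_h != 0);

	/* Upscaling must not reduce the rate: clamp each ratio to >= 1. */
	if (in_w < out_w)
		in_w = out_w;
	if (in_h < out_h)
		in_h = out_h;

	/* 64-bit intermediate for rate * width * height. */
	return (uint32_t)(((uint64_t)pixel_rate * in_w * in_h) /
			  ((uint64_t)out_w * out_h));
}
```

For example, downscaling a 1920x1080 source into a 1280x720 window multiplies the rate by 2.25, while the reverse (upscaling) leaves it unchanged.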
>
>
> >
> >> + }
> >> +
> >> + return pixel_rate;
> >> +}
> >> +
> >> +static uint32_t hsw_wm_method1(uint32_t pixel_rate, uint8_t bytes_per_pixel,
> >> + uint32_t latency)
> >> +{
> >> + uint64_t tmp;
> >> + uint32_t ret;
> >> +
> >> + tmp = pixel_rate * bytes_per_pixel * latency;
> >
> > Would need a cast to make the multiplications actually 64bit. 'ret' is
> > also pointless.
>
> Oops... Fixed.
>
>
> >
> >> + ret = DIV_ROUND_UP_ULL(tmp, 64 * 10000) + 2;
> >> +
> >> + return ret;
> >> +}
> >> +
> >> +static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
> >> + uint32_t horiz_pixels, uint8_t bytes_per_pixel,
> >> + uint32_t latency)
> >> +{
> >> + uint32_t ret;
> >> +
> >> + ret = DIV_ROUND_UP(pipe_htotal * 1000, pixel_rate);
> >> + ret = ((latency / (ret * 10)) + 1) * horiz_pixels * bytes_per_pixel;
> >
> > w/ 64bit maths this could be:
> >
> > tmp = (uint64_t) latency * pixel_rate * 100;
> > ret = (div_u64(tmp, pipe_htotal) + 1) * horiz_pixels * bytes_per_pixel
>
> I did the math on paper and your formula doesn't look correct. For
> latency=10 rate=120000 pipe_htotal=2000 horiz_pixels=1500 bpp=4 the
> correct value should be 96, but your formula gives me a really huge
> value.
Yeah looks like I had the *100 at the wrong place.
> Besides, I like having the formula match BSpec exactly. And I
> can't see how the current code would give us overflows, that's why I
> kept it using uint32_t.
It doesn't overflow, but it does round up the line time value,
which the bspec formula doesn't do.
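
To make the numbers in this exchange easy to check, here is a userspace sketch of the two methods as they appear in the v2 patch. The names wm_method1 and wm_method2 are shortened for illustration, and DIV_ROUND_UP is redefined locally as an assumption about the kernel macro's behaviour. With latency=10, rate=120000, htotal=2000, horiz_pixels=1500 and bpp=4, the intermediate line time is rounded up from 16.67 to 17 (the rounding mentioned above), and method 2 yields 96.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel macro: integer division rounding up. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Method 1: bytes fetched during the latency period, in 64-byte units, plus 2. */
static uint32_t wm_method1(uint32_t pixel_rate, uint8_t bytes_per_pixel,
			   uint32_t latency)
{
	uint64_t ret = (uint64_t)pixel_rate * bytes_per_pixel * latency;

	return (uint32_t)(DIV_ROUND_UP(ret, (uint64_t)64 * 10000) + 2);
}

/* Method 2: whole lines covered by the latency period, in 64-byte units, plus 2. */
static uint32_t wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
			   uint32_t horiz_pixels, uint8_t bytes_per_pixel,
			   uint32_t latency)
{
	uint32_t line_time, ret;

	assert(pixel_rate != 0);

	/* Line time, rounded up (2000 * 1000 / 120000 becomes 17). */
	line_time = DIV_ROUND_UP(pipe_htotal * 1000, pixel_rate);
	ret = ((latency / (line_time * 10)) + 1) * horiz_pixels * bytes_per_pixel;

	return DIV_ROUND_UP(ret, 64) + 2;
}
```

For the same inputs, method 1 gives 10, so a caller taking min(method1, method2) would pick the method 1 value.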
> >> + ret = DIV_ROUND_UP(ret, 64) + 2;
> >> + return ret;
> >> +}
> >> +
> >> +struct hsw_pipe_wm_parameters {
> >> + bool active;
> >> + bool sprite_enabled;
> >> + uint8_t pri_bytes_per_pixel;
> >> + uint8_t spr_bytes_per_pixel;
> >> + uint8_t cur_bytes_per_pixel;
> >> + uint32_t pri_horiz_pixels;
> >> + uint32_t spr_horiz_pixels;
> >> + uint32_t cur_horiz_pixels;
> >> + uint32_t pipe_htotal;
> >> + uint32_t pixel_rate;
> >> +};
> >> +
> >> +struct hsw_wm_values {
> >> + uint32_t wm_pipe[3];
> >> + uint32_t wm_lp[3];
> >> + uint32_t wm_lp_spr[3];
> >> + uint32_t wm_linetime[3];
> >> +};
> >> +
> >> +enum hsw_data_buf_partitioning {
> >> + HSW_DATA_BUF_PART_1_2,
> >> + HSW_DATA_BUF_PART_5_6,
> >> +};
> >> +
> >> +/* Only for WM_PIPE. */
> >> +static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
> >> + uint32_t mem_value)
> >> +{
> >> + /* TODO: for now, assume the primary plane is always enabled. */
> >> + if (!params->active)
> >> + return 0;
> >> +
> >> + return hsw_wm_method1(params->pixel_rate,
> >> + params->pri_bytes_per_pixel,
> >> + mem_value);
> >> +}
> >> +
> >> +/* For both WM_PIPE and WM_LP. */
> >> +static uint32_t hsw_compute_spr_wm(struct hsw_pipe_wm_parameters *params,
> >> + uint32_t mem_value)
> >> +{
> >> + uint32_t method1, method2;
> >> +
> >> + if (!params->active || !params->sprite_enabled)
> >> + return 0;
> >> +
> >> + method1 = hsw_wm_method1(params->pixel_rate,
> >> + params->spr_bytes_per_pixel,
> >> + mem_value);
> >> + method2 = hsw_wm_method2(params->pixel_rate,
> >> + params->pipe_htotal,
> >> + params->spr_horiz_pixels,
> >> + params->spr_bytes_per_pixel,
> >> + mem_value);
> >> + return min(method1, method2);
> >> +}
> >> +
> >> +/* For both WM_PIPE and WM_LP. */
> >> +static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
> >> + uint32_t mem_value)
> >> +{
> >> + if (!params->active)
> >> + return 0;
> >> +
> >> + return hsw_wm_method2(params->pixel_rate,
> >> + params->pipe_htotal,
> >> + params->cur_horiz_pixels,
> >> + params->cur_bytes_per_pixel,
> >> + mem_value);
> >> +}
> >> +
> >> +static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
> >> + uint32_t mem_value, enum pipe pipe,
> >> + struct hsw_pipe_wm_parameters *params)
> >> +{
> >> + uint32_t pri_val, cur_val, spr_val;
> >> +
> >> + pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
> >> + spr_val = hsw_compute_spr_wm(params, mem_value);
> >> + cur_val = hsw_compute_cur_wm(params, mem_value);
> >> +
> >> + WARN(pri_val > 127,
> >> + "Primary WM error, mode not supported for pipe %c\n",
> >> + pipe_name(pipe));
> >> + WARN(spr_val > 127,
> >> + "Sprite WM error, mode not supported for pipe %c\n",
> >> + pipe_name(pipe));
> >> + WARN(cur_val > 63,
> >> + "Cursor WM error, mode not supported for pipe %c\n",
> >> + pipe_name(pipe));
> >> +
> >> + return (pri_val << WM0_PIPE_PLANE_SHIFT) |
> >> + (spr_val << WM0_PIPE_SPRITE_SHIFT) |
> >> + cur_val;
> >> +}
> >> +
> >> +static uint32_t
> >> +hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> >> {
> >> struct drm_i915_private *dev_priv = dev->dev_private;
> >> struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> >> - enum pipe pipe = intel_crtc->pipe;
> >> struct drm_display_mode *mode = &intel_crtc->config.adjusted_mode;
> >> u32 linetime, ips_linetime;
> >>
> >> - if (!intel_crtc_active(crtc)) {
> >> - I915_WRITE(PIPE_WM_LINETIME(pipe), 0);
> >> - return;
> >> - }
> >> + if (!intel_crtc_active(crtc))
> >> + return 0;
> >>
> >> /* The WM are computed with base on how long it takes to fill a single
> >> * row at the given clock rate, multiplied by 8.
> >> @@ -2093,29 +2247,179 @@ haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> >> ips_linetime = DIV_ROUND_CLOSEST(mode->htotal * 1000 * 8,
> >> intel_ddi_get_cdclk_freq(dev_priv));
> >>
> >> - I915_WRITE(PIPE_WM_LINETIME(pipe),
> >> - PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
> >> - PIPE_WM_LINETIME_TIME(linetime));
> >> + return PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
> >> + PIPE_WM_LINETIME_TIME(linetime);
> >> }
> >>
> >> -static void haswell_update_wm(struct drm_device *dev)
> >> +static void hsw_compute_wm_parameters(struct drm_device *dev,
> >> + struct hsw_pipe_wm_parameters *params,
> >> + uint32_t *wm)
> >> {
> >> struct drm_i915_private *dev_priv = dev->dev_private;
> >> struct drm_crtc *crtc;
> >> + struct drm_plane *plane;
> >> + uint64_t sskpd = I915_READ64(MCH_SSKPD);
> >> enum pipe pipe;
> >>
> >> - /* Disable the LP WMs before changine the linetime registers. This is
> >> - * just a temporary code that will be replaced soon. */
> >> - I915_WRITE(WM3_LP_ILK, 0);
> >> - I915_WRITE(WM2_LP_ILK, 0);
> >> - I915_WRITE(WM1_LP_ILK, 0);
> >> + if ((sskpd >> 56) & 0xFF)
> >> + wm[0] = (sskpd >> 56) & 0xFF;
> >> + else
> >> + wm[0] = sskpd & 0xF;
> >> + wm[1] = ((sskpd >> 4) & 0xFF) * 5;
> >> + wm[2] = ((sskpd >> 12) & 0xFF) * 5;
> >> + wm[3] = ((sskpd >> 20) & 0x1FF) * 5;
> >> + wm[4] = ((sskpd >> 32) & 0x1FF) * 5;
> >> +
> >> + list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
> >> + struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> >> + struct hsw_pipe_wm_parameters *p;
> >> +
> >> + pipe = intel_crtc->pipe;
> >> + p = &params[pipe];
> >> +
> >> + p->active = intel_crtc_active(crtc);
> >> + if (!p->active)
> >> + continue;
> >> +
> >> + p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
> >> + p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
> >> + p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
> >> + p->cur_bytes_per_pixel = 4;
> >> + p->pri_horiz_pixels = intel_crtc->config.adjusted_mode.hdisplay;
> >> + p->cur_horiz_pixels = 64;
> >> + }
> >> +
> >> + list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
> >> + struct intel_plane *intel_plane = to_intel_plane(plane);
> >> + struct hsw_pipe_wm_parameters *p;
> >> +
> >> + pipe = intel_plane->pipe;
> >> + p = &params[pipe];
> >> +
> >> + p->sprite_enabled = intel_plane->wm.enable;
> >> + p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
> >> + p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
> >> + }
> >> +}
> >> +
> >> +static void hsw_compute_wm_results(struct drm_device *dev,
> >> + struct hsw_pipe_wm_parameters *params,
> >> + uint32_t *wm,
> >> + struct hsw_wm_values *results)
> >> +{
> >> + struct drm_i915_private *dev_priv = dev->dev_private;
> >> + struct drm_crtc *crtc;
> >> + enum pipe pipe;
> >> +
> >> + /* No support for LP WMs yet. */
> >> + results->wm_lp[2] = 0;
> >> + results->wm_lp[1] = 0;
> >> + results->wm_lp[0] = 0;
> >> + results->wm_lp_spr[2] = 0;
> >> + results->wm_lp_spr[1] = 0;
> >> + results->wm_lp_spr[0] = 0;
> >> +
> >> + for_each_pipe(pipe)
> >> + results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
> >> + pipe,
> >> + &params[pipe]);
> >>
> >> for_each_pipe(pipe) {
> >> crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> >> - haswell_update_linetime_wm(dev, crtc);
> >> + results->wm_linetime[pipe] = hsw_compute_linetime_wm(dev, crtc);
> >> }
> >> +}
> >> +
> >> +/*
> >> + * The spec says we shouldn't write when we don't need, because every write
> >> + * causes WMs to be re-evaluated, expending some power.
> >> + */
> >> +static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> >> + struct hsw_wm_values *results,
> >> + enum hsw_data_buf_partitioning partitioning)
> >> +{
> >> + struct hsw_wm_values previous;
> >> + uint32_t val;
> >> + enum hsw_data_buf_partitioning prev_partitioning;
> >> +
> >> + previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
> >> + previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
> >> + previous.wm_pipe[2] = I915_READ(WM0_PIPEC_IVB);
> >> + previous.wm_lp[0] = I915_READ(WM1_LP_ILK);
> >> + previous.wm_lp[1] = I915_READ(WM2_LP_ILK);
> >> + previous.wm_lp[2] = I915_READ(WM3_LP_ILK);
> >> + previous.wm_lp_spr[0] = I915_READ(WM1S_LP_ILK);
> >> + previous.wm_lp_spr[1] = I915_READ(WM2S_LP_IVB);
> >> + previous.wm_lp_spr[2] = I915_READ(WM3S_LP_IVB);
> >> + previous.wm_linetime[0] = I915_READ(PIPE_WM_LINETIME(PIPE_A));
> >> + previous.wm_linetime[1] = I915_READ(PIPE_WM_LINETIME(PIPE_B));
> >> + previous.wm_linetime[2] = I915_READ(PIPE_WM_LINETIME(PIPE_C));
> >> +
> >> + prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
> >> + HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
> >> +
> >> + if (memcmp(results->wm_pipe, previous.wm_pipe, 3) == 0 &&
> >> + memcmp(results->wm_lp, previous.wm_lp, 3) == 0 &&
> >> + memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) == 0 &&
> >> + memcmp(results->wm_linetime, previous.wm_linetime, 3) == 0 &&
> >> + partitioning == prev_partitioning)
> >> + return;
> >> +
> >> + if (previous.wm_lp[2] != 0)
> >> + I915_WRITE(WM3_LP_ILK, 0);
> >> + if (previous.wm_lp[1] != 0)
> >> + I915_WRITE(WM2_LP_ILK, 0);
> >> + if (previous.wm_lp[0] != 0)
> >> + I915_WRITE(WM1_LP_ILK, 0);
> >
> > I don't know if this conditional writing makes sense in such a fine
> > granularity. We're anyway going to write some of the registers, so
> > maybe it's better to just go ahead and write all of them. It would
> > at least make the code look a bit better.
>
> The documentation says "Do not write the watermark registers when
> there is no need to change a value, as every write will cause the
> watermarks to be re-evaluated, expending some power.".
That comment is quite ambiguous. It doesn't really tell you whether
there's a significant difference in writing just one register or all
of them.
> I do recognize
> the function looks a little bit ugly, but I think it's worth the cost,
> especially since I imagine we're not going to change it too much in
> the future.
How can you know if it's worth the cost if you haven't measured it?
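As a rough way to compare the two strategies, the sketch below counts
writes against a fake register file. Everything here is hypothetical
scaffolding (struct fake_regs, reg_write() and both write helpers are
not driver code), and it only counts writes; it says nothing about the
actual power cost, which would still need measuring on hardware.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS 3

/* Hypothetical stand-in for MMIO: register values plus a write counter. */
struct fake_regs {
	uint32_t val[NUM_REGS];
	unsigned int writes;
};

static void reg_write(struct fake_regs *r, int idx, uint32_t v)
{
	r->val[idx] = v;
	r->writes++;
}

/* Fine-grained: touch only the registers whose value actually changes. */
static void write_changed(struct fake_regs *r, const uint32_t *want)
{
	int i;

	for (i = 0; i < NUM_REGS; i++)
		if (r->val[i] != want[i])
			reg_write(r, i, want[i]);
}

/* Coarse: bail out if nothing changed, otherwise rewrite everything. */
static void write_all_if_any_changed(struct fake_regs *r, const uint32_t *want)
{
	int i;

	/*
	 * memcmp() takes a byte count, so pass sizeof(r->val); a count of 3
	 * would compare only the first three bytes of the array.
	 */
	if (memcmp(r->val, want, sizeof(r->val)) == 0)
		return;

	for (i = 0; i < NUM_REGS; i++)
		reg_write(r, i, want[i]);
}
```

With one of three values changed, the fine-grained helper issues one
write and the coarse one issues three; with nothing changed both issue
none.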
> > In any case you'd at least need to make sure that you disable/re-enable
> > the LP1+ watermarks if linetime WMs or DDB partitioning changes,
> > regardless of whether the LP1+ watermarks themselves changed.
>
> We already do this.
Oh right. My eyes glazed over when looking at that function, so I
misread what it did.
>
> Thanks again for the review,
> Paulo
>
>
> >
> >> +
> >> + if (previous.wm_pipe[0] != results->wm_pipe[0])
> >> + I915_WRITE(WM0_PIPEA_ILK, results->wm_pipe[0]);
> >> + if (previous.wm_pipe[1] != results->wm_pipe[1])
> >> + I915_WRITE(WM0_PIPEB_ILK, results->wm_pipe[1]);
> >> + if (previous.wm_pipe[2] != results->wm_pipe[2])
> >> + I915_WRITE(WM0_PIPEC_IVB, results->wm_pipe[2]);
> >> +
> >> + if (previous.wm_linetime[0] != results->wm_linetime[0])
> >> + I915_WRITE(PIPE_WM_LINETIME(PIPE_A), results->wm_linetime[0]);
> >> + if (previous.wm_linetime[1] != results->wm_linetime[1])
> >> + I915_WRITE(PIPE_WM_LINETIME(PIPE_B), results->wm_linetime[1]);
> >> + if (previous.wm_linetime[2] != results->wm_linetime[2])
> >> + I915_WRITE(PIPE_WM_LINETIME(PIPE_C), results->wm_linetime[2]);
> >> +
> >> + if (prev_partitioning != partitioning) {
> >> + val = I915_READ(WM_MISC);
> >> + if (partitioning == HSW_DATA_BUF_PART_1_2)
> >> + val &= ~WM_MISC_DATA_PARTITION_5_6;
> >> + else
> >> + val |= WM_MISC_DATA_PARTITION_5_6;
> >> + I915_WRITE(WM_MISC, val);
> >> + }
> >> +
> >> + if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
> >> + I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
> >> + if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
> >> + I915_WRITE(WM2S_LP_IVB, results->wm_lp_spr[1]);
> >> + if (previous.wm_lp_spr[2] != results->wm_lp_spr[2])
> >> + I915_WRITE(WM3S_LP_IVB, results->wm_lp_spr[2]);
> >> +
> >> + if (results->wm_lp[0] != 0)
> >> + I915_WRITE(WM1_LP_ILK, results->wm_lp[0]);
> >> + if (results->wm_lp[1] != 0)
> >> + I915_WRITE(WM2_LP_ILK, results->wm_lp[1]);
> >> + if (results->wm_lp[2] != 0)
> >> + I915_WRITE(WM3_LP_ILK, results->wm_lp[2]);
> >> +}
> >> +
> >> +static void haswell_update_wm(struct drm_device *dev)
> >> +{
> >> + struct drm_i915_private *dev_priv = dev->dev_private;
> >> + struct hsw_pipe_wm_parameters params[3];
> >> + struct hsw_wm_values results;
> >> + uint32_t wm[5];
> >>
> >> - sandybridge_update_wm(dev);
> >> + hsw_compute_wm_parameters(dev, params, wm);
> >> + hsw_compute_wm_results(dev, params, wm, &results);
> >> + hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
> >> }
> >>
> >> static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
> >> --
> >> 1.8.1.2
> >>
> >> _______________________________________________
> >> Intel-gfx mailing list
> >> Intel-gfx@lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >
> > --
> > Ville Syrjälä
> > Intel OTC
>
>
>
> --
> Paulo Zanoni
--
Ville Syrjälä
Intel OTC
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH 3/5] drm/i915: properly set HSW WM_PIPE registers
2013-05-27 11:07 ` Ville Syrjälä
@ 2013-05-27 19:21 ` Paulo Zanoni
2013-05-29 15:39 ` Ville Syrjälä
0 siblings, 1 reply; 29+ messages in thread
From: Paulo Zanoni @ 2013-05-27 19:21 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Paulo Zanoni <paulo.r.zanoni@intel.com>
We were previously calling sandybridge_update_wm on HSW, but the SNB
function didn't really match the HSW specification, so we were just
writing the wrong values.
With this patch, the haswell_update_wm function will set the correct
values for the WM_PIPE registers, but it will still keep all the LP
watermarks disabled.
The patch may look a little bit over-complicated for now, but it's
because much of the infrastructure for setting the LP watermarks is
already in place, so we won't have too much code churn on the patch
that sets the LP watermarks.
v2: - Fix pixel_rate on panel fitter case (Ville)
- Try to not overflow (Ville)
- Remove useless variable (Ville)
- Fix p->pri_horiz_pixels (Paulo)
v3: - Fix rounding errors on hsw_wm_method2 (Ville)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 3 +
drivers/gpu/drm/i915/intel_pm.c | 338 +++++++++++++++++++++++++++++++++++++---
2 files changed, 323 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index dbd9de5..5a49f8a 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -4931,6 +4931,9 @@
#define SFUSE_STRAP_DDIC_DETECTED (1<<1)
#define SFUSE_STRAP_DDID_DETECTED (1<<0)
+#define WM_MISC 0x45260
+#define WM_MISC_DATA_PARTITION_5_6 (1 << 0)
+
#define WM_DBG 0x45280
#define WM_DBG_DISALLOW_MULTIPLE_LP (1<<0)
#define WM_DBG_DISALLOW_MAXFIFO (1<<1)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 9328ed9..5460409 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2072,19 +2072,170 @@ static void ivybridge_update_wm(struct drm_device *dev)
cursor_wm);
}
-static void
-haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
+static uint32_t hsw_wm_get_pixel_rate(struct drm_device *dev,
+ struct drm_crtc *crtc)
+{
+ struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+ uint32_t pixel_rate, pfit_size;
+
+ if (intel_crtc->config.pixel_target_clock)
+ pixel_rate = intel_crtc->config.pixel_target_clock;
+ else
+ pixel_rate = intel_crtc->config.adjusted_mode.clock;
+
+ /* We only use IF-ID interlacing. If we ever use PF-ID we'll need to
+ * adjust the pixel_rate here. */
+
+ pfit_size = intel_crtc->config.pch_pfit.size;
+ if (pfit_size) {
+ uint64_t pipe_w, pipe_h, pfit_w, pfit_h;
+
+ pipe_w = intel_crtc->config.requested_mode.hdisplay;
+ pipe_h = intel_crtc->config.requested_mode.vdisplay;
+ pfit_w = (pfit_size >> 16) & 0xFFFF;
+ pfit_h = pfit_size & 0xFFFF;
+ if (pipe_w < pfit_w)
+ pipe_w = pfit_w;
+ if (pipe_h < pfit_h)
+ pipe_h = pfit_h;
+
+ pixel_rate = div_u64((uint64_t) pixel_rate * pipe_w * pipe_h,
+ pfit_w * pfit_h);
+ }
+
+ return pixel_rate;
+}
+
+static uint32_t hsw_wm_method1(uint32_t pixel_rate, uint8_t bytes_per_pixel,
+ uint32_t latency)
+{
+ uint64_t ret;
+
+ ret = (uint64_t) pixel_rate * bytes_per_pixel * latency;
+ ret = DIV_ROUND_UP_ULL(ret, 64 * 10000) + 2;
+
+ return ret;
+}
+
+static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
+ uint32_t horiz_pixels, uint8_t bytes_per_pixel,
+ uint32_t latency)
+{
+ uint32_t ret;
+
+ ret = (latency * pixel_rate) / (pipe_htotal * 10000);
+ ret = (ret + 1) * horiz_pixels * bytes_per_pixel;
+ ret = DIV_ROUND_UP(ret, 64) + 2;
+ return ret;
+}
+
+struct hsw_pipe_wm_parameters {
+ bool active;
+ bool sprite_enabled;
+ uint8_t pri_bytes_per_pixel;
+ uint8_t spr_bytes_per_pixel;
+ uint8_t cur_bytes_per_pixel;
+ uint32_t pri_horiz_pixels;
+ uint32_t spr_horiz_pixels;
+ uint32_t cur_horiz_pixels;
+ uint32_t pipe_htotal;
+ uint32_t pixel_rate;
+};
+
+struct hsw_wm_values {
+ uint32_t wm_pipe[3];
+ uint32_t wm_lp[3];
+ uint32_t wm_lp_spr[3];
+ uint32_t wm_linetime[3];
+};
+
+enum hsw_data_buf_partitioning {
+ HSW_DATA_BUF_PART_1_2,
+ HSW_DATA_BUF_PART_5_6,
+};
+
+/* Only for WM_PIPE. */
+static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value)
+{
+ /* TODO: for now, assume the primary plane is always enabled. */
+ if (!params->active)
+ return 0;
+
+ return hsw_wm_method1(params->pixel_rate,
+ params->pri_bytes_per_pixel,
+ mem_value);
+}
+
+/* For both WM_PIPE and WM_LP. */
+static uint32_t hsw_compute_spr_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value)
+{
+ uint32_t method1, method2;
+
+ if (!params->active || !params->sprite_enabled)
+ return 0;
+
+ method1 = hsw_wm_method1(params->pixel_rate,
+ params->spr_bytes_per_pixel,
+ mem_value);
+ method2 = hsw_wm_method2(params->pixel_rate,
+ params->pipe_htotal,
+ params->spr_horiz_pixels,
+ params->spr_bytes_per_pixel,
+ mem_value);
+ return min(method1, method2);
+}
+
+/* For both WM_PIPE and WM_LP. */
+static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value)
+{
+ if (!params->active)
+ return 0;
+
+ return hsw_wm_method2(params->pixel_rate,
+ params->pipe_htotal,
+ params->cur_horiz_pixels,
+ params->cur_bytes_per_pixel,
+ mem_value);
+}
+
+static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
+ uint32_t mem_value, enum pipe pipe,
+ struct hsw_pipe_wm_parameters *params)
+{
+ uint32_t pri_val, cur_val, spr_val;
+
+ pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
+ spr_val = hsw_compute_spr_wm(params, mem_value);
+ cur_val = hsw_compute_cur_wm(params, mem_value);
+
+ WARN(pri_val > 127,
+ "Primary WM error, mode not supported for pipe %c\n",
+ pipe_name(pipe));
+ WARN(spr_val > 127,
+ "Sprite WM error, mode not supported for pipe %c\n",
+ pipe_name(pipe));
+ WARN(cur_val > 63,
+ "Cursor WM error, mode not supported for pipe %c\n",
+ pipe_name(pipe));
+
+ return (pri_val << WM0_PIPE_PLANE_SHIFT) |
+ (spr_val << WM0_PIPE_SPRITE_SHIFT) |
+ cur_val;
+}
+
+static uint32_t
+hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
- enum pipe pipe = intel_crtc->pipe;
struct drm_display_mode *mode = &intel_crtc->config.adjusted_mode;
u32 linetime, ips_linetime;
- if (!intel_crtc_active(crtc)) {
- I915_WRITE(PIPE_WM_LINETIME(pipe), 0);
- return;
- }
+ if (!intel_crtc_active(crtc))
+ return 0;
/* The WM are computed with base on how long it takes to fill a single
* row at the given clock rate, multiplied by 8.
@@ -2093,29 +2244,180 @@ haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
ips_linetime = DIV_ROUND_CLOSEST(mode->htotal * 1000 * 8,
intel_ddi_get_cdclk_freq(dev_priv));
- I915_WRITE(PIPE_WM_LINETIME(pipe),
- PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
- PIPE_WM_LINETIME_TIME(linetime));
+ return PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
+ PIPE_WM_LINETIME_TIME(linetime);
}
-static void haswell_update_wm(struct drm_device *dev)
+static void hsw_compute_wm_parameters(struct drm_device *dev,
+ struct hsw_pipe_wm_parameters *params,
+ uint32_t *wm)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
+ struct drm_plane *plane;
+ uint64_t sskpd = I915_READ64(MCH_SSKPD);
enum pipe pipe;
- /* Disable the LP WMs before changine the linetime registers. This is
- * just a temporary code that will be replaced soon. */
- I915_WRITE(WM3_LP_ILK, 0);
- I915_WRITE(WM2_LP_ILK, 0);
- I915_WRITE(WM1_LP_ILK, 0);
+ if ((sskpd >> 56) & 0xFF)
+ wm[0] = (sskpd >> 56) & 0xFF;
+ else
+ wm[0] = sskpd & 0xF;
+ wm[1] = ((sskpd >> 4) & 0xFF) * 5;
+ wm[2] = ((sskpd >> 12) & 0xFF) * 5;
+ wm[3] = ((sskpd >> 20) & 0x1FF) * 5;
+ wm[4] = ((sskpd >> 32) & 0x1FF) * 5;
+
+ list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+ struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+ struct hsw_pipe_wm_parameters *p;
+
+ pipe = intel_crtc->pipe;
+ p = &params[pipe];
+
+ p->active = intel_crtc_active(crtc);
+ if (!p->active)
+ continue;
+
+ p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
+ p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
+ p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
+ p->cur_bytes_per_pixel = 4;
+ p->pri_horiz_pixels =
+ intel_crtc->config.requested_mode.hdisplay;
+ p->cur_horiz_pixels = 64;
+ }
+
+ list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
+ struct intel_plane *intel_plane = to_intel_plane(plane);
+ struct hsw_pipe_wm_parameters *p;
+
+ pipe = intel_plane->pipe;
+ p = &params[pipe];
+
+ p->sprite_enabled = intel_plane->wm.enable;
+ p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
+ p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
+ }
+}
+
+static void hsw_compute_wm_results(struct drm_device *dev,
+ struct hsw_pipe_wm_parameters *params,
+ uint32_t *wm,
+ struct hsw_wm_values *results)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_crtc *crtc;
+ enum pipe pipe;
+
+ /* No support for LP WMs yet. */
+ results->wm_lp[2] = 0;
+ results->wm_lp[1] = 0;
+ results->wm_lp[0] = 0;
+ results->wm_lp_spr[2] = 0;
+ results->wm_lp_spr[1] = 0;
+ results->wm_lp_spr[0] = 0;
+
+ for_each_pipe(pipe)
+ results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
+ pipe,
+ &params[pipe]);
for_each_pipe(pipe) {
crtc = dev_priv->pipe_to_crtc_mapping[pipe];
- haswell_update_linetime_wm(dev, crtc);
+ results->wm_linetime[pipe] = hsw_compute_linetime_wm(dev, crtc);
}
+}
+
+/*
+ * The spec says we shouldn't write when we don't need, because every write
+ * causes WMs to be re-evaluated, expending some power.
+ */
+static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
+ struct hsw_wm_values *results,
+ enum hsw_data_buf_partitioning partitioning)
+{
+ struct hsw_wm_values previous;
+ uint32_t val;
+ enum hsw_data_buf_partitioning prev_partitioning;
+
+ previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
+ previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
+ previous.wm_pipe[2] = I915_READ(WM0_PIPEC_IVB);
+ previous.wm_lp[0] = I915_READ(WM1_LP_ILK);
+ previous.wm_lp[1] = I915_READ(WM2_LP_ILK);
+ previous.wm_lp[2] = I915_READ(WM3_LP_ILK);
+ previous.wm_lp_spr[0] = I915_READ(WM1S_LP_ILK);
+ previous.wm_lp_spr[1] = I915_READ(WM2S_LP_IVB);
+ previous.wm_lp_spr[2] = I915_READ(WM3S_LP_IVB);
+ previous.wm_linetime[0] = I915_READ(PIPE_WM_LINETIME(PIPE_A));
+ previous.wm_linetime[1] = I915_READ(PIPE_WM_LINETIME(PIPE_B));
+ previous.wm_linetime[2] = I915_READ(PIPE_WM_LINETIME(PIPE_C));
+
+ prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
+ HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
+
+ if (memcmp(results->wm_pipe, previous.wm_pipe, 3) == 0 &&
+ memcmp(results->wm_lp, previous.wm_lp, 3) == 0 &&
+ memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) == 0 &&
+ memcmp(results->wm_linetime, previous.wm_linetime, 3) == 0 &&
+ partitioning == prev_partitioning)
+ return;
+
+ if (previous.wm_lp[2] != 0)
+ I915_WRITE(WM3_LP_ILK, 0);
+ if (previous.wm_lp[1] != 0)
+ I915_WRITE(WM2_LP_ILK, 0);
+ if (previous.wm_lp[0] != 0)
+ I915_WRITE(WM1_LP_ILK, 0);
+
+ if (previous.wm_pipe[0] != results->wm_pipe[0])
+ I915_WRITE(WM0_PIPEA_ILK, results->wm_pipe[0]);
+ if (previous.wm_pipe[1] != results->wm_pipe[1])
+ I915_WRITE(WM0_PIPEB_ILK, results->wm_pipe[1]);
+ if (previous.wm_pipe[2] != results->wm_pipe[2])
+ I915_WRITE(WM0_PIPEC_IVB, results->wm_pipe[2]);
+
+ if (previous.wm_linetime[0] != results->wm_linetime[0])
+ I915_WRITE(PIPE_WM_LINETIME(PIPE_A), results->wm_linetime[0]);
+ if (previous.wm_linetime[1] != results->wm_linetime[1])
+ I915_WRITE(PIPE_WM_LINETIME(PIPE_B), results->wm_linetime[1]);
+ if (previous.wm_linetime[2] != results->wm_linetime[2])
+ I915_WRITE(PIPE_WM_LINETIME(PIPE_C), results->wm_linetime[2]);
+
+ if (prev_partitioning != partitioning) {
+ val = I915_READ(WM_MISC);
+ if (partitioning == HSW_DATA_BUF_PART_1_2)
+ val &= ~WM_MISC_DATA_PARTITION_5_6;
+ else
+ val |= WM_MISC_DATA_PARTITION_5_6;
+ I915_WRITE(WM_MISC, val);
+ }
+
+ if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
+ I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
+ if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
+ I915_WRITE(WM2S_LP_IVB, results->wm_lp_spr[1]);
+ if (previous.wm_lp_spr[2] != results->wm_lp_spr[2])
+ I915_WRITE(WM3S_LP_IVB, results->wm_lp_spr[2]);
+
+ if (results->wm_lp[0] != 0)
+ I915_WRITE(WM1_LP_ILK, results->wm_lp[0]);
+ if (results->wm_lp[1] != 0)
+ I915_WRITE(WM2_LP_ILK, results->wm_lp[1]);
+ if (results->wm_lp[2] != 0)
+ I915_WRITE(WM3_LP_ILK, results->wm_lp[2]);
+}
+
+static void haswell_update_wm(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct hsw_pipe_wm_parameters params[3];
+ struct hsw_wm_values results;
+ uint32_t wm[5];
- sandybridge_update_wm(dev);
+ hsw_compute_wm_parameters(dev, params, wm);
+ hsw_compute_wm_results(dev, params, wm, &results);
+ hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
}
static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
--
1.8.1.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 3/5] drm/i915: properly set HSW WM_PIPE registers
2013-05-27 19:21 ` Paulo Zanoni
@ 2013-05-29 15:39 ` Ville Syrjälä
2013-05-31 13:08 ` [PATCH 1/3] " Paulo Zanoni
0 siblings, 1 reply; 29+ messages in thread
From: Ville Syrjälä @ 2013-05-29 15:39 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni
On Mon, May 27, 2013 at 04:21:19PM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> We were previously calling sandybridge_update_wm on HSW, but the SNB
> function didn't really match the HSW specification, so we were just
> writing the wrong values.
>
> With this patch, the haswell_update_wm function will set the correct
> values for the WM_PIPE registers, but it will still keep all the LP
> watermarks disabled.
>
> The patch may look a little bit over-complicated for now, but it's
> because much of the infrastructure for setting the LP watermarks is
> already in place, so we won't have too much code churn on the patch
> that sets the LP watermarks.
>
> v2: - Fix pixel_rate on panel fitter case (Ville)
> - Try to not overflow (Ville)
> - Remove useless variable (Ville)
> - Fix p->pri_horiz_pixels (Paulo)
> v3: - Fix rounding errors on hsw_wm_method2 (Ville)
>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
> drivers/gpu/drm/i915/i915_reg.h | 3 +
> drivers/gpu/drm/i915/intel_pm.c | 338 +++++++++++++++++++++++++++++++++++++---
> 2 files changed, 323 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index dbd9de5..5a49f8a 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -4931,6 +4931,9 @@
> #define SFUSE_STRAP_DDIC_DETECTED (1<<1)
> #define SFUSE_STRAP_DDID_DETECTED (1<<0)
>
> +#define WM_MISC 0x45260
> +#define WM_MISC_DATA_PARTITION_5_6 (1 << 0)
> +
> #define WM_DBG 0x45280
> #define WM_DBG_DISALLOW_MULTIPLE_LP (1<<0)
> #define WM_DBG_DISALLOW_MAXFIFO (1<<1)
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 9328ed9..5460409 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2072,19 +2072,170 @@ static void ivybridge_update_wm(struct drm_device *dev)
> cursor_wm);
> }
>
> -static void
> -haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> +static uint32_t hsw_wm_get_pixel_rate(struct drm_device *dev,
> + struct drm_crtc *crtc)
> +{
> + struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + uint32_t pixel_rate, pfit_size;
> +
> + if (intel_crtc->config.pixel_target_clock)
> + pixel_rate = intel_crtc->config.pixel_target_clock;
> + else
> + pixel_rate = intel_crtc->config.adjusted_mode.clock;
> +
> + /* We only use IF-ID interlacing. If we ever use PF-ID we'll need to
> + * adjust the pixel_rate here. */
> +
> + pfit_size = intel_crtc->config.pch_pfit.size;
> + if (pfit_size) {
> + uint64_t pipe_w, pipe_h, pfit_w, pfit_h;
> +
> + pipe_w = intel_crtc->config.requested_mode.hdisplay;
> + pipe_h = intel_crtc->config.requested_mode.vdisplay;
> + pfit_w = (pfit_size >> 16) & 0xFFFF;
> + pfit_h = pfit_size & 0xFFFF;
> + if (pipe_w < pfit_w)
> + pipe_w = pfit_w;
> + if (pipe_h < pfit_h)
> + pipe_h = pfit_h;
> +
> + pixel_rate = div_u64((uint64_t) pixel_rate * pipe_w * pipe_h,
> + pfit_w * pfit_h);
> + }
> +
> + return pixel_rate;
> +}
> +
> +static uint32_t hsw_wm_method1(uint32_t pixel_rate, uint8_t bytes_per_pixel,
> + uint32_t latency)
> +{
> + uint64_t ret;
> +
> + ret = (uint64_t) pixel_rate * bytes_per_pixel * latency;
> + ret = DIV_ROUND_UP_ULL(ret, 64 * 10000) + 2;
> +
> + return ret;
> +}
> +
> +static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
> + uint32_t horiz_pixels, uint8_t bytes_per_pixel,
> + uint32_t latency)
> +{
> + uint32_t ret;
> +
> + ret = (latency * pixel_rate) / (pipe_htotal * 10000);
> + ret = (ret + 1) * horiz_pixels * bytes_per_pixel;
> + ret = DIV_ROUND_UP(ret, 64) + 2;
> + return ret;
That leaves us w/ 20 bits for pixel_rate, which I guess should be enough
for the foreseeable future.
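A quick sketch of where the 20-bit figure presumably comes from: the
largest latency the SSKPD decode in this patch can produce is a 9-bit
field times 5, i.e. 2555, which needs 12 bits, so a 32-bit
latency * pixel_rate product has 20 bits left over. The helper names
below are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Largest latency from the SSKPD decode: a 9-bit field multiplied by 5. */
#define MAX_LATENCY (0x1FFu * 5u)

/* True if latency * pixel_rate still fits in an unsigned 32-bit value. */
static bool product_fits_u32(uint32_t latency, uint32_t pixel_rate)
{
	return (uint64_t)latency * pixel_rate <= UINT32_MAX;
}

/* Largest pixel_rate for which the 32-bit product cannot wrap. */
static uint32_t max_safe_pixel_rate(uint32_t latency)
{
	assert(latency != 0);
	return UINT32_MAX / latency;
}
```

Any pixel_rate below 1 << 20 (in kHz, so roughly 1 GHz) is safe for
every latency up to MAX_LATENCY; the exact limit for a given latency is
max_safe_pixel_rate(latency).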
> +}
> +
> +struct hsw_pipe_wm_parameters {
> + bool active;
> + bool sprite_enabled;
> + uint8_t pri_bytes_per_pixel;
> + uint8_t spr_bytes_per_pixel;
> + uint8_t cur_bytes_per_pixel;
> + uint32_t pri_horiz_pixels;
> + uint32_t spr_horiz_pixels;
> + uint32_t cur_horiz_pixels;
> + uint32_t pipe_htotal;
> + uint32_t pixel_rate;
> +};
> +
> +struct hsw_wm_values {
> + uint32_t wm_pipe[3];
> + uint32_t wm_lp[3];
> + uint32_t wm_lp_spr[3];
> + uint32_t wm_linetime[3];
> +};
> +
> +enum hsw_data_buf_partitioning {
> + HSW_DATA_BUF_PART_1_2,
> + HSW_DATA_BUF_PART_5_6,
> +};
> +
> +/* Only for WM_PIPE. */
> +static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
> + uint32_t mem_value)
> +{
> + /* TODO: for now, assume the primary plane is always enabled. */
> + if (!params->active)
> + return 0;
> +
> + return hsw_wm_method1(params->pixel_rate,
> + params->pri_bytes_per_pixel,
> + mem_value);
> +}
> +
> +/* For both WM_PIPE and WM_LP. */
> +static uint32_t hsw_compute_spr_wm(struct hsw_pipe_wm_parameters *params,
> + uint32_t mem_value)
> +{
> + uint32_t method1, method2;
> +
> + if (!params->active || !params->sprite_enabled)
> + return 0;
> +
> + method1 = hsw_wm_method1(params->pixel_rate,
> + params->spr_bytes_per_pixel,
> + mem_value);
> + method2 = hsw_wm_method2(params->pixel_rate,
> + params->pipe_htotal,
> + params->spr_horiz_pixels,
> + params->spr_bytes_per_pixel,
> + mem_value);
> + return min(method1, method2);
> +}
> +
> +/* For both WM_PIPE and WM_LP. */
> +static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
> + uint32_t mem_value)
> +{
> + if (!params->active)
> + return 0;
> +
> + return hsw_wm_method2(params->pixel_rate,
> + params->pipe_htotal,
> + params->cur_horiz_pixels,
> + params->cur_bytes_per_pixel,
> + mem_value);
> +}
> +
> +static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
> + uint32_t mem_value, enum pipe pipe,
> + struct hsw_pipe_wm_parameters *params)
> +{
> + uint32_t pri_val, cur_val, spr_val;
> +
> + pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
> + spr_val = hsw_compute_spr_wm(params, mem_value);
> + cur_val = hsw_compute_cur_wm(params, mem_value);
> +
> + WARN(pri_val > 127,
> + "Primary WM error, mode not supported for pipe %c\n",
> + pipe_name(pipe));
> + WARN(spr_val > 127,
> + "Sprite WM error, mode not supported for pipe %c\n",
> + pipe_name(pipe));
> + WARN(cur_val > 63,
> + "Cursor WM error, mode not supported for pipe %c\n",
> + pipe_name(pipe));
> +
> + return (pri_val << WM0_PIPE_PLANE_SHIFT) |
> + (spr_val << WM0_PIPE_SPRITE_SHIFT) |
> + cur_val;
> +}
Up to this point most of the functions/structs could use an ilk_ prefix
instead of hsw_, but I suppose we can rename them all later...
The rest looks OK to me.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> +
> +static uint32_t
> +hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> - enum pipe pipe = intel_crtc->pipe;
> struct drm_display_mode *mode = &intel_crtc->config.adjusted_mode;
> u32 linetime, ips_linetime;
>
> - if (!intel_crtc_active(crtc)) {
> - I915_WRITE(PIPE_WM_LINETIME(pipe), 0);
> - return;
> - }
> + if (!intel_crtc_active(crtc))
> + return 0;
>
> /* The WM are computed with base on how long it takes to fill a single
> * row at the given clock rate, multiplied by 8.
> @@ -2093,29 +2244,180 @@ haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> ips_linetime = DIV_ROUND_CLOSEST(mode->htotal * 1000 * 8,
> intel_ddi_get_cdclk_freq(dev_priv));
>
> - I915_WRITE(PIPE_WM_LINETIME(pipe),
> - PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
> - PIPE_WM_LINETIME_TIME(linetime));
> + return PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
> + PIPE_WM_LINETIME_TIME(linetime);
> }
>
> -static void haswell_update_wm(struct drm_device *dev)
> +static void hsw_compute_wm_parameters(struct drm_device *dev,
> + struct hsw_pipe_wm_parameters *params,
> + uint32_t *wm)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_crtc *crtc;
> + struct drm_plane *plane;
> + uint64_t sskpd = I915_READ64(MCH_SSKPD);
> enum pipe pipe;
>
> - /* Disable the LP WMs before changine the linetime registers. This is
> - * just a temporary code that will be replaced soon. */
> - I915_WRITE(WM3_LP_ILK, 0);
> - I915_WRITE(WM2_LP_ILK, 0);
> - I915_WRITE(WM1_LP_ILK, 0);
> + if ((sskpd >> 56) & 0xFF)
> + wm[0] = (sskpd >> 56) & 0xFF;
> + else
> + wm[0] = sskpd & 0xF;
> + wm[1] = ((sskpd >> 4) & 0xFF) * 5;
> + wm[2] = ((sskpd >> 12) & 0xFF) * 5;
> + wm[3] = ((sskpd >> 20) & 0x1FF) * 5;
> + wm[4] = ((sskpd >> 32) & 0x1FF) * 5;
> +
> + list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
> + struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + struct hsw_pipe_wm_parameters *p;
> +
> + pipe = intel_crtc->pipe;
> + p = &params[pipe];
> +
> + p->active = intel_crtc_active(crtc);
> + if (!p->active)
> + continue;
> +
> + p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
> + p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
> + p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
> + p->cur_bytes_per_pixel = 4;
> + p->pri_horiz_pixels =
> + intel_crtc->config.requested_mode.hdisplay;
> + p->cur_horiz_pixels = 64;
> + }
> +
> + list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
> + struct intel_plane *intel_plane = to_intel_plane(plane);
> + struct hsw_pipe_wm_parameters *p;
> +
> + pipe = intel_plane->pipe;
> + p = &params[pipe];
> +
> + p->sprite_enabled = intel_plane->wm.enable;
> + p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
> + p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
> + }
> +}
> +
> +static void hsw_compute_wm_results(struct drm_device *dev,
> + struct hsw_pipe_wm_parameters *params,
> + uint32_t *wm,
> + struct hsw_wm_values *results)
> +{
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_crtc *crtc;
> + enum pipe pipe;
> +
> + /* No support for LP WMs yet. */
> + results->wm_lp[2] = 0;
> + results->wm_lp[1] = 0;
> + results->wm_lp[0] = 0;
> + results->wm_lp_spr[2] = 0;
> + results->wm_lp_spr[1] = 0;
> + results->wm_lp_spr[0] = 0;
> +
> + for_each_pipe(pipe)
> + results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
> + pipe,
> + &params[pipe]);
>
> for_each_pipe(pipe) {
> crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> - haswell_update_linetime_wm(dev, crtc);
> + results->wm_linetime[pipe] = hsw_compute_linetime_wm(dev, crtc);
> }
> +}
> +
> +/*
> + * The spec says we shouldn't write when we don't need, because every write
> + * causes WMs to be re-evaluated, expending some power.
> + */
> +static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> + struct hsw_wm_values *results,
> + enum hsw_data_buf_partitioning partitioning)
> +{
> + struct hsw_wm_values previous;
> + uint32_t val;
> + enum hsw_data_buf_partitioning prev_partitioning;
> +
> + previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
> + previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
> + previous.wm_pipe[2] = I915_READ(WM0_PIPEC_IVB);
> + previous.wm_lp[0] = I915_READ(WM1_LP_ILK);
> + previous.wm_lp[1] = I915_READ(WM2_LP_ILK);
> + previous.wm_lp[2] = I915_READ(WM3_LP_ILK);
> + previous.wm_lp_spr[0] = I915_READ(WM1S_LP_ILK);
> + previous.wm_lp_spr[1] = I915_READ(WM2S_LP_IVB);
> + previous.wm_lp_spr[2] = I915_READ(WM3S_LP_IVB);
> + previous.wm_linetime[0] = I915_READ(PIPE_WM_LINETIME(PIPE_A));
> + previous.wm_linetime[1] = I915_READ(PIPE_WM_LINETIME(PIPE_B));
> + previous.wm_linetime[2] = I915_READ(PIPE_WM_LINETIME(PIPE_C));
> +
> + prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
> + HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
> +
> + if (memcmp(results->wm_pipe, previous.wm_pipe, 3) == 0 &&
> + memcmp(results->wm_lp, previous.wm_lp, 3) == 0 &&
> + memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) == 0 &&
> + memcmp(results->wm_linetime, previous.wm_linetime, 3) == 0 &&
> + partitioning == prev_partitioning)
> + return;
> +
> + if (previous.wm_lp[2] != 0)
> + I915_WRITE(WM3_LP_ILK, 0);
> + if (previous.wm_lp[1] != 0)
> + I915_WRITE(WM2_LP_ILK, 0);
> + if (previous.wm_lp[0] != 0)
> + I915_WRITE(WM1_LP_ILK, 0);
> +
> + if (previous.wm_pipe[0] != results->wm_pipe[0])
> + I915_WRITE(WM0_PIPEA_ILK, results->wm_pipe[0]);
> + if (previous.wm_pipe[1] != results->wm_pipe[1])
> + I915_WRITE(WM0_PIPEB_ILK, results->wm_pipe[1]);
> + if (previous.wm_pipe[2] != results->wm_pipe[2])
> + I915_WRITE(WM0_PIPEC_IVB, results->wm_pipe[2]);
> +
> + if (previous.wm_linetime[0] != results->wm_linetime[0])
> + I915_WRITE(PIPE_WM_LINETIME(PIPE_A), results->wm_linetime[0]);
> + if (previous.wm_linetime[1] != results->wm_linetime[1])
> + I915_WRITE(PIPE_WM_LINETIME(PIPE_B), results->wm_linetime[1]);
> + if (previous.wm_linetime[2] != results->wm_linetime[2])
> + I915_WRITE(PIPE_WM_LINETIME(PIPE_C), results->wm_linetime[2]);
> +
> + if (prev_partitioning != partitioning) {
> + val = I915_READ(WM_MISC);
> + if (partitioning == HSW_DATA_BUF_PART_1_2)
> + val &= ~WM_MISC_DATA_PARTITION_5_6;
> + else
> + val |= WM_MISC_DATA_PARTITION_5_6;
> + I915_WRITE(WM_MISC, val);
> + }
> +
> + if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
> + I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
> + if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
> + I915_WRITE(WM2S_LP_IVB, results->wm_lp_spr[1]);
> + if (previous.wm_lp_spr[2] != results->wm_lp_spr[2])
> + I915_WRITE(WM3S_LP_IVB, results->wm_lp_spr[2]);
> +
> + if (results->wm_lp[0] != 0)
> + I915_WRITE(WM1_LP_ILK, results->wm_lp[0]);
> + if (results->wm_lp[1] != 0)
> + I915_WRITE(WM2_LP_ILK, results->wm_lp[1]);
> + if (results->wm_lp[2] != 0)
> + I915_WRITE(WM3_LP_ILK, results->wm_lp[2]);
> +}
> +
> +static void haswell_update_wm(struct drm_device *dev)
> +{
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + struct hsw_pipe_wm_parameters params[3];
> + struct hsw_wm_values results;
> + uint32_t wm[5];
>
> - sandybridge_update_wm(dev);
> + hsw_compute_wm_parameters(dev, params, wm);
> + hsw_compute_wm_results(dev, params, wm, &results);
> + hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
> }
>
> static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
> --
> 1.8.1.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Ville Syrjälä
Intel OTC
* Re: [PATCH 4/5] drm/i915: properly set HSW WM_LP watermarks
2013-05-24 22:05 ` Paulo Zanoni
@ 2013-05-29 16:06 ` Ville Syrjälä
2013-05-29 16:24 ` Ville Syrjälä
0 siblings, 1 reply; 29+ messages in thread
From: Ville Syrjälä @ 2013-05-29 16:06 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni
On Fri, May 24, 2013 at 07:05:12PM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> We were previously only setting the WM_PIPE registers, now we are
> setting the LP watermark registers. This should allow deeper PC
> states, resulting in power savings.
>
> We're only using 1/2 data buffer partitioning for now.
>
> v2: Merge both hsw_compute_pri_wm_* functions (Ville)
>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
> drivers/gpu/drm/i915/i915_reg.h | 4 +
> drivers/gpu/drm/i915/intel_pm.c | 201 ++++++++++++++++++++++++++++++++++++----
> 2 files changed, 187 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index e86606c..58230ea 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -3057,6 +3057,10 @@
> #define WM3S_LP_IVB 0x45128
> #define WM1S_LP_EN (1<<31)
>
> +#define HSW_WM_LP_VAL(lat, fbc, pri, cur) \
> + (WM3_LP_EN | ((lat) << WM1_LP_LATENCY_SHIFT) | \
> + ((fbc) << WM1_LP_FBC_SHIFT) | ((pri) << WM1_LP_SR_SHIFT) | (cur))
> +
> /* Memory latency timer register */
> #define MLTR_ILK 0x11222
> #define MLTR_WM1_SHIFT 0
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index ef58a1a..872e2a8 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2129,6 +2129,12 @@ static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
> return ret;
> }
>
> +static uint32_t hsw_wm_fbc(uint32_t pri_val, uint32_t horiz_pixels,
> + uint8_t bytes_per_pixel)
> +{
> + return DIV_ROUND_UP(pri_val * 64, horiz_pixels * bytes_per_pixel) + 2;
> +}
> +
> struct hsw_pipe_wm_parameters {
> bool active;
> bool sprite_enabled;
> @@ -2142,11 +2148,28 @@ struct hsw_pipe_wm_parameters {
> uint32_t pixel_rate;
> };
>
> +struct hsw_wm_maximums {
> + uint16_t pri;
> + uint16_t spr;
> + uint16_t cur;
> + uint16_t fbc;
> +};
> +
> +struct hsw_lp_wm_result {
> + bool enable;
> + bool fbc_enable;
> + uint32_t pri_val;
> + uint32_t spr_val;
> + uint32_t cur_val;
> + uint32_t fbc_val;
> +};
> +
> struct hsw_wm_values {
> uint32_t wm_pipe[3];
> uint32_t wm_lp[3];
> uint32_t wm_lp_spr[3];
> uint32_t wm_linetime[3];
> + bool enable_fbc_wm;
> };
>
> enum hsw_data_buf_partitioning {
> @@ -2154,17 +2177,31 @@ enum hsw_data_buf_partitioning {
> HSW_DATA_BUF_PART_5_6,
> };
>
> -/* Only for WM_PIPE. */
> -static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
> - uint32_t mem_value)
> +/* For both WM_PIPE and WM_LP. */
> +static uint32_t hsw_compute_pri_wm(struct hsw_pipe_wm_parameters *params,
> + uint32_t mem_value,
> + bool is_lp)
> {
> + uint32_t method1, method2;
> +
> /* TODO: for now, assume the primary plane is always enabled. */
> if (!params->active)
> return 0;
>
> - return hsw_wm_method1(params->pixel_rate,
> - params->pri_bytes_per_pixel,
> - mem_value);
> + method1 = hsw_wm_method1(params->pixel_rate,
> + params->pri_bytes_per_pixel,
> + mem_value);
> +
> + if (!is_lp)
> + return method1;
> +
> + method2 = hsw_wm_method2(params->pixel_rate,
> + params->pipe_htotal,
> + params->pri_horiz_pixels,
> + params->pri_bytes_per_pixel,
> + mem_value);
> +
> + return min(method1, method2);
> }
>
> /* For both WM_PIPE and WM_LP. */
> @@ -2201,13 +2238,59 @@ static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
> mem_value);
> }
>
> +/* Only for WM_LP. */
> +static uint32_t hsw_compute_fbc_wm(struct hsw_pipe_wm_parameters *params,
> + uint32_t pri_val,
> + uint32_t mem_value)
> +{
> + if (!params->active)
> + return 0;
> +
> + return hsw_wm_fbc(pri_val,
> + params->pri_horiz_pixels,
> + params->pri_bytes_per_pixel);
> +}
> +
> +static void hsw_compute_lp_wm(uint32_t mem_value, struct hsw_wm_maximums *max,
> + struct hsw_pipe_wm_parameters *params,
> + struct hsw_lp_wm_result *result)
> +{
> + enum pipe pipe;
> + uint32_t pri_val[3], spr_val[3], cur_val[3], fbc_val[3];
> +
> + for (pipe = PIPE_A; pipe <= PIPE_C; pipe++) {
> + struct hsw_pipe_wm_parameters *p = &params[pipe];
> +
> + pri_val[pipe] = hsw_compute_pri_wm(p, mem_value, true);
> + spr_val[pipe] = hsw_compute_spr_wm(p, mem_value);
> + cur_val[pipe] = hsw_compute_cur_wm(p, mem_value);
> + fbc_val[pipe] = hsw_compute_fbc_wm(p, pri_val[pipe], mem_value);
> + }
> +
> + result->pri_val = max3(pri_val[0], pri_val[1], pri_val[2]);
> + result->spr_val = max3(spr_val[0], spr_val[1], spr_val[2]);
> + result->cur_val = max3(cur_val[0], cur_val[1], cur_val[2]);
> + result->fbc_val = max3(fbc_val[0], fbc_val[1], fbc_val[2]);
> +
> + if (result->fbc_val > max->fbc) {
> + result->fbc_enable = false;
> + result->fbc_val = 0;
> + } else {
> + result->fbc_enable = true;
> + }
> +
> + result->enable = result->pri_val <= max->pri &&
> + result->spr_val <= max->spr &&
> + result->cur_val <= max->cur;
> +}
> +
> static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
> uint32_t mem_value, enum pipe pipe,
> struct hsw_pipe_wm_parameters *params)
> {
> uint32_t pri_val, cur_val, spr_val;
>
> - pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
> + pri_val = hsw_compute_pri_wm(params, mem_value, false);
> spr_val = hsw_compute_spr_wm(params, mem_value);
> cur_val = hsw_compute_cur_wm(params, mem_value);
>
> @@ -2250,13 +2333,15 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
>
> static void hsw_compute_wm_parameters(struct drm_device *dev,
> struct hsw_pipe_wm_parameters *params,
> - uint32_t *wm)
> + uint32_t *wm,
> + struct hsw_wm_maximums *lp_max_1_2)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_crtc *crtc;
> struct drm_plane *plane;
> uint64_t sskpd = I915_READ64(MCH_SSKPD);
> enum pipe pipe;
> + int pipes_active = 0, sprites_enabled = 0;
>
> if ((sskpd >> 56) & 0xFF)
> wm[0] = (sskpd >> 56) & 0xFF;
> @@ -2278,6 +2363,8 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
> if (!p->active)
> continue;
>
> + pipes_active++;
> +
> p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
> p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
> p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
> @@ -2297,25 +2384,89 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
> p->sprite_enabled = intel_plane->wm.enable;
> p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
> p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
> +
> + if (p->sprite_enabled)
> + sprites_enabled++;
> + }
> +
> + if (pipes_active > 1) {
> + lp_max_1_2->pri = sprites_enabled ? 128 : 256;
> + lp_max_1_2->spr = 128;
> + lp_max_1_2->cur = 64;
> + } else {
> + lp_max_1_2->pri = sprites_enabled ? 384 : 768;
> + lp_max_1_2->spr = 384;
> + lp_max_1_2->cur = 255;
> }
> + lp_max_1_2->fbc = 15;
> }
>
> static void hsw_compute_wm_results(struct drm_device *dev,
> struct hsw_pipe_wm_parameters *params,
> uint32_t *wm,
> + struct hsw_wm_maximums *lp_maximums,
> struct hsw_wm_values *results)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_crtc *crtc;
> + struct hsw_lp_wm_result lp_results[4];
> enum pipe pipe;
> + int i;
>
> - /* No support for LP WMs yet. */
> - results->wm_lp[2] = 0;
> - results->wm_lp[1] = 0;
> - results->wm_lp[0] = 0;
> - results->wm_lp_spr[2] = 0;
> - results->wm_lp_spr[1] = 0;
> - results->wm_lp_spr[0] = 0;
> + hsw_compute_lp_wm(wm[1], lp_maximums, params, &lp_results[0]);
> + hsw_compute_lp_wm(wm[2], lp_maximums, params, &lp_results[1]);
> + hsw_compute_lp_wm(wm[3], lp_maximums, params, &lp_results[2]);
> + hsw_compute_lp_wm(wm[4], lp_maximums, params, &lp_results[3]);
> +
> + /* The spec says it is preferred to disable FBC WMs instead of disabling
> + * a WM level. */
> + results->enable_fbc_wm = true;
> + for (i = 0; i < 4; i++) {
> + if (lp_results[i].enable && !lp_results[i].fbc_enable) {
> + results->enable_fbc_wm = false;
> + break;
> + }
> + }
> +
> + if (lp_results[3].enable) {
Here you seem to assume that if lp[3] is valid, then lp[2] and lp[0] are
also valid...
> + results->wm_lp[2] = HSW_WM_LP_VAL(8, lp_results[3].fbc_val,
> + lp_results[3].pri_val,
> + lp_results[3].cur_val);
> + results->wm_lp_spr[2] = lp_results[3].spr_val;
> + } else if (lp_results[2].enable) {
> + results->wm_lp[2] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
> + lp_results[2].pri_val,
> + lp_results[2].cur_val);
> + results->wm_lp_spr[2] = lp_results[2].spr_val;
> + } else {
> + results->wm_lp[2] = 0;
> + results->wm_lp_spr[2] = 0;
> + }
> +
> + if (lp_results[3].enable && lp_results[2].enable) {
... but here you check if lp[2] is actually valid.
So it seems that in theory (if the latency values are really weird) the code
could enable LP3 and leave LP2 or LP1 disabled, which is illegal.
I think this could be cleared up a bit like so:
	struct hsw_lp_wm_result lp_results[4] = {};

	for (level = 1; level <= 4; level++)
		if (!hsw_compute_lp_wm(wm[level], lp_maximums, params, &lp_results[level-1]))
			break;
and obviously have hsw_compute_lp_wm() return result->enable. It could
also save a few cycles by not computing watermark levels that will never
be used.
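As a rough standalone sketch of that early-exit idea (not the driver
code: struct lp_result, compute_level() and compute_levels() are
hypothetical stand-ins for struct hsw_lp_wm_result and
hsw_compute_lp_wm(), and only the primary plane value is modelled),
stopping at the first level that does not fit leaves every higher level
zeroed, so a level can never end up enabled above a disabled one:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_LEVELS 4

/* Hypothetical, reduced stand-in for struct hsw_lp_wm_result. */
struct lp_result {
	bool enable;
	uint32_t pri_val;
};

/* Hypothetical stand-in for hsw_compute_lp_wm(): fills in the result and
 * returns whether this level fits within the maximum. */
static bool compute_level(uint32_t pri_val, uint32_t pri_max,
			  struct lp_result *result)
{
	result->pri_val = pri_val;
	result->enable = pri_val <= pri_max;
	return result->enable;
}

/* Compute levels 1..4, stopping at the first one that does not fit.
 * Returns the number of usable levels. */
static int compute_levels(const uint32_t pri_vals[NUM_LEVELS],
			  uint32_t pri_max,
			  struct lp_result results[NUM_LEVELS])
{
	int level;

	assert(pri_vals != NULL && results != NULL);

	/* Levels that are never computed must read back as disabled. */
	memset(results, 0, NUM_LEVELS * sizeof(*results));

	for (level = 1; level <= NUM_LEVELS; level++)
		if (!compute_level(pri_vals[level - 1], pri_max,
				   &results[level - 1]))
			break;

	return level - 1;
}
```

With made-up per-level values { 100, 300, 200, 400 } and a maximum of
256, only level 1 ends up enabled, even though level 3 would fit on its
own.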
To simplify the results handling you could take another page out of my
WM patch. Something like this:
	memset(results, 0, sizeof *results);

	for (wm_lp = 1; wm_lp <= 3; wm_lp++) {
		int level = wm_lp + (wm_lp >= 2 && lp_results[3].enable);
		const struct hsw_lp_wm_result *r = &lp_results[level-1];

		if (!r->enable)
			break;

		results->wm_lp[wm_lp-1] = HSW_WM_LP_VAL(level << 1, r->fbc_val, r->pri_val, r->cur_val);
		results->wm_lp_spr[wm_lp-1] = r->spr_val;
	}
Quite a bit less code, and avoids all those awkward checks for which
levels are valid.
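To make the level selection concrete, here is a small standalone sketch
of just the mapping from WM_LP register number to computed level. The
helper names level_for_wm_lp() and latency_for_level() are made up for
illustration; the latency values 2/4/6/8 are the ones the patch passes
to HSW_WM_LP_VAL for levels 1 to 4.

```c
#include <assert.h>
#include <stdbool.h>

/* Pick which computed level (1..4) feeds register WM<wm_lp>_LP (1..3).
 * level_enabled[i] says whether level i+1 fits. Returns 0 if the
 * register should stay disabled. */
static int level_for_wm_lp(int wm_lp, const bool level_enabled[4])
{
	int level;

	assert(wm_lp >= 1 && wm_lp <= 3);

	/* When level 4 is usable, WM2_LP and WM3_LP each move up a level. */
	level = wm_lp + (wm_lp >= 2 && level_enabled[3]);

	return level_enabled[level - 1] ? level : 0;
}

/* Latency field for a level: 2, 4, 6, 8 for levels 1..4. */
static int latency_for_level(int level)
{
	assert(level >= 1 && level <= 4);
	return level << 1;
}
```

So with all four levels usable the registers get levels 1, 3 and 4, and
with only three usable they get levels 1, 2 and 3. This assumes the
enable flags have no holes, which the early-exit loop is meant to
guarantee.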
> + results->wm_lp[1] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
> + lp_results[2].pri_val,
> + lp_results[2].cur_val);
> + results->wm_lp_spr[1] = lp_results[2].spr_val;
> + } else if (!lp_results[3].enable && lp_results[1].enable) {
> + results->wm_lp[1] = HSW_WM_LP_VAL(4, lp_results[1].fbc_val,
> + lp_results[1].pri_val,
> + lp_results[1].cur_val);
> + results->wm_lp_spr[1] = lp_results[1].spr_val;
> + } else {
> + results->wm_lp[1] = 0;
> + results->wm_lp_spr[1] = 0;
> + }
> +
> + if (lp_results[0].enable) {
> + results->wm_lp[0] = HSW_WM_LP_VAL(2, lp_results[0].fbc_val,
> + lp_results[0].pri_val,
> + lp_results[0].cur_val);
> + results->wm_lp_spr[0] = lp_results[0].spr_val;
> + } else {
> + results->wm_lp[0] = 0;
> + results->wm_lp_spr[0] = 0;
> + }
>
> for_each_pipe(pipe)
> results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
> @@ -2339,6 +2490,7 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> struct hsw_wm_values previous;
> uint32_t val;
> enum hsw_data_buf_partitioning prev_partitioning;
> + bool prev_enable_fbc_wm;
>
> previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
> previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
> @@ -2356,11 +2508,14 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
> HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
>
> + prev_enable_fbc_wm = !(I915_READ(DISP_ARB_CTL) & DISP_FBC_WM_DIS);
> +
> if (memcmp(results->wm_pipe, previous.wm_pipe, 3) == 0 &&
> memcmp(results->wm_lp, previous.wm_lp, 3) == 0 &&
> memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) == 0 &&
> memcmp(results->wm_linetime, previous.wm_linetime, 3) == 0 &&
> - partitioning == prev_partitioning)
> + partitioning == prev_partitioning &&
> + results->enable_fbc_wm == prev_enable_fbc_wm)
> return;
>
> if (previous.wm_lp[2] != 0)
> @@ -2393,6 +2548,15 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> I915_WRITE(WM_MISC, val);
> }
>
> + if (prev_enable_fbc_wm != results->enable_fbc_wm) {
> + val = I915_READ(DISP_ARB_CTL);
> + if (results->enable_fbc_wm)
> + val &= ~DISP_FBC_WM_DIS;
> + else
> + val |= DISP_FBC_WM_DIS;
> + I915_WRITE(DISP_ARB_CTL, val);
> + }
> +
> if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
> I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
> if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
> @@ -2411,12 +2575,13 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> static void haswell_update_wm(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> + struct hsw_wm_maximums lp_max_1_2;
> struct hsw_pipe_wm_parameters params[3];
> struct hsw_wm_values results;
> uint32_t wm[5];
>
> - hsw_compute_wm_parameters(dev, params, wm);
> - hsw_compute_wm_results(dev, params, wm, &results);
> + hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
> + hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
> hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
> }
>
> --
> 1.8.1.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Ville Syrjälä
Intel OTC
* Re: [PATCH 5/5] drm/i915: add support for 5/6 data buffer partitioning on Haswell
2013-05-24 14:59 ` [PATCH 5/5] drm/i915: add support for 5/6 data buffer partitioning on Haswell Paulo Zanoni
@ 2013-05-29 16:17 ` Ville Syrjälä
2013-05-31 13:19 ` [PATCH 3/3] " Paulo Zanoni
0 siblings, 1 reply; 29+ messages in thread
From: Ville Syrjälä @ 2013-05-29 16:17 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni
On Fri, May 24, 2013 at 11:59:21AM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> Now we compute the results for both 1/2 and 5/6 partitioning and then
> use hsw_find_best_result to choose which one to use.
>
> With this patch, Haswell watermarks support should be in good shape.
> The only improvement we're missing is the case where the primary plane
> is disabled: we always assume it's enabled, so we take it into
> consideration when calculating the watermarks.
>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
> drivers/gpu/drm/i915/intel_pm.c | 64 ++++++++++++++++++++++++++++++++++-------
> 1 file changed, 53 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 9f9eb48..6fdfd1a 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2344,7 +2344,8 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> static void hsw_compute_wm_parameters(struct drm_device *dev,
> struct hsw_pipe_wm_parameters *params,
> uint32_t *wm,
> - struct hsw_wm_maximums *lp_max_1_2)
> + struct hsw_wm_maximums *lp_max_1_2,
> + struct hsw_wm_maximums *lp_max_5_6)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_crtc *crtc;
> @@ -2399,15 +2400,17 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
> }
>
> if (pipes_active > 1) {
> - lp_max_1_2->pri = sprites_enabled ? 128 : 256;
> - lp_max_1_2->spr = 128;
> - lp_max_1_2->cur = 64;
> + lp_max_1_2->pri = lp_max_5_6->pri = sprites_enabled ? 128 : 256;
> + lp_max_1_2->spr = lp_max_5_6->spr = 128;
> + lp_max_1_2->cur = lp_max_5_6->cur = 64;
> } else {
> lp_max_1_2->pri = sprites_enabled ? 384 : 768;
> + lp_max_5_6->pri = sprites_enabled ? 128 : 768;
> lp_max_1_2->spr = 384;
> - lp_max_1_2->cur = 255;
> + lp_max_5_6->spr = 640;
> + lp_max_1_2->cur = lp_max_5_6->cur = 255;
> }
> - lp_max_1_2->fbc = 15;
> + lp_max_1_2->fbc = lp_max_5_6->fbc = 15;
> }
>
> static void hsw_compute_wm_results(struct drm_device *dev,
> @@ -2488,6 +2491,32 @@ static void hsw_compute_wm_results(struct drm_device *dev,
> }
> }
>
> +/* Find the result with the highest level enabled. Check for enable_fbc_wm in
> + * case both are at the same level. Prefer r1 in case they're the same. */
> +struct hsw_wm_values *hsw_find_best_result(struct hsw_wm_values *r1,
> + struct hsw_wm_values *r2)
> +{
> + int i, val_r1 = 0, val_r2 = 0;
> +
> + for (i = 0; i < 3; i++) {
> + if (r1->wm_lp[i] & WM3_LP_EN)
> + val_r1 |= (1 << i);
> + if (r2->wm_lp[i] & WM3_LP_EN)
> + val_r2 |= (1 << i);
This could just be:
	if (r1->wm_lp[i] & WM3_LP_EN)
		val_r1 = i;
	if (r2->wm_lp[i] & WM3_LP_EN)
		val_r2 = i;
And maybe call them max_r1 and max_r2 or something...
> + }
> +
> + if (val_r1 == val_r2) {
> + if (r2->enable_fbc_wm && !r1->enable_fbc_wm)
> + return r2;
> + else
> + return r1;
> + } else if (val_r1 > val_r2) {
> + return r1;
> + } else {
> + return r2;
> + }
> +}
> +
> /*
> * The spec says we shouldn't write when we don't need, because every write
> * causes WMs to be re-evaluated, expending some power.
> @@ -2584,14 +2613,27 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> static void haswell_update_wm(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct hsw_wm_maximums lp_max_1_2;
> + struct hsw_wm_maximums lp_max_1_2, lp_max_5_6;
> struct hsw_pipe_wm_parameters params[3];
> - struct hsw_wm_values results;
> + struct hsw_wm_values results_1_2, results_5_6, *best_results;
> uint32_t wm[5];
> + enum hsw_data_buf_partitioning partitioning;
> +
> + hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2, &lp_max_5_6);
> +
> + hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results_1_2);
> + if (lp_max_1_2.pri != lp_max_5_6.pri) {
> + hsw_compute_wm_results(dev, params, wm, &lp_max_5_6,
> + &results_5_6);
> + best_results = hsw_find_best_result(&results_1_2, &results_5_6);
> + } else {
> + best_results = &results_1_2;
> + }
> +
> + partitioning = (best_results == &results_1_2) ?
> + HSW_DATA_BUF_PART_1_2 : HSW_DATA_BUF_PART_5_6;
>
> - hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
> - hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
> - hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
> + hsw_write_wm_values(dev_priv, best_results, partitioning);
> }
>
> static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
> --
> 1.8.1.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Ville Syrjälä
Intel OTC
* Re: [PATCH 4/5] drm/i915: properly set HSW WM_LP watermarks
2013-05-29 16:06 ` Ville Syrjälä
@ 2013-05-29 16:24 ` Ville Syrjälä
2013-05-31 13:12 ` [PATCH 2/3] " Paulo Zanoni
0 siblings, 1 reply; 29+ messages in thread
From: Ville Syrjälä @ 2013-05-29 16:24 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni
On Wed, May 29, 2013 at 07:06:15PM +0300, Ville Syrjälä wrote:
> On Fri, May 24, 2013 at 07:05:12PM -0300, Paulo Zanoni wrote:
> > From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >
> > We were previously only setting the WM_PIPE registers, now we are
> > setting the LP watermark registers. This should allow deeper PC
> > states, resulting in power savings.
> >
> > We're only using 1/2 data buffer partitioning for now.
> >
> > v2: Merge both hsw_compute_pri_wm_* functions (Ville)
> >
> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_reg.h | 4 +
> > drivers/gpu/drm/i915/intel_pm.c | 201 ++++++++++++++++++++++++++++++++++++----
> > 2 files changed, 187 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index e86606c..58230ea 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -3057,6 +3057,10 @@
> > #define WM3S_LP_IVB 0x45128
> > #define WM1S_LP_EN (1<<31)
> >
> > +#define HSW_WM_LP_VAL(lat, fbc, pri, cur) \
> > + (WM3_LP_EN | ((lat) << WM1_LP_LATENCY_SHIFT) | \
> > + ((fbc) << WM1_LP_FBC_SHIFT) | ((pri) << WM1_LP_SR_SHIFT) | (cur))
> > +
> > /* Memory latency timer register */
> > #define MLTR_ILK 0x11222
> > #define MLTR_WM1_SHIFT 0
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index ef58a1a..872e2a8 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -2129,6 +2129,12 @@ static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
> > return ret;
> > }
> >
> > +static uint32_t hsw_wm_fbc(uint32_t pri_val, uint32_t horiz_pixels,
> > + uint8_t bytes_per_pixel)
> > +{
> > + return DIV_ROUND_UP(pri_val * 64, horiz_pixels * bytes_per_pixel) + 2;
> > +}
> > +
> > struct hsw_pipe_wm_parameters {
> > bool active;
> > bool sprite_enabled;
> > @@ -2142,11 +2148,28 @@ struct hsw_pipe_wm_parameters {
> > uint32_t pixel_rate;
> > };
> >
> > +struct hsw_wm_maximums {
> > + uint16_t pri;
> > + uint16_t spr;
> > + uint16_t cur;
> > + uint16_t fbc;
> > +};
> > +
> > +struct hsw_lp_wm_result {
> > + bool enable;
> > + bool fbc_enable;
> > + uint32_t pri_val;
> > + uint32_t spr_val;
> > + uint32_t cur_val;
> > + uint32_t fbc_val;
> > +};
> > +
> > struct hsw_wm_values {
> > uint32_t wm_pipe[3];
> > uint32_t wm_lp[3];
> > uint32_t wm_lp_spr[3];
> > uint32_t wm_linetime[3];
> > + bool enable_fbc_wm;
> > };
> >
> > enum hsw_data_buf_partitioning {
> > @@ -2154,17 +2177,31 @@ enum hsw_data_buf_partitioning {
> > HSW_DATA_BUF_PART_5_6,
> > };
> >
> > -/* Only for WM_PIPE. */
> > -static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
> > - uint32_t mem_value)
> > +/* For both WM_PIPE and WM_LP. */
> > +static uint32_t hsw_compute_pri_wm(struct hsw_pipe_wm_parameters *params,
> > + uint32_t mem_value,
> > + bool is_lp)
> > {
> > + uint32_t method1, method2;
> > +
> > /* TODO: for now, assume the primary plane is always enabled. */
> > if (!params->active)
> > return 0;
> >
> > - return hsw_wm_method1(params->pixel_rate,
> > - params->pri_bytes_per_pixel,
> > - mem_value);
> > + method1 = hsw_wm_method1(params->pixel_rate,
> > + params->pri_bytes_per_pixel,
> > + mem_value);
> > +
> > + if (!is_lp)
> > + return method1;
> > +
> > + method2 = hsw_wm_method2(params->pixel_rate,
> > + params->pipe_htotal,
> > + params->pri_horiz_pixels,
> > + params->pri_bytes_per_pixel,
> > + mem_value);
> > +
> > + return min(method1, method2);
> > }
> >
> > /* For both WM_PIPE and WM_LP. */
> > @@ -2201,13 +2238,59 @@ static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
> > mem_value);
> > }
> >
> > +/* Only for WM_LP. */
> > +static uint32_t hsw_compute_fbc_wm(struct hsw_pipe_wm_parameters *params,
> > + uint32_t pri_val,
> > + uint32_t mem_value)
> > +{
> > + if (!params->active)
> > + return 0;
> > +
> > + return hsw_wm_fbc(pri_val,
> > + params->pri_horiz_pixels,
> > + params->pri_bytes_per_pixel);
> > +}
> > +
> > +static void hsw_compute_lp_wm(uint32_t mem_value, struct hsw_wm_maximums *max,
> > + struct hsw_pipe_wm_parameters *params,
> > + struct hsw_lp_wm_result *result)
> > +{
> > + enum pipe pipe;
> > + uint32_t pri_val[3], spr_val[3], cur_val[3], fbc_val[3];
> > +
> > + for (pipe = PIPE_A; pipe <= PIPE_C; pipe++) {
> > + struct hsw_pipe_wm_parameters *p = &params[pipe];
> > +
> > + pri_val[pipe] = hsw_compute_pri_wm(p, mem_value, true);
> > + spr_val[pipe] = hsw_compute_spr_wm(p, mem_value);
> > + cur_val[pipe] = hsw_compute_cur_wm(p, mem_value);
> > + fbc_val[pipe] = hsw_compute_fbc_wm(p, pri_val[pipe], mem_value);
> > + }
> > +
> > + result->pri_val = max3(pri_val[0], pri_val[1], pri_val[2]);
> > + result->spr_val = max3(spr_val[0], spr_val[1], spr_val[2]);
> > + result->cur_val = max3(cur_val[0], cur_val[1], cur_val[2]);
> > + result->fbc_val = max3(fbc_val[0], fbc_val[1], fbc_val[2]);
> > +
> > + if (result->fbc_val > max->fbc) {
> > + result->fbc_enable = false;
> > + result->fbc_val = 0;
> > + } else {
> > + result->fbc_enable = true;
> > + }
> > +
> > + result->enable = result->pri_val <= max->pri &&
> > + result->spr_val <= max->spr &&
> > + result->cur_val <= max->cur;
> > +}
> > +
> > static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
> > uint32_t mem_value, enum pipe pipe,
> > struct hsw_pipe_wm_parameters *params)
> > {
> > uint32_t pri_val, cur_val, spr_val;
> >
> > - pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
> > + pri_val = hsw_compute_pri_wm(params, mem_value, false);
> > spr_val = hsw_compute_spr_wm(params, mem_value);
> > cur_val = hsw_compute_cur_wm(params, mem_value);
> >
> > @@ -2250,13 +2333,15 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> >
> > static void hsw_compute_wm_parameters(struct drm_device *dev,
> > struct hsw_pipe_wm_parameters *params,
> > - uint32_t *wm)
> > + uint32_t *wm,
> > + struct hsw_wm_maximums *lp_max_1_2)
> > {
> > struct drm_i915_private *dev_priv = dev->dev_private;
> > struct drm_crtc *crtc;
> > struct drm_plane *plane;
> > uint64_t sskpd = I915_READ64(MCH_SSKPD);
> > enum pipe pipe;
> > + int pipes_active = 0, sprites_enabled = 0;
> >
> > if ((sskpd >> 56) & 0xFF)
> > wm[0] = (sskpd >> 56) & 0xFF;
> > @@ -2278,6 +2363,8 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
> > if (!p->active)
> > continue;
> >
> > + pipes_active++;
> > +
> > p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
> > p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
> > p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
> > @@ -2297,25 +2384,89 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
> > p->sprite_enabled = intel_plane->wm.enable;
> > p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
> > p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
> > +
> > + if (p->sprite_enabled)
> > + sprites_enabled++;
> > + }
> > +
> > + if (pipes_active > 1) {
> > + lp_max_1_2->pri = sprites_enabled ? 128 : 256;
> > + lp_max_1_2->spr = 128;
> > + lp_max_1_2->cur = 64;
> > + } else {
> > + lp_max_1_2->pri = sprites_enabled ? 384 : 768;
> > + lp_max_1_2->spr = 384;
> > + lp_max_1_2->cur = 255;
> > }
> > + lp_max_1_2->fbc = 15;
> > }
> >
> > static void hsw_compute_wm_results(struct drm_device *dev,
> > struct hsw_pipe_wm_parameters *params,
> > uint32_t *wm,
> > + struct hsw_wm_maximums *lp_maximums,
> > struct hsw_wm_values *results)
> > {
> > struct drm_i915_private *dev_priv = dev->dev_private;
> > struct drm_crtc *crtc;
> > + struct hsw_lp_wm_result lp_results[4];
> > enum pipe pipe;
> > + int i;
> >
> > - /* No support for LP WMs yet. */
> > - results->wm_lp[2] = 0;
> > - results->wm_lp[1] = 0;
> > - results->wm_lp[0] = 0;
> > - results->wm_lp_spr[2] = 0;
> > - results->wm_lp_spr[1] = 0;
> > - results->wm_lp_spr[0] = 0;
> > + hsw_compute_lp_wm(wm[1], lp_maximums, params, &lp_results[0]);
> > + hsw_compute_lp_wm(wm[2], lp_maximums, params, &lp_results[1]);
> > + hsw_compute_lp_wm(wm[3], lp_maximums, params, &lp_results[2]);
> > + hsw_compute_lp_wm(wm[4], lp_maximums, params, &lp_results[3]);
> > +
> > + /* The spec says it is preferred to disable FBC WMs instead of disabling
> > + * a WM level. */
> > + results->enable_fbc_wm = true;
> > + for (i = 0; i < 4; i++) {
> > + if (lp_results[i].enable && !lp_results[i].fbc_enable) {
> > + results->enable_fbc_wm = false;
> > + break;
> > + }
> > + }
> > +
> > + if (lp_results[3].enable) {
>
> Here you seem to assume that if lp[3] is valid, then lp[2] and lp[0]
> are valid too...
>
> > + results->wm_lp[2] = HSW_WM_LP_VAL(8, lp_results[3].fbc_val,
> > + lp_results[3].pri_val,
> > + lp_results[3].cur_val);
> > + results->wm_lp_spr[2] = lp_results[3].spr_val;
> > + } else if (lp_results[2].enable) {
> > + results->wm_lp[2] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
> > + lp_results[2].pri_val,
> > + lp_results[2].cur_val);
> > + results->wm_lp_spr[2] = lp_results[2].spr_val;
> > + } else {
> > + results->wm_lp[2] = 0;
> > + results->wm_lp_spr[2] = 0;
> > + }
> > +
> > + if (lp_results[3].enable && lp_results[2].enable) {
>
> ... but here you check if lp[2] is actually valid.
>
> So it seems that in theory (if the latency values are really weird) the code
> could enable LP3 and leave LP2 or LP1 disabled, which is illegal.
>
> I think this could be cleared up a bit like so:
> struct hsw_lp_wm_result lp_results[4] = {};
>
> for (level = 1; level <= 4; level++)
> if (!hsw_compute_lp_wm(wm[level], lp_maximums, params, &lp_results[level-1]))
> break;
>
> and obviously have hsw_compute_lp_wm() return result->enable. It could
> also save a few cycles by not computing watermark levels that will never
> be used.
>
>
> To simplify the results handling you could take another page out of my
> WM patch. Something like this:
>
> memset(results, 0, sizeof *results);
>
> for (wm_lp = 1; wm_lp <= 3; wm_lp++) {
> int level = wm_lp + (wm_lp >= 2 && lp_results[3].enable);
> const struct hsw_lp_wm_result *r = &lp_results[level-1];
>
> if (!r->enable)
> break;
>
> results->wm_lp[wm_lp] = HSW_WM_LP_VAL(level << 1, r->fbc_val, r->pri_val, r->cur_val);
> results->wm_lp_spr[wm_lp] = r->spr_val;
> }
And I forgot the -1 from the results->foo[wm_lp] indexing. That's one
reason I don't quite like this split between LP0/level=0 and
LP1+/level>=1: the indexes no longer match the LP/level. But I think
we can ignore that for now and clear it up later.
>
> Quite a bit less code, and avoids all those awkward checks for which
> levels are valid.
>
> > + results->wm_lp[1] = HSW_WM_LP_VAL(6, lp_results[2].fbc_val,
> > + lp_results[2].pri_val,
> > + lp_results[2].cur_val);
> > + results->wm_lp_spr[1] = lp_results[2].spr_val;
> > + } else if (!lp_results[3].enable && lp_results[1].enable) {
> > + results->wm_lp[1] = HSW_WM_LP_VAL(4, lp_results[1].fbc_val,
> > + lp_results[1].pri_val,
> > + lp_results[1].cur_val);
> > + results->wm_lp_spr[1] = lp_results[1].spr_val;
> > + } else {
> > + results->wm_lp[1] = 0;
> > + results->wm_lp_spr[1] = 0;
> > + }
> > +
> > + if (lp_results[0].enable) {
> > + results->wm_lp[0] = HSW_WM_LP_VAL(2, lp_results[0].fbc_val,
> > + lp_results[0].pri_val,
> > + lp_results[0].cur_val);
> > + results->wm_lp_spr[0] = lp_results[0].spr_val;
> > + } else {
> > + results->wm_lp[0] = 0;
> > + results->wm_lp_spr[0] = 0;
> > + }
> >
> > for_each_pipe(pipe)
> > results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
> > @@ -2339,6 +2490,7 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> > struct hsw_wm_values previous;
> > uint32_t val;
> > enum hsw_data_buf_partitioning prev_partitioning;
> > + bool prev_enable_fbc_wm;
> >
> > previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
> > previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
> > @@ -2356,11 +2508,14 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> > prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
> > HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
> >
> > + prev_enable_fbc_wm = !(I915_READ(DISP_ARB_CTL) & DISP_FBC_WM_DIS);
> > +
> > if (memcmp(results->wm_pipe, previous.wm_pipe, 3) == 0 &&
> > memcmp(results->wm_lp, previous.wm_lp, 3) == 0 &&
> > memcmp(results->wm_lp_spr, previous.wm_lp_spr, 3) == 0 &&
> > memcmp(results->wm_linetime, previous.wm_linetime, 3) == 0 &&
> > - partitioning == prev_partitioning)
> > + partitioning == prev_partitioning &&
> > + results->enable_fbc_wm == prev_enable_fbc_wm)
> > return;
> >
> > if (previous.wm_lp[2] != 0)
> > @@ -2393,6 +2548,15 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> > I915_WRITE(WM_MISC, val);
> > }
> >
> > + if (prev_enable_fbc_wm != results->enable_fbc_wm) {
> > + val = I915_READ(DISP_ARB_CTL);
> > + if (results->enable_fbc_wm)
> > + val &= ~DISP_FBC_WM_DIS;
> > + else
> > + val |= DISP_FBC_WM_DIS;
> > + I915_WRITE(DISP_ARB_CTL, val);
> > + }
> > +
> > if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
> > I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
> > if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
> > @@ -2411,12 +2575,13 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> > static void haswell_update_wm(struct drm_device *dev)
> > {
> > struct drm_i915_private *dev_priv = dev->dev_private;
> > + struct hsw_wm_maximums lp_max_1_2;
> > struct hsw_pipe_wm_parameters params[3];
> > struct hsw_wm_values results;
> > uint32_t wm[5];
> >
> > - hsw_compute_wm_parameters(dev, params, wm);
> > - hsw_compute_wm_results(dev, params, wm, &results);
> > + hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
> > + hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
> > hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
> > }
> >
> > --
> > 1.8.1.2
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Ville Syrjälä
> Intel OTC
--
Ville Syrjälä
Intel OTC
* [PATCH 1/3] drm/i915: properly set HSW WM_PIPE registers
2013-05-29 15:39 ` Ville Syrjälä
@ 2013-05-31 13:08 ` Paulo Zanoni
2013-05-31 15:03 ` Ville Syrjälä
0 siblings, 1 reply; 29+ messages in thread
From: Paulo Zanoni @ 2013-05-31 13:08 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Paulo Zanoni <paulo.r.zanoni@intel.com>
We were previously calling sandybridge_update_wm on HSW, but the SNB
function didn't really match the HSW specification, so we were just
writing the wrong values.
With this patch, the haswell_update_wm function will set the correct
values for the WM_PIPE registers, but it will still keep all the LP
watermarks disabled.
The patch may look a little bit over-complicated for now, but it's
because much of the infrastructure for setting the LP watermarks is
already in place, so we won't have too much code churn on the patch
that sets the LP watermarks.
v2: - Fix pixel_rate on panel fitter case (Ville)
- Try to not overflow (Ville)
- Remove useless variable (Ville)
- Fix p->pri_horiz_pixels (Paulo)
v3: - Fix rounding errors on hsw_wm_method2 (Ville)
v4: - Fix memcmp bug (Paulo)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 3 +
drivers/gpu/drm/i915/intel_pm.c | 342 +++++++++++++++++++++++++++++++++++++---
2 files changed, 327 insertions(+), 18 deletions(-)
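As a hand check of the two formulas this patch adds (hsw_wm_method1 and hsw_wm_method2), here is a standalone sketch of the same arithmetic in plain C. It is not the driver code: the wm_method1/wm_method2 names and the div_round_up_u64 helper are stand-ins, the pixel rate is assumed to be in kHz and the latency in 0.1us units, and the htotal of 2080 used in the note below is an assumed value rather than one taken from a real mode.

```c
#include <stdint.h>

/* Round-up integer division, standing in for the kernel's DIV_ROUND_UP helpers. */
static uint64_t div_round_up_u64(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/* Bytes fetched during the latency window, in 64-byte units, plus 2. */
static uint32_t wm_method1(uint32_t pixel_rate, uint8_t bytes_per_pixel,
			   uint32_t latency)
{
	uint64_t ret = (uint64_t)pixel_rate * bytes_per_pixel * latency;

	return (uint32_t)(div_round_up_u64(ret, 64 * 10000) + 2);
}

/* Whole lines started during the latency window, in 64-byte units, plus 2. */
static uint32_t wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
			   uint32_t horiz_pixels, uint8_t bytes_per_pixel,
			   uint32_t latency)
{
	uint32_t ret = (latency * pixel_rate) / (pipe_htotal * 10000);

	ret = (ret + 1) * horiz_pixels * bytes_per_pixel;
	return (uint32_t)div_round_up_u64(ret, 64) + 2;
}
```

With pixel_rate = 138780, 4 bytes per pixel, 1920 horizontal pixels and the assumed htotal of 2080, a latency of 20 gives 20 for method 1 and 122 for method 2, while a latency of 200 gives 176 and 242. In both cases method 1 is the smaller value, which is what min(method1, method2) would pick for a plane that uses both.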
While doing some more tests I found a memcmp bug that can be reproduced with
some 2-screen configurations. This patch fixes it.
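For anyone wondering how the memcmp bug shows up: the third argument of memcmp() is a byte count, so passing 3 for an array of three 32-bit registers only compares the first three bytes of element 0. The sketch below illustrates the difference; the helper names and NUM_WM_REGS are made up for the example and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_WM_REGS 3 /* illustrative: one value per pipe */

/* Compares only the first 3 bytes, as the earlier versions did. */
static bool wm_equal_3_bytes(const uint32_t *a, const uint32_t *b)
{
	return memcmp(a, b, 3) == 0;
}

/* Compares all NUM_WM_REGS 32-bit values, like the sizeof() form in v4. */
static bool wm_equal_full(const uint32_t *a, const uint32_t *b)
{
	return memcmp(a, b, NUM_WM_REGS * sizeof(uint32_t)) == 0;
}
```

If only the second or third element changes (say, only pipe B's value differs), wm_equal_3_bytes() still reports the arrays as equal, so the early return would be taken and the new values never written. wm_equal_full() catches the change.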
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index dbd9de5..5a49f8a 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -4931,6 +4931,9 @@
#define SFUSE_STRAP_DDIC_DETECTED (1<<1)
#define SFUSE_STRAP_DDID_DETECTED (1<<0)
+#define WM_MISC 0x45260
+#define WM_MISC_DATA_PARTITION_5_6 (1 << 0)
+
#define WM_DBG 0x45280
#define WM_DBG_DISALLOW_MULTIPLE_LP (1<<0)
#define WM_DBG_DISALLOW_MAXFIFO (1<<1)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 9328ed9..fda7279 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2072,19 +2072,170 @@ static void ivybridge_update_wm(struct drm_device *dev)
cursor_wm);
}
-static void
-haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
+static uint32_t hsw_wm_get_pixel_rate(struct drm_device *dev,
+ struct drm_crtc *crtc)
+{
+ struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+ uint32_t pixel_rate, pfit_size;
+
+ if (intel_crtc->config.pixel_target_clock)
+ pixel_rate = intel_crtc->config.pixel_target_clock;
+ else
+ pixel_rate = intel_crtc->config.adjusted_mode.clock;
+
+ /* We only use IF-ID interlacing. If we ever use PF-ID we'll need to
+ * adjust the pixel_rate here. */
+
+ pfit_size = intel_crtc->config.pch_pfit.size;
+ if (pfit_size) {
+ uint64_t pipe_w, pipe_h, pfit_w, pfit_h;
+
+ pipe_w = intel_crtc->config.requested_mode.hdisplay;
+ pipe_h = intel_crtc->config.requested_mode.vdisplay;
+ pfit_w = (pfit_size >> 16) & 0xFFFF;
+ pfit_h = pfit_size & 0xFFFF;
+ if (pipe_w < pfit_w)
+ pipe_w = pfit_w;
+ if (pipe_h < pfit_h)
+ pipe_h = pfit_h;
+
+ pixel_rate = div_u64((uint64_t) pixel_rate * pipe_w * pipe_h,
+ pfit_w * pfit_h);
+ }
+
+ return pixel_rate;
+}
+
+static uint32_t hsw_wm_method1(uint32_t pixel_rate, uint8_t bytes_per_pixel,
+ uint32_t latency)
+{
+ uint64_t ret;
+
+ ret = (uint64_t) pixel_rate * bytes_per_pixel * latency;
+ ret = DIV_ROUND_UP_ULL(ret, 64 * 10000) + 2;
+
+ return ret;
+}
+
+static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
+ uint32_t horiz_pixels, uint8_t bytes_per_pixel,
+ uint32_t latency)
+{
+ uint32_t ret;
+
+ ret = (latency * pixel_rate) / (pipe_htotal * 10000);
+ ret = (ret + 1) * horiz_pixels * bytes_per_pixel;
+ ret = DIV_ROUND_UP(ret, 64) + 2;
+ return ret;
+}
+
+struct hsw_pipe_wm_parameters {
+ bool active;
+ bool sprite_enabled;
+ uint8_t pri_bytes_per_pixel;
+ uint8_t spr_bytes_per_pixel;
+ uint8_t cur_bytes_per_pixel;
+ uint32_t pri_horiz_pixels;
+ uint32_t spr_horiz_pixels;
+ uint32_t cur_horiz_pixels;
+ uint32_t pipe_htotal;
+ uint32_t pixel_rate;
+};
+
+struct hsw_wm_values {
+ uint32_t wm_pipe[3];
+ uint32_t wm_lp[3];
+ uint32_t wm_lp_spr[3];
+ uint32_t wm_linetime[3];
+};
+
+enum hsw_data_buf_partitioning {
+ HSW_DATA_BUF_PART_1_2,
+ HSW_DATA_BUF_PART_5_6,
+};
+
+/* Only for WM_PIPE. */
+static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value)
+{
+ /* TODO: for now, assume the primary plane is always enabled. */
+ if (!params->active)
+ return 0;
+
+ return hsw_wm_method1(params->pixel_rate,
+ params->pri_bytes_per_pixel,
+ mem_value);
+}
+
+/* For both WM_PIPE and WM_LP. */
+static uint32_t hsw_compute_spr_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value)
+{
+ uint32_t method1, method2;
+
+ if (!params->active || !params->sprite_enabled)
+ return 0;
+
+ method1 = hsw_wm_method1(params->pixel_rate,
+ params->spr_bytes_per_pixel,
+ mem_value);
+ method2 = hsw_wm_method2(params->pixel_rate,
+ params->pipe_htotal,
+ params->spr_horiz_pixels,
+ params->spr_bytes_per_pixel,
+ mem_value);
+ return min(method1, method2);
+}
+
+/* For both WM_PIPE and WM_LP. */
+static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value)
+{
+ if (!params->active)
+ return 0;
+
+ return hsw_wm_method2(params->pixel_rate,
+ params->pipe_htotal,
+ params->cur_horiz_pixels,
+ params->cur_bytes_per_pixel,
+ mem_value);
+}
+
+static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
+ uint32_t mem_value, enum pipe pipe,
+ struct hsw_pipe_wm_parameters *params)
+{
+ uint32_t pri_val, cur_val, spr_val;
+
+ pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
+ spr_val = hsw_compute_spr_wm(params, mem_value);
+ cur_val = hsw_compute_cur_wm(params, mem_value);
+
+ WARN(pri_val > 127,
+ "Primary WM error, mode not supported for pipe %c\n",
+ pipe_name(pipe));
+ WARN(spr_val > 127,
+ "Sprite WM error, mode not supported for pipe %c\n",
+ pipe_name(pipe));
+ WARN(cur_val > 63,
+ "Cursor WM error, mode not supported for pipe %c\n",
+ pipe_name(pipe));
+
+ return (pri_val << WM0_PIPE_PLANE_SHIFT) |
+ (spr_val << WM0_PIPE_SPRITE_SHIFT) |
+ cur_val;
+}
+
+static uint32_t
+hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
- enum pipe pipe = intel_crtc->pipe;
struct drm_display_mode *mode = &intel_crtc->config.adjusted_mode;
u32 linetime, ips_linetime;
- if (!intel_crtc_active(crtc)) {
- I915_WRITE(PIPE_WM_LINETIME(pipe), 0);
- return;
- }
+ if (!intel_crtc_active(crtc))
+ return 0;
/* The WM are computed with base on how long it takes to fill a single
* row at the given clock rate, multiplied by 8.
@@ -2093,29 +2244,184 @@ haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
ips_linetime = DIV_ROUND_CLOSEST(mode->htotal * 1000 * 8,
intel_ddi_get_cdclk_freq(dev_priv));
- I915_WRITE(PIPE_WM_LINETIME(pipe),
- PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
- PIPE_WM_LINETIME_TIME(linetime));
+ return PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
+ PIPE_WM_LINETIME_TIME(linetime);
}
-static void haswell_update_wm(struct drm_device *dev)
+static void hsw_compute_wm_parameters(struct drm_device *dev,
+ struct hsw_pipe_wm_parameters *params,
+ uint32_t *wm)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
+ struct drm_plane *plane;
+ uint64_t sskpd = I915_READ64(MCH_SSKPD);
enum pipe pipe;
- /* Disable the LP WMs before changine the linetime registers. This is
- * just a temporary code that will be replaced soon. */
- I915_WRITE(WM3_LP_ILK, 0);
- I915_WRITE(WM2_LP_ILK, 0);
- I915_WRITE(WM1_LP_ILK, 0);
+ if ((sskpd >> 56) & 0xFF)
+ wm[0] = (sskpd >> 56) & 0xFF;
+ else
+ wm[0] = sskpd & 0xF;
+ wm[1] = ((sskpd >> 4) & 0xFF) * 5;
+ wm[2] = ((sskpd >> 12) & 0xFF) * 5;
+ wm[3] = ((sskpd >> 20) & 0x1FF) * 5;
+ wm[4] = ((sskpd >> 32) & 0x1FF) * 5;
+
+ list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+ struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+ struct hsw_pipe_wm_parameters *p;
+
+ pipe = intel_crtc->pipe;
+ p = &params[pipe];
+
+ p->active = intel_crtc_active(crtc);
+ if (!p->active)
+ continue;
+
+ p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
+ p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
+ p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
+ p->cur_bytes_per_pixel = 4;
+ p->pri_horiz_pixels =
+ intel_crtc->config.requested_mode.hdisplay;
+ p->cur_horiz_pixels = 64;
+ }
+
+ list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
+ struct intel_plane *intel_plane = to_intel_plane(plane);
+ struct hsw_pipe_wm_parameters *p;
+
+ pipe = intel_plane->pipe;
+ p = &params[pipe];
+
+ p->sprite_enabled = intel_plane->wm.enable;
+ p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
+ p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
+ }
+}
+
+static void hsw_compute_wm_results(struct drm_device *dev,
+ struct hsw_pipe_wm_parameters *params,
+ uint32_t *wm,
+ struct hsw_wm_values *results)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_crtc *crtc;
+ enum pipe pipe;
+
+ /* No support for LP WMs yet. */
+ results->wm_lp[2] = 0;
+ results->wm_lp[1] = 0;
+ results->wm_lp[0] = 0;
+ results->wm_lp_spr[2] = 0;
+ results->wm_lp_spr[1] = 0;
+ results->wm_lp_spr[0] = 0;
+
+ for_each_pipe(pipe)
+ results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
+ pipe,
+ &params[pipe]);
for_each_pipe(pipe) {
crtc = dev_priv->pipe_to_crtc_mapping[pipe];
- haswell_update_linetime_wm(dev, crtc);
+ results->wm_linetime[pipe] = hsw_compute_linetime_wm(dev, crtc);
}
+}
+
+/*
+ * The spec says we shouldn't write when we don't need to, because every
+ * write causes WMs to be re-evaluated, expending some power.
+ */
+static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
+ struct hsw_wm_values *results,
+ enum hsw_data_buf_partitioning partitioning)
+{
+ struct hsw_wm_values previous;
+ uint32_t val;
+ enum hsw_data_buf_partitioning prev_partitioning;
+
+ previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
+ previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
+ previous.wm_pipe[2] = I915_READ(WM0_PIPEC_IVB);
+ previous.wm_lp[0] = I915_READ(WM1_LP_ILK);
+ previous.wm_lp[1] = I915_READ(WM2_LP_ILK);
+ previous.wm_lp[2] = I915_READ(WM3_LP_ILK);
+ previous.wm_lp_spr[0] = I915_READ(WM1S_LP_ILK);
+ previous.wm_lp_spr[1] = I915_READ(WM2S_LP_IVB);
+ previous.wm_lp_spr[2] = I915_READ(WM3S_LP_IVB);
+ previous.wm_linetime[0] = I915_READ(PIPE_WM_LINETIME(PIPE_A));
+ previous.wm_linetime[1] = I915_READ(PIPE_WM_LINETIME(PIPE_B));
+ previous.wm_linetime[2] = I915_READ(PIPE_WM_LINETIME(PIPE_C));
+
+ prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
+ HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
+
+ if (memcmp(results->wm_pipe, previous.wm_pipe,
+ sizeof(results->wm_pipe)) == 0 &&
+ memcmp(results->wm_lp, previous.wm_lp,
+ sizeof(results->wm_lp)) == 0 &&
+ memcmp(results->wm_lp_spr, previous.wm_lp_spr,
+ sizeof(results->wm_lp_spr)) == 0 &&
+ memcmp(results->wm_linetime, previous.wm_linetime,
+ sizeof(results->wm_linetime)) == 0 &&
+ partitioning == prev_partitioning)
+ return;
+
+ if (previous.wm_lp[2] != 0)
+ I915_WRITE(WM3_LP_ILK, 0);
+ if (previous.wm_lp[1] != 0)
+ I915_WRITE(WM2_LP_ILK, 0);
+ if (previous.wm_lp[0] != 0)
+ I915_WRITE(WM1_LP_ILK, 0);
+
+ if (previous.wm_pipe[0] != results->wm_pipe[0])
+ I915_WRITE(WM0_PIPEA_ILK, results->wm_pipe[0]);
+ if (previous.wm_pipe[1] != results->wm_pipe[1])
+ I915_WRITE(WM0_PIPEB_ILK, results->wm_pipe[1]);
+ if (previous.wm_pipe[2] != results->wm_pipe[2])
+ I915_WRITE(WM0_PIPEC_IVB, results->wm_pipe[2]);
+
+ if (previous.wm_linetime[0] != results->wm_linetime[0])
+ I915_WRITE(PIPE_WM_LINETIME(PIPE_A), results->wm_linetime[0]);
+ if (previous.wm_linetime[1] != results->wm_linetime[1])
+ I915_WRITE(PIPE_WM_LINETIME(PIPE_B), results->wm_linetime[1]);
+ if (previous.wm_linetime[2] != results->wm_linetime[2])
+ I915_WRITE(PIPE_WM_LINETIME(PIPE_C), results->wm_linetime[2]);
+
+ if (prev_partitioning != partitioning) {
+ val = I915_READ(WM_MISC);
+ if (partitioning == HSW_DATA_BUF_PART_1_2)
+ val &= ~WM_MISC_DATA_PARTITION_5_6;
+ else
+ val |= WM_MISC_DATA_PARTITION_5_6;
+ I915_WRITE(WM_MISC, val);
+ }
+
+ if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
+ I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
+ if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
+ I915_WRITE(WM2S_LP_IVB, results->wm_lp_spr[1]);
+ if (previous.wm_lp_spr[2] != results->wm_lp_spr[2])
+ I915_WRITE(WM3S_LP_IVB, results->wm_lp_spr[2]);
+
+ if (results->wm_lp[0] != 0)
+ I915_WRITE(WM1_LP_ILK, results->wm_lp[0]);
+ if (results->wm_lp[1] != 0)
+ I915_WRITE(WM2_LP_ILK, results->wm_lp[1]);
+ if (results->wm_lp[2] != 0)
+ I915_WRITE(WM3_LP_ILK, results->wm_lp[2]);
+}
+
+static void haswell_update_wm(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct hsw_pipe_wm_parameters params[3];
+ struct hsw_wm_values results;
+ uint32_t wm[5];
- sandybridge_update_wm(dev);
+ hsw_compute_wm_parameters(dev, params, wm);
+ hsw_compute_wm_results(dev, params, wm, &results);
+ hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
}
static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
--
1.8.1.2
* [PATCH 2/3] drm/i915: properly set HSW WM_LP watermarks
2013-05-29 16:24 ` Ville Syrjälä
@ 2013-05-31 13:12 ` Paulo Zanoni
2013-05-31 13:58 ` Ville Syrjälä
0 siblings, 1 reply; 29+ messages in thread
From: Paulo Zanoni @ 2013-05-31 13:12 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Paulo Zanoni <paulo.r.zanoni@intel.com>
We were previously only setting the WM_PIPE registers, now we are
setting the LP watermark registers. This should allow deeper PC
states, resulting in power savings.
We're only using 1/2 data buffer partitioning for now.
v2: Merge both hsw_compute_pri_wm_* functions (Ville)
v3: - Simplify hsw_compute_wm_results (Ville)
- Rebase due to changes on the previous patch
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 4 +
drivers/gpu/drm/i915/intel_pm.c | 180 ++++++++++++++++++++++++++++++++++++----
2 files changed, 166 insertions(+), 18 deletions(-)
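To make the v3 change to hsw_compute_wm_results easier to review, here is the slot-to-level mapping pulled out on its own. This is only a sketch of the logic in the loop below: lp_slot_to_level and lp_latency_field are hypothetical helpers that do not exist in the patch, and max_level is the highest level for which hsw_compute_lp_wm() returned true (0 when none did).

```c
/*
 * Hypothetical helper: returns the latency level (1..4) programmed into
 * WM_LP slot `slot` (1..3), or 0 when that slot stays disabled.
 * When all four levels are valid, level 2 is skipped so that the three
 * slots carry levels 1, 3 and 4.
 */
static int lp_slot_to_level(int slot, int max_level)
{
	int used_level = (max_level == 4 && slot > 1) ? slot + 1 : slot;

	return used_level <= max_level ? used_level : 0;
}

/* The latency field written through HSW_WM_LP_VAL() is twice the level. */
static int lp_latency_field(int level)
{
	return level * 2;
}
```

So with max_level == 4 the slots get levels 1, 3, 4 (latency fields 2, 6, 8), with max_level == 3 they get 1, 2, 3, and with max_level == 2 the third slot is left at 0. Since the levels are computed in order and the loop stops at the first one that does not fit, a higher slot can never be enabled while a lower one is disabled.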
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 5a49f8a..8176ba9 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3084,6 +3084,10 @@
#define WM3S_LP_IVB 0x45128
#define WM1S_LP_EN (1<<31)
+#define HSW_WM_LP_VAL(lat, fbc, pri, cur) \
+ (WM3_LP_EN | ((lat) << WM1_LP_LATENCY_SHIFT) | \
+ ((fbc) << WM1_LP_FBC_SHIFT) | ((pri) << WM1_LP_SR_SHIFT) | (cur))
+
/* Memory latency timer register */
#define MLTR_ILK 0x11222
#define MLTR_WM1_SHIFT 0
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index fda7279..3ff9ff3 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2129,6 +2129,12 @@ static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
return ret;
}
+static uint32_t hsw_wm_fbc(uint32_t pri_val, uint32_t horiz_pixels,
+ uint8_t bytes_per_pixel)
+{
+ return DIV_ROUND_UP(pri_val * 64, horiz_pixels * bytes_per_pixel) + 2;
+}
+
struct hsw_pipe_wm_parameters {
bool active;
bool sprite_enabled;
@@ -2142,11 +2148,28 @@ struct hsw_pipe_wm_parameters {
uint32_t pixel_rate;
};
+struct hsw_wm_maximums {
+ uint16_t pri;
+ uint16_t spr;
+ uint16_t cur;
+ uint16_t fbc;
+};
+
+struct hsw_lp_wm_result {
+ bool enable;
+ bool fbc_enable;
+ uint32_t pri_val;
+ uint32_t spr_val;
+ uint32_t cur_val;
+ uint32_t fbc_val;
+};
+
struct hsw_wm_values {
uint32_t wm_pipe[3];
uint32_t wm_lp[3];
uint32_t wm_lp_spr[3];
uint32_t wm_linetime[3];
+ bool enable_fbc_wm;
};
enum hsw_data_buf_partitioning {
@@ -2154,17 +2177,31 @@ enum hsw_data_buf_partitioning {
HSW_DATA_BUF_PART_5_6,
};
-/* Only for WM_PIPE. */
-static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
- uint32_t mem_value)
+/* For both WM_PIPE and WM_LP. */
+static uint32_t hsw_compute_pri_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value,
+ bool is_lp)
{
+ uint32_t method1, method2;
+
/* TODO: for now, assume the primary plane is always enabled. */
if (!params->active)
return 0;
- return hsw_wm_method1(params->pixel_rate,
- params->pri_bytes_per_pixel,
- mem_value);
+ method1 = hsw_wm_method1(params->pixel_rate,
+ params->pri_bytes_per_pixel,
+ mem_value);
+
+ if (!is_lp)
+ return method1;
+
+ method2 = hsw_wm_method2(params->pixel_rate,
+ params->pipe_htotal,
+ params->pri_horiz_pixels,
+ params->pri_bytes_per_pixel,
+ mem_value);
+
+ return min(method1, method2);
}
/* For both WM_PIPE and WM_LP. */
@@ -2201,13 +2238,60 @@ static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
mem_value);
}
+/* Only for WM_LP. */
+static uint32_t hsw_compute_fbc_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t pri_val,
+ uint32_t mem_value)
+{
+ if (!params->active)
+ return 0;
+
+ return hsw_wm_fbc(pri_val,
+ params->pri_horiz_pixels,
+ params->pri_bytes_per_pixel);
+}
+
+static bool hsw_compute_lp_wm(uint32_t mem_value, struct hsw_wm_maximums *max,
+ struct hsw_pipe_wm_parameters *params,
+ struct hsw_lp_wm_result *result)
+{
+ enum pipe pipe;
+ uint32_t pri_val[3], spr_val[3], cur_val[3], fbc_val[3];
+
+ for (pipe = PIPE_A; pipe <= PIPE_C; pipe++) {
+ struct hsw_pipe_wm_parameters *p = &params[pipe];
+
+ pri_val[pipe] = hsw_compute_pri_wm(p, mem_value, true);
+ spr_val[pipe] = hsw_compute_spr_wm(p, mem_value);
+ cur_val[pipe] = hsw_compute_cur_wm(p, mem_value);
+ fbc_val[pipe] = hsw_compute_fbc_wm(p, pri_val[pipe], mem_value);
+ }
+
+ result->pri_val = max3(pri_val[0], pri_val[1], pri_val[2]);
+ result->spr_val = max3(spr_val[0], spr_val[1], spr_val[2]);
+ result->cur_val = max3(cur_val[0], cur_val[1], cur_val[2]);
+ result->fbc_val = max3(fbc_val[0], fbc_val[1], fbc_val[2]);
+
+ if (result->fbc_val > max->fbc) {
+ result->fbc_enable = false;
+ result->fbc_val = 0;
+ } else {
+ result->fbc_enable = true;
+ }
+
+ result->enable = result->pri_val <= max->pri &&
+ result->spr_val <= max->spr &&
+ result->cur_val <= max->cur;
+ return result->enable;
+}
+
static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
uint32_t mem_value, enum pipe pipe,
struct hsw_pipe_wm_parameters *params)
{
uint32_t pri_val, cur_val, spr_val;
- pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
+ pri_val = hsw_compute_pri_wm(params, mem_value, false);
spr_val = hsw_compute_spr_wm(params, mem_value);
cur_val = hsw_compute_cur_wm(params, mem_value);
@@ -2250,13 +2334,15 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
static void hsw_compute_wm_parameters(struct drm_device *dev,
struct hsw_pipe_wm_parameters *params,
- uint32_t *wm)
+ uint32_t *wm,
+ struct hsw_wm_maximums *lp_max_1_2)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
struct drm_plane *plane;
uint64_t sskpd = I915_READ64(MCH_SSKPD);
enum pipe pipe;
+ int pipes_active = 0, sprites_enabled = 0;
if ((sskpd >> 56) & 0xFF)
wm[0] = (sskpd >> 56) & 0xFF;
@@ -2278,6 +2364,8 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
if (!p->active)
continue;
+ pipes_active++;
+
p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
@@ -2297,25 +2385,67 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
p->sprite_enabled = intel_plane->wm.enable;
p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
+
+ if (p->sprite_enabled)
+ sprites_enabled++;
+ }
+
+ if (pipes_active > 1) {
+ lp_max_1_2->pri = sprites_enabled ? 128 : 256;
+ lp_max_1_2->spr = 128;
+ lp_max_1_2->cur = 64;
+ } else {
+ lp_max_1_2->pri = sprites_enabled ? 384 : 768;
+ lp_max_1_2->spr = 384;
+ lp_max_1_2->cur = 255;
}
+ lp_max_1_2->fbc = 15;
}
static void hsw_compute_wm_results(struct drm_device *dev,
struct hsw_pipe_wm_parameters *params,
uint32_t *wm,
+ struct hsw_wm_maximums *lp_maximums,
struct hsw_wm_values *results)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
+ struct hsw_lp_wm_result lp_results[4] = {};
enum pipe pipe;
+ int level, max_level;
+
+ for (level = 1; level <= 4; level++)
+ if (!hsw_compute_lp_wm(wm[level], lp_maximums, params,
+ &lp_results[level - 1]))
+ break;
+ max_level = level - 1;
+
+ /* The spec says it is preferred to disable FBC WMs instead of disabling
+ * a WM level. */
+ results->enable_fbc_wm = true;
+ for (level = 1; level <= max_level; level++) {
+ if (!lp_results[level - 1].fbc_enable) {
+ results->enable_fbc_wm = false;
+ break;
+ }
+ }
+
+ memset(results, 0, sizeof(*results));
+ for (level = 1; level <= max_level; level++) {
+ const struct hsw_lp_wm_result *r;
+ int used_level;
- /* No support for LP WMs yet. */
- results->wm_lp[2] = 0;
- results->wm_lp[1] = 0;
- results->wm_lp[0] = 0;
- results->wm_lp_spr[2] = 0;
- results->wm_lp_spr[1] = 0;
- results->wm_lp_spr[0] = 0;
+ used_level = (max_level == 4 && level > 1) ? level + 1 : level;
+ if (used_level > max_level)
+ break;
+
+ r = &lp_results[used_level - 1];
+ results->wm_lp[level - 1] = HSW_WM_LP_VAL(used_level * 2,
+ r->fbc_val,
+ r->pri_val,
+ r->cur_val);
+ results->wm_lp_spr[level - 1] = r->spr_val;
+ }
for_each_pipe(pipe)
results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
@@ -2339,6 +2469,7 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
struct hsw_wm_values previous;
uint32_t val;
enum hsw_data_buf_partitioning prev_partitioning;
+ bool prev_enable_fbc_wm;
previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
@@ -2356,6 +2487,8 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
+ prev_enable_fbc_wm = !(I915_READ(DISP_ARB_CTL) & DISP_FBC_WM_DIS);
+
if (memcmp(results->wm_pipe, previous.wm_pipe,
sizeof(results->wm_pipe)) == 0 &&
memcmp(results->wm_lp, previous.wm_lp,
@@ -2364,7 +2497,8 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
sizeof(results->wm_lp_spr)) == 0 &&
memcmp(results->wm_linetime, previous.wm_linetime,
sizeof(results->wm_linetime)) == 0 &&
- partitioning == prev_partitioning)
+ partitioning == prev_partitioning &&
+ results->enable_fbc_wm == prev_enable_fbc_wm)
return;
if (previous.wm_lp[2] != 0)
@@ -2397,6 +2531,15 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
I915_WRITE(WM_MISC, val);
}
+ if (prev_enable_fbc_wm != results->enable_fbc_wm) {
+ val = I915_READ(DISP_ARB_CTL);
+ if (results->enable_fbc_wm)
+ val &= ~DISP_FBC_WM_DIS;
+ else
+ val |= DISP_FBC_WM_DIS;
+ I915_WRITE(DISP_ARB_CTL, val);
+ }
+
if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
@@ -2415,12 +2558,13 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
static void haswell_update_wm(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
+ struct hsw_wm_maximums lp_max_1_2;
struct hsw_pipe_wm_parameters params[3];
struct hsw_wm_values results;
uint32_t wm[5];
- hsw_compute_wm_parameters(dev, params, wm);
- hsw_compute_wm_results(dev, params, wm, &results);
+ hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
+ hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
}
--
1.8.1.2
* [PATCH 3/3] drm/i915: add support for 5/6 data buffer partitioning on Haswell
2013-05-29 16:17 ` Ville Syrjälä
@ 2013-05-31 13:19 ` Paulo Zanoni
2013-05-31 13:44 ` Ville Syrjälä
0 siblings, 1 reply; 29+ messages in thread
From: Paulo Zanoni @ 2013-05-31 13:19 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Paulo Zanoni <paulo.r.zanoni@intel.com>
Now we compute the results for both 1/2 and 5/6 partitioning and then
use hsw_find_best_result to choose which one to use.
With this patch, Haswell watermarks support should be in good shape.
The only improvement we're missing is the case where the primary plane
is disabled: we always assume it's enabled, so we take it into
consideration when calculating the watermarks.
v2: - Check the latency when finding the best result
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/intel_pm.c | 64 ++++++++++++++++++++++++++++++++++-------
1 file changed, 53 insertions(+), 11 deletions(-)
I was going to implement Ville's review, but then I realized we don't check
whether we're using level 4 or level 3, so now instead of assigning "i" we
assign the latency, which reflects which level we're using.
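As a rough, standalone sketch of that comparison (not the driver code: the
struct, the sketch_* names and the bit positions below are assumptions made
up for illustration), taking the latency field of the deepest enabled WM_LP
register tells apart a result that reaches level 4 from one that stops at
level 3, which the index of the last enabled register cannot do:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout, for illustration only; see i915_reg.h for the real one. */
#define SKETCH_LP_EN            (1u << 31)
#define SKETCH_LP_LATENCY_SHIFT 24
#define SKETCH_LP_LATENCY_MASK  (0x7fu << SKETCH_LP_LATENCY_SHIFT)

/* Hypothetical stand-in for the LP part of struct hsw_wm_values. */
struct sketch_wm_values {
	uint32_t wm_lp[3];
	bool enable_fbc_wm;
};

/* Latency field of the deepest enabled WM_LP register, 0 if none is enabled. */
static uint32_t sketch_deepest_latency(const struct sketch_wm_values *r)
{
	uint32_t val = 0;
	int i;

	for (i = 0; i < 3; i++)
		if (r->wm_lp[i] & SKETCH_LP_EN)
			val = r->wm_lp[i] & SKETCH_LP_LATENCY_MASK;
	return val;
}

/* Prefer the deeper level; on a tie prefer the FBC WM, then r1. */
static const struct sketch_wm_values *
sketch_find_best(const struct sketch_wm_values *r1,
		 const struct sketch_wm_values *r2)
{
	uint32_t v1, v2;

	assert(r1 && r2);
	v1 = sketch_deepest_latency(r1);
	v2 = sketch_deepest_latency(r2);

	if (v1 != v2)
		return v1 > v2 ? r1 : r2;
	return (r2->enable_fbc_wm && !r1->enable_fbc_wm) ? r2 : r1;
}
```

With latency fields {2, 4, 6} in one result and {2, 6, 8} in the other, both
results fill all three registers, yet sketch_find_best() picks the second.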
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 3ff9ff3..a6eae70 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2335,7 +2335,8 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
static void hsw_compute_wm_parameters(struct drm_device *dev,
struct hsw_pipe_wm_parameters *params,
uint32_t *wm,
- struct hsw_wm_maximums *lp_max_1_2)
+ struct hsw_wm_maximums *lp_max_1_2,
+ struct hsw_wm_maximums *lp_max_5_6)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
@@ -2391,15 +2392,17 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
}
if (pipes_active > 1) {
- lp_max_1_2->pri = sprites_enabled ? 128 : 256;
- lp_max_1_2->spr = 128;
- lp_max_1_2->cur = 64;
+ lp_max_1_2->pri = lp_max_5_6->pri = sprites_enabled ? 128 : 256;
+ lp_max_1_2->spr = lp_max_5_6->spr = 128;
+ lp_max_1_2->cur = lp_max_5_6->cur = 64;
} else {
lp_max_1_2->pri = sprites_enabled ? 384 : 768;
+ lp_max_5_6->pri = sprites_enabled ? 128 : 768;
lp_max_1_2->spr = 384;
- lp_max_1_2->cur = 255;
+ lp_max_5_6->spr = 640;
+ lp_max_1_2->cur = lp_max_5_6->cur = 255;
}
- lp_max_1_2->fbc = 15;
+ lp_max_1_2->fbc = lp_max_5_6->fbc = 15;
}
static void hsw_compute_wm_results(struct drm_device *dev,
@@ -2458,6 +2461,32 @@ static void hsw_compute_wm_results(struct drm_device *dev,
}
}
+/* Find the result with the highest level enabled. Check for enable_fbc_wm in
+ * case both are at the same level. Prefer r1 in case they're the same. */
+struct hsw_wm_values *hsw_find_best_result(struct hsw_wm_values *r1,
+ struct hsw_wm_values *r2)
+{
+ int i, val_r1 = 0, val_r2 = 0;
+
+ for (i = 0; i < 3; i++) {
+ if (r1->wm_lp[i] & WM3_LP_EN)
+ val_r1 = r1->wm_lp[i] & WM1_LP_LATENCY_MASK;
+ if (r2->wm_lp[i] & WM3_LP_EN)
+ val_r2 = r2->wm_lp[i] & WM1_LP_LATENCY_MASK;
+ }
+
+ if (val_r1 == val_r2) {
+ if (r2->enable_fbc_wm && !r1->enable_fbc_wm)
+ return r2;
+ else
+ return r1;
+ } else if (val_r1 > val_r2) {
+ return r1;
+ } else {
+ return r2;
+ }
+}
+
/*
* The spec says we shouldn't write when we don't need, because every write
* causes WMs to be re-evaluated, expending some power.
@@ -2558,14 +2587,27 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
static void haswell_update_wm(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
- struct hsw_wm_maximums lp_max_1_2;
+ struct hsw_wm_maximums lp_max_1_2, lp_max_5_6;
struct hsw_pipe_wm_parameters params[3];
- struct hsw_wm_values results;
+ struct hsw_wm_values results_1_2, results_5_6, *best_results;
uint32_t wm[5];
+ enum hsw_data_buf_partitioning partitioning;
+
+ hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2, &lp_max_5_6);
+
+ hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results_1_2);
+ if (lp_max_1_2.pri != lp_max_5_6.pri) {
+ hsw_compute_wm_results(dev, params, wm, &lp_max_5_6,
+ &results_5_6);
+ best_results = hsw_find_best_result(&results_1_2, &results_5_6);
+ } else {
+ best_results = &results_1_2;
+ }
+
+ partitioning = (best_results == &results_1_2) ?
+ HSW_DATA_BUF_PART_1_2 : HSW_DATA_BUF_PART_5_6;
- hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
- hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
- hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
+ hsw_write_wm_values(dev_priv, best_results, partitioning);
}
static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
--
1.8.1.2
* Re: [PATCH 3/3] drm/i915: add support for 5/6 data buffer partitioning on Haswell
2013-05-31 13:19 ` [PATCH 3/3] " Paulo Zanoni
@ 2013-05-31 13:44 ` Ville Syrjälä
2013-05-31 15:19 ` Daniel Vetter
0 siblings, 1 reply; 29+ messages in thread
From: Ville Syrjälä @ 2013-05-31 13:44 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni
On Fri, May 31, 2013 at 10:19:21AM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> Now we compute the results for both 1/2 and 5/6 partitioning and then
> use hsw_find_best_result to choose which one to use.
>
> With this patch, Haswell watermarks support should be in good shape.
> The only improvement we're missing is the case where the primary plane
> is disabled: we always assume it's enabled, so we take it into
> consideration when calculating the watermarks.
>
> v2: - Check the latency when finding the best result
>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
> drivers/gpu/drm/i915/intel_pm.c | 64 ++++++++++++++++++++++++++++++++++-------
> 1 file changed, 53 insertions(+), 11 deletions(-)
>
>
> I was going to implement Ville's review, but then I realized we don't check
> whether we're using level 4 or level 3, so now instead of assigning "i" we
> assign the latency, which reflects which level we're using.
Makes sense. For pre-HSW I guess the same code would still work. It would just
have the same latency value for both 1/2 and 5/6, so the code would end
up working the same way as v1 did, which is still OK.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
>
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 3ff9ff3..a6eae70 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2335,7 +2335,8 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> static void hsw_compute_wm_parameters(struct drm_device *dev,
> struct hsw_pipe_wm_parameters *params,
> uint32_t *wm,
> - struct hsw_wm_maximums *lp_max_1_2)
> + struct hsw_wm_maximums *lp_max_1_2,
> + struct hsw_wm_maximums *lp_max_5_6)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_crtc *crtc;
> @@ -2391,15 +2392,17 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
> }
>
> if (pipes_active > 1) {
> - lp_max_1_2->pri = sprites_enabled ? 128 : 256;
> - lp_max_1_2->spr = 128;
> - lp_max_1_2->cur = 64;
> + lp_max_1_2->pri = lp_max_5_6->pri = sprites_enabled ? 128 : 256;
> + lp_max_1_2->spr = lp_max_5_6->spr = 128;
> + lp_max_1_2->cur = lp_max_5_6->cur = 64;
> } else {
> lp_max_1_2->pri = sprites_enabled ? 384 : 768;
> + lp_max_5_6->pri = sprites_enabled ? 128 : 768;
> lp_max_1_2->spr = 384;
> - lp_max_1_2->cur = 255;
> + lp_max_5_6->spr = 640;
> + lp_max_1_2->cur = lp_max_5_6->cur = 255;
> }
> - lp_max_1_2->fbc = 15;
> + lp_max_1_2->fbc = lp_max_5_6->fbc = 15;
> }
>
> static void hsw_compute_wm_results(struct drm_device *dev,
> @@ -2458,6 +2461,32 @@ static void hsw_compute_wm_results(struct drm_device *dev,
> }
> }
>
> +/* Find the result with the highest level enabled. Check for enable_fbc_wm in
> + * case both are at the same level. Prefer r1 in case they're the same. */
> +struct hsw_wm_values *hsw_find_best_result(struct hsw_wm_values *r1,
> + struct hsw_wm_values *r2)
> +{
> + int i, val_r1 = 0, val_r2 = 0;
> +
> + for (i = 0; i < 3; i++) {
> + if (r1->wm_lp[i] & WM3_LP_EN)
> + val_r1 = r1->wm_lp[i] & WM1_LP_LATENCY_MASK;
> + if (r2->wm_lp[i] & WM3_LP_EN)
> + val_r2 = r2->wm_lp[i] & WM1_LP_LATENCY_MASK;
> + }
> +
> + if (val_r1 == val_r2) {
> + if (r2->enable_fbc_wm && !r1->enable_fbc_wm)
> + return r2;
> + else
> + return r1;
> + } else if (val_r1 > val_r2) {
> + return r1;
> + } else {
> + return r2;
> + }
> +}
> +
> /*
> * The spec says we shouldn't write when we don't need, because every write
> * causes WMs to be re-evaluated, expending some power.
> @@ -2558,14 +2587,27 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> static void haswell_update_wm(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct hsw_wm_maximums lp_max_1_2;
> + struct hsw_wm_maximums lp_max_1_2, lp_max_5_6;
> struct hsw_pipe_wm_parameters params[3];
> - struct hsw_wm_values results;
> + struct hsw_wm_values results_1_2, results_5_6, *best_results;
> uint32_t wm[5];
> + enum hsw_data_buf_partitioning partitioning;
> +
> + hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2, &lp_max_5_6);
> +
> + hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results_1_2);
> + if (lp_max_1_2.pri != lp_max_5_6.pri) {
> + hsw_compute_wm_results(dev, params, wm, &lp_max_5_6,
> + &results_5_6);
> + best_results = hsw_find_best_result(&results_1_2, &results_5_6);
> + } else {
> + best_results = &results_1_2;
> + }
> +
> + partitioning = (best_results == &results_1_2) ?
> + HSW_DATA_BUF_PART_1_2 : HSW_DATA_BUF_PART_5_6;
>
> - hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
> - hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
> - hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
> + hsw_write_wm_values(dev_priv, best_results, partitioning);
> }
>
> static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
> --
> 1.8.1.2
>
--
Ville Syrjälä
Intel OTC
* Re: [PATCH 2/3] drm/i915: properly set HSW WM_LP watermarks
2013-05-31 13:12 ` [PATCH 2/3] " Paulo Zanoni
@ 2013-05-31 13:58 ` Ville Syrjälä
2013-05-31 14:45 ` Paulo Zanoni
0 siblings, 1 reply; 29+ messages in thread
From: Ville Syrjälä @ 2013-05-31 13:58 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni
On Fri, May 31, 2013 at 10:12:22AM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> We were previously only setting the WM_PIPE registers, now we are
> setting the LP watermark registers. This should allow deeper PC
> states, resulting in power savings.
>
> We're only using 1/2 data buffer partitioning for now.
>
> v2: Merge both hsw_compute_pri_wm_* functions (Ville)
> v3: - Simplify hsw_compute_wm_results (Ville)
> - Rebase due to changes on the previous patch
>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
> drivers/gpu/drm/i915/i915_reg.h | 4 +
> drivers/gpu/drm/i915/intel_pm.c | 180 ++++++++++++++++++++++++++++++++++++----
> 2 files changed, 166 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 5a49f8a..8176ba9 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -3084,6 +3084,10 @@
> #define WM3S_LP_IVB 0x45128
> #define WM1S_LP_EN (1<<31)
>
> +#define HSW_WM_LP_VAL(lat, fbc, pri, cur) \
> + (WM3_LP_EN | ((lat) << WM1_LP_LATENCY_SHIFT) | \
> + ((fbc) << WM1_LP_FBC_SHIFT) | ((pri) << WM1_LP_SR_SHIFT) | (cur))
> +
> /* Memory latency timer register */
> #define MLTR_ILK 0x11222
> #define MLTR_WM1_SHIFT 0
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index fda7279..3ff9ff3 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2129,6 +2129,12 @@ static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
> return ret;
> }
>
> +static uint32_t hsw_wm_fbc(uint32_t pri_val, uint32_t horiz_pixels,
> + uint8_t bytes_per_pixel)
> +{
> + return DIV_ROUND_UP(pri_val * 64, horiz_pixels * bytes_per_pixel) + 2;
> +}
> +
> struct hsw_pipe_wm_parameters {
> bool active;
> bool sprite_enabled;
> @@ -2142,11 +2148,28 @@ struct hsw_pipe_wm_parameters {
> uint32_t pixel_rate;
> };
>
> +struct hsw_wm_maximums {
> + uint16_t pri;
> + uint16_t spr;
> + uint16_t cur;
> + uint16_t fbc;
> +};
> +
> +struct hsw_lp_wm_result {
> + bool enable;
> + bool fbc_enable;
> + uint32_t pri_val;
> + uint32_t spr_val;
> + uint32_t cur_val;
> + uint32_t fbc_val;
> +};
> +
> struct hsw_wm_values {
> uint32_t wm_pipe[3];
> uint32_t wm_lp[3];
> uint32_t wm_lp_spr[3];
> uint32_t wm_linetime[3];
> + bool enable_fbc_wm;
> };
>
> enum hsw_data_buf_partitioning {
> @@ -2154,17 +2177,31 @@ enum hsw_data_buf_partitioning {
> HSW_DATA_BUF_PART_5_6,
> };
>
> -/* Only for WM_PIPE. */
> -static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
> - uint32_t mem_value)
> +/* For both WM_PIPE and WM_LP. */
> +static uint32_t hsw_compute_pri_wm(struct hsw_pipe_wm_parameters *params,
> + uint32_t mem_value,
> + bool is_lp)
> {
> + uint32_t method1, method2;
> +
> /* TODO: for now, assume the primary plane is always enabled. */
> if (!params->active)
> return 0;
>
> - return hsw_wm_method1(params->pixel_rate,
> - params->pri_bytes_per_pixel,
> - mem_value);
> + method1 = hsw_wm_method1(params->pixel_rate,
> + params->pri_bytes_per_pixel,
> + mem_value);
> +
> + if (!is_lp)
> + return method1;
> +
> + method2 = hsw_wm_method2(params->pixel_rate,
> + params->pipe_htotal,
> + params->pri_horiz_pixels,
> + params->pri_bytes_per_pixel,
> + mem_value);
> +
> + return min(method1, method2);
> }
>
> /* For both WM_PIPE and WM_LP. */
> @@ -2201,13 +2238,60 @@ static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
> mem_value);
> }
>
> +/* Only for WM_LP. */
> +static uint32_t hsw_compute_fbc_wm(struct hsw_pipe_wm_parameters *params,
> + uint32_t pri_val,
> + uint32_t mem_value)
> +{
> + if (!params->active)
> + return 0;
> +
> + return hsw_wm_fbc(pri_val,
> + params->pri_horiz_pixels,
> + params->pri_bytes_per_pixel);
> +}
> +
> +static bool hsw_compute_lp_wm(uint32_t mem_value, struct hsw_wm_maximums *max,
> + struct hsw_pipe_wm_parameters *params,
> + struct hsw_lp_wm_result *result)
> +{
> + enum pipe pipe;
> + uint32_t pri_val[3], spr_val[3], cur_val[3], fbc_val[3];
> +
> + for (pipe = PIPE_A; pipe <= PIPE_C; pipe++) {
> + struct hsw_pipe_wm_parameters *p = &params[pipe];
> +
> + pri_val[pipe] = hsw_compute_pri_wm(p, mem_value, true);
> + spr_val[pipe] = hsw_compute_spr_wm(p, mem_value);
> + cur_val[pipe] = hsw_compute_cur_wm(p, mem_value);
> + fbc_val[pipe] = hsw_compute_fbc_wm(p, pri_val[pipe], mem_value);
> + }
> +
> + result->pri_val = max3(pri_val[0], pri_val[1], pri_val[2]);
> + result->spr_val = max3(spr_val[0], spr_val[1], spr_val[2]);
> + result->cur_val = max3(cur_val[0], cur_val[1], cur_val[2]);
> + result->fbc_val = max3(fbc_val[0], fbc_val[1], fbc_val[2]);
> +
> + if (result->fbc_val > max->fbc) {
> + result->fbc_enable = false;
> + result->fbc_val = 0;
> + } else {
> + result->fbc_enable = true;
> + }
> +
> + result->enable = result->pri_val <= max->pri &&
> + result->spr_val <= max->spr &&
> + result->cur_val <= max->cur;
> + return result->enable;
> +}
> +
> static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
> uint32_t mem_value, enum pipe pipe,
> struct hsw_pipe_wm_parameters *params)
> {
> uint32_t pri_val, cur_val, spr_val;
>
> - pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
> + pri_val = hsw_compute_pri_wm(params, mem_value, false);
> spr_val = hsw_compute_spr_wm(params, mem_value);
> cur_val = hsw_compute_cur_wm(params, mem_value);
>
> @@ -2250,13 +2334,15 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
>
> static void hsw_compute_wm_parameters(struct drm_device *dev,
> struct hsw_pipe_wm_parameters *params,
> - uint32_t *wm)
> + uint32_t *wm,
> + struct hsw_wm_maximums *lp_max_1_2)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_crtc *crtc;
> struct drm_plane *plane;
> uint64_t sskpd = I915_READ64(MCH_SSKPD);
> enum pipe pipe;
> + int pipes_active = 0, sprites_enabled = 0;
>
> if ((sskpd >> 56) & 0xFF)
> wm[0] = (sskpd >> 56) & 0xFF;
> @@ -2278,6 +2364,8 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
> if (!p->active)
> continue;
>
> + pipes_active++;
> +
> p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
> p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
> p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
> @@ -2297,25 +2385,67 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
> p->sprite_enabled = intel_plane->wm.enable;
> p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
> p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
> +
> + if (p->sprite_enabled)
> + sprites_enabled++;
> + }
> +
> + if (pipes_active > 1) {
> + lp_max_1_2->pri = sprites_enabled ? 128 : 256;
> + lp_max_1_2->spr = 128;
> + lp_max_1_2->cur = 64;
> + } else {
> + lp_max_1_2->pri = sprites_enabled ? 384 : 768;
> + lp_max_1_2->spr = 384;
> + lp_max_1_2->cur = 255;
> }
> + lp_max_1_2->fbc = 15;
> }
>
> static void hsw_compute_wm_results(struct drm_device *dev,
> struct hsw_pipe_wm_parameters *params,
> uint32_t *wm,
> + struct hsw_wm_maximums *lp_maximums,
> struct hsw_wm_values *results)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_crtc *crtc;
> + struct hsw_lp_wm_result lp_results[4] = {};
> enum pipe pipe;
> + int level, max_level;
> +
> + for (level = 1; level <= 4; level++)
> + if (!hsw_compute_lp_wm(wm[level], lp_maximums, params,
> + &lp_results[level - 1]))
> + break;
> + max_level = level - 1;
> +
> + /* The spec says it is preferred to disable FBC WMs instead of disabling
> + * a WM level. */
> + results->enable_fbc_wm = true;
> + for (level = 1; level <= max_level; level++) {
> + if (!lp_results[level - 1].fbc_enable) {
> + results->enable_fbc_wm = false;
> + break;
> + }
> + }
> +
> + memset(results, 0, sizeof(*results));
> + for (level = 1; level <= max_level; level++) {
> + const struct hsw_lp_wm_result *r;
> + int used_level;
Now you're calling both things "level". That's confusing. I'd just keep
"level" to mean the same thing as in BSpec (it goes from 0 to 4), and
name the other thing something else. Since the registers are called WM_LP_n,
I suggested "wm_lp" as the name of the variable earlier.
Also the loop shouldn't go up to max_level, it should just go up to 3
since we have just three WM_LP registers. Sure the used_level>max_level
check should break out before we overrun the array, but that's very
non-obvious.
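A minimal sketch of the mapping being discussed, assuming "level" keeps its
BSpec meaning and "wm_lp" is the 1-based WM_LP register index.
wm_lp_to_level() is a hypothetical helper name, not something in the driver:

```c
#include <assert.h>

/*
 * Map a WM_LP register index (1..3) to the watermark level (1..4) it would
 * carry. Returns 0 when that register should stay disabled.
 */
static int wm_lp_to_level(int wm_lp, int max_level)
{
	int level;

	assert(wm_lp >= 1 && wm_lp <= 3);

	/* With all four levels usable, WM_LP2/3 carry levels 3/4; level 2 is skipped. */
	level = (max_level == 4 && wm_lp > 1) ? wm_lp + 1 : wm_lp;

	return level > max_level ? 0 : level;
}
```

Looping wm_lp from 1 to 3 keeps the array bound visible in the loop itself,
and the level > max_level check only decides whether a register is used.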
>
> - /* No support for LP WMs yet. */
> - results->wm_lp[2] = 0;
> - results->wm_lp[1] = 0;
> - results->wm_lp[0] = 0;
> - results->wm_lp_spr[2] = 0;
> - results->wm_lp_spr[1] = 0;
> - results->wm_lp_spr[0] = 0;
> + used_level = (max_level == 4 && level > 1) ? level + 1 : level;
> + if (used_level > max_level)
> + break;
> +
> + r = &lp_results[used_level - 1];
> + results->wm_lp[level - 1] = HSW_WM_LP_VAL(used_level * 2,
> + r->fbc_val,
> + r->pri_val,
> + r->cur_val);
> + results->wm_lp_spr[level - 1] = r->spr_val;
> + }
>
> for_each_pipe(pipe)
> results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
> @@ -2339,6 +2469,7 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> struct hsw_wm_values previous;
> uint32_t val;
> enum hsw_data_buf_partitioning prev_partitioning;
> + bool prev_enable_fbc_wm;
>
> previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
> previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
> @@ -2356,6 +2487,8 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
> HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
>
> + prev_enable_fbc_wm = !(I915_READ(DISP_ARB_CTL) & DISP_FBC_WM_DIS);
> +
> if (memcmp(results->wm_pipe, previous.wm_pipe,
> sizeof(results->wm_pipe)) == 0 &&
> memcmp(results->wm_lp, previous.wm_lp,
> @@ -2364,7 +2497,8 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> sizeof(results->wm_lp_spr)) == 0 &&
> memcmp(results->wm_linetime, previous.wm_linetime,
> sizeof(results->wm_linetime)) == 0 &&
> - partitioning == prev_partitioning)
> + partitioning == prev_partitioning &&
> + results->enable_fbc_wm == prev_enable_fbc_wm)
> return;
>
> if (previous.wm_lp[2] != 0)
> @@ -2397,6 +2531,15 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> I915_WRITE(WM_MISC, val);
> }
>
> + if (prev_enable_fbc_wm != results->enable_fbc_wm) {
> + val = I915_READ(DISP_ARB_CTL);
> + if (results->enable_fbc_wm)
> + val &= ~DISP_FBC_WM_DIS;
> + else
> + val |= DISP_FBC_WM_DIS;
> + I915_WRITE(DISP_ARB_CTL, val);
> + }
> +
> if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
> I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
> if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
> @@ -2415,12 +2558,13 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> static void haswell_update_wm(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> + struct hsw_wm_maximums lp_max_1_2;
> struct hsw_pipe_wm_parameters params[3];
> struct hsw_wm_values results;
> uint32_t wm[5];
>
> - hsw_compute_wm_parameters(dev, params, wm);
> - hsw_compute_wm_results(dev, params, wm, &results);
> + hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
> + hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
> hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
> }
>
> --
> 1.8.1.2
>
--
Ville Syrjälä
Intel OTC
* [PATCH 2/3] drm/i915: properly set HSW WM_LP watermarks
2013-05-31 13:58 ` Ville Syrjälä
@ 2013-05-31 14:45 ` Paulo Zanoni
2013-05-31 15:05 ` Ville Syrjälä
0 siblings, 1 reply; 29+ messages in thread
From: Paulo Zanoni @ 2013-05-31 14:45 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Paulo Zanoni <paulo.r.zanoni@intel.com>
We were previously only setting the WM_PIPE registers, now we are
setting the LP watermark registers. This should allow deeper PC
states, resulting in power savings.
We're only using 1/2 data buffer partitioning for now.
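As an illustration of what ends up in each WM_LP register, here is a
standalone sketch of the packing done by HSW_WM_LP_VAL. The SKETCH_* field
positions are assumptions for the example (the real ones are in i915_reg.h),
and pack_wm_lp() is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions, for illustration only. */
#define SKETCH_LP_EN            (1u << 31)
#define SKETCH_LP_LATENCY_SHIFT 24
#define SKETCH_LP_FBC_SHIFT     20
#define SKETCH_LP_SR_SHIFT      8

/* Pack one LP watermark register value for a given level (1..4). */
static uint32_t pack_wm_lp(unsigned int level, uint32_t fbc,
			   uint32_t pri, uint32_t cur)
{
	assert(level >= 1 && level <= 4);

	/* The patch programs the latency field as level * 2. */
	return SKETCH_LP_EN |
	       ((uint32_t)(level * 2) << SKETCH_LP_LATENCY_SHIFT) |
	       (fbc << SKETCH_LP_FBC_SHIFT) |
	       (pri << SKETCH_LP_SR_SHIFT) |
	       cur;
}
```

The values passed in are expected to have been checked against the
per-partitioning maximums first, so they fit in their fields.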
v2: Merge both hsw_compute_pri_wm_* functions (Ville)
v3: - Simplify hsw_compute_wm_results (Ville)
- Rebase due to changes on the previous patch
v4: Unconfuse wm_lp/level (Ville)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 4 +
drivers/gpu/drm/i915/intel_pm.c | 179 ++++++++++++++++++++++++++++++++++++----
2 files changed, 165 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 5a49f8a..8176ba9 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3084,6 +3084,10 @@
#define WM3S_LP_IVB 0x45128
#define WM1S_LP_EN (1<<31)
+#define HSW_WM_LP_VAL(lat, fbc, pri, cur) \
+ (WM3_LP_EN | ((lat) << WM1_LP_LATENCY_SHIFT) | \
+ ((fbc) << WM1_LP_FBC_SHIFT) | ((pri) << WM1_LP_SR_SHIFT) | (cur))
+
/* Memory latency timer register */
#define MLTR_ILK 0x11222
#define MLTR_WM1_SHIFT 0
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index fda7279..1373552 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2129,6 +2129,12 @@ static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
return ret;
}
+static uint32_t hsw_wm_fbc(uint32_t pri_val, uint32_t horiz_pixels,
+ uint8_t bytes_per_pixel)
+{
+ return DIV_ROUND_UP(pri_val * 64, horiz_pixels * bytes_per_pixel) + 2;
+}
+
struct hsw_pipe_wm_parameters {
bool active;
bool sprite_enabled;
@@ -2142,11 +2148,28 @@ struct hsw_pipe_wm_parameters {
uint32_t pixel_rate;
};
+struct hsw_wm_maximums {
+ uint16_t pri;
+ uint16_t spr;
+ uint16_t cur;
+ uint16_t fbc;
+};
+
+struct hsw_lp_wm_result {
+ bool enable;
+ bool fbc_enable;
+ uint32_t pri_val;
+ uint32_t spr_val;
+ uint32_t cur_val;
+ uint32_t fbc_val;
+};
+
struct hsw_wm_values {
uint32_t wm_pipe[3];
uint32_t wm_lp[3];
uint32_t wm_lp_spr[3];
uint32_t wm_linetime[3];
+ bool enable_fbc_wm;
};
enum hsw_data_buf_partitioning {
@@ -2154,17 +2177,31 @@ enum hsw_data_buf_partitioning {
HSW_DATA_BUF_PART_5_6,
};
-/* Only for WM_PIPE. */
-static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
- uint32_t mem_value)
+/* For both WM_PIPE and WM_LP. */
+static uint32_t hsw_compute_pri_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t mem_value,
+ bool is_lp)
{
+ uint32_t method1, method2;
+
/* TODO: for now, assume the primary plane is always enabled. */
if (!params->active)
return 0;
- return hsw_wm_method1(params->pixel_rate,
- params->pri_bytes_per_pixel,
- mem_value);
+ method1 = hsw_wm_method1(params->pixel_rate,
+ params->pri_bytes_per_pixel,
+ mem_value);
+
+ if (!is_lp)
+ return method1;
+
+ method2 = hsw_wm_method2(params->pixel_rate,
+ params->pipe_htotal,
+ params->pri_horiz_pixels,
+ params->pri_bytes_per_pixel,
+ mem_value);
+
+ return min(method1, method2);
}
/* For both WM_PIPE and WM_LP. */
@@ -2201,13 +2238,60 @@ static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
mem_value);
}
+/* Only for WM_LP. */
+static uint32_t hsw_compute_fbc_wm(struct hsw_pipe_wm_parameters *params,
+ uint32_t pri_val,
+ uint32_t mem_value)
+{
+ if (!params->active)
+ return 0;
+
+ return hsw_wm_fbc(pri_val,
+ params->pri_horiz_pixels,
+ params->pri_bytes_per_pixel);
+}
+
+static bool hsw_compute_lp_wm(uint32_t mem_value, struct hsw_wm_maximums *max,
+ struct hsw_pipe_wm_parameters *params,
+ struct hsw_lp_wm_result *result)
+{
+ enum pipe pipe;
+ uint32_t pri_val[3], spr_val[3], cur_val[3], fbc_val[3];
+
+ for (pipe = PIPE_A; pipe <= PIPE_C; pipe++) {
+ struct hsw_pipe_wm_parameters *p = &params[pipe];
+
+ pri_val[pipe] = hsw_compute_pri_wm(p, mem_value, true);
+ spr_val[pipe] = hsw_compute_spr_wm(p, mem_value);
+ cur_val[pipe] = hsw_compute_cur_wm(p, mem_value);
+ fbc_val[pipe] = hsw_compute_fbc_wm(p, pri_val[pipe], mem_value);
+ }
+
+ result->pri_val = max3(pri_val[0], pri_val[1], pri_val[2]);
+ result->spr_val = max3(spr_val[0], spr_val[1], spr_val[2]);
+ result->cur_val = max3(cur_val[0], cur_val[1], cur_val[2]);
+ result->fbc_val = max3(fbc_val[0], fbc_val[1], fbc_val[2]);
+
+ if (result->fbc_val > max->fbc) {
+ result->fbc_enable = false;
+ result->fbc_val = 0;
+ } else {
+ result->fbc_enable = true;
+ }
+
+ result->enable = result->pri_val <= max->pri &&
+ result->spr_val <= max->spr &&
+ result->cur_val <= max->cur;
+ return result->enable;
+}
+
static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
uint32_t mem_value, enum pipe pipe,
struct hsw_pipe_wm_parameters *params)
{
uint32_t pri_val, cur_val, spr_val;
- pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
+ pri_val = hsw_compute_pri_wm(params, mem_value, false);
spr_val = hsw_compute_spr_wm(params, mem_value);
cur_val = hsw_compute_cur_wm(params, mem_value);
@@ -2250,13 +2334,15 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
static void hsw_compute_wm_parameters(struct drm_device *dev,
struct hsw_pipe_wm_parameters *params,
- uint32_t *wm)
+ uint32_t *wm,
+ struct hsw_wm_maximums *lp_max_1_2)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
struct drm_plane *plane;
uint64_t sskpd = I915_READ64(MCH_SSKPD);
enum pipe pipe;
+ int pipes_active = 0, sprites_enabled = 0;
if ((sskpd >> 56) & 0xFF)
wm[0] = (sskpd >> 56) & 0xFF;
@@ -2278,6 +2364,8 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
if (!p->active)
continue;
+ pipes_active++;
+
p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
@@ -2297,25 +2385,66 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
p->sprite_enabled = intel_plane->wm.enable;
p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
+
+ if (p->sprite_enabled)
+ sprites_enabled++;
+ }
+
+ if (pipes_active > 1) {
+ lp_max_1_2->pri = sprites_enabled ? 128 : 256;
+ lp_max_1_2->spr = 128;
+ lp_max_1_2->cur = 64;
+ } else {
+ lp_max_1_2->pri = sprites_enabled ? 384 : 768;
+ lp_max_1_2->spr = 384;
+ lp_max_1_2->cur = 255;
}
+ lp_max_1_2->fbc = 15;
}
static void hsw_compute_wm_results(struct drm_device *dev,
struct hsw_pipe_wm_parameters *params,
uint32_t *wm,
+ struct hsw_wm_maximums *lp_maximums,
struct hsw_wm_values *results)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc;
+ struct hsw_lp_wm_result lp_results[4] = {};
enum pipe pipe;
+ int level, max_level, wm_lp;
+
+ for (level = 1; level <= 4; level++)
+ if (!hsw_compute_lp_wm(wm[level], lp_maximums, params,
+ &lp_results[level - 1]))
+ break;
+ max_level = level - 1;
+
+ memset(results, 0, sizeof(*results));
+
+ /* The spec says it is preferred to disable FBC WMs instead of disabling
+ * a WM level. */
+ results->enable_fbc_wm = true;
+ for (level = 1; level <= max_level; level++) {
+ if (!lp_results[level - 1].fbc_enable) {
+ results->enable_fbc_wm = false;
+ break;
+ }
+ }
+ for (wm_lp = 1; wm_lp <= 3; wm_lp++) {
+ const struct hsw_lp_wm_result *r;
- /* No support for LP WMs yet. */
- results->wm_lp[2] = 0;
- results->wm_lp[1] = 0;
- results->wm_lp[0] = 0;
- results->wm_lp_spr[2] = 0;
- results->wm_lp_spr[1] = 0;
- results->wm_lp_spr[0] = 0;
+ level = (max_level == 4 && wm_lp > 1) ? wm_lp + 1 : wm_lp;
+ if (level > max_level)
+ break;
+
+ r = &lp_results[level - 1];
+ results->wm_lp[wm_lp - 1] = HSW_WM_LP_VAL(level * 2,
+ r->fbc_val,
+ r->pri_val,
+ r->cur_val);
+ results->wm_lp_spr[wm_lp - 1] = r->spr_val;
+ }
for_each_pipe(pipe)
results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
@@ -2339,6 +2468,7 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
struct hsw_wm_values previous;
uint32_t val;
enum hsw_data_buf_partitioning prev_partitioning;
+ bool prev_enable_fbc_wm;
previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
@@ -2356,6 +2486,8 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
+ prev_enable_fbc_wm = !(I915_READ(DISP_ARB_CTL) & DISP_FBC_WM_DIS);
+
if (memcmp(results->wm_pipe, previous.wm_pipe,
sizeof(results->wm_pipe)) == 0 &&
memcmp(results->wm_lp, previous.wm_lp,
@@ -2364,7 +2496,8 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
sizeof(results->wm_lp_spr)) == 0 &&
memcmp(results->wm_linetime, previous.wm_linetime,
sizeof(results->wm_linetime)) == 0 &&
- partitioning == prev_partitioning)
+ partitioning == prev_partitioning &&
+ results->enable_fbc_wm == prev_enable_fbc_wm)
return;
if (previous.wm_lp[2] != 0)
@@ -2397,6 +2530,15 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
I915_WRITE(WM_MISC, val);
}
+ if (prev_enable_fbc_wm != results->enable_fbc_wm) {
+ val = I915_READ(DISP_ARB_CTL);
+ if (results->enable_fbc_wm)
+ val &= ~DISP_FBC_WM_DIS;
+ else
+ val |= DISP_FBC_WM_DIS;
+ I915_WRITE(DISP_ARB_CTL, val);
+ }
+
if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
@@ -2415,12 +2557,13 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
static void haswell_update_wm(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
+ struct hsw_wm_maximums lp_max_1_2;
struct hsw_pipe_wm_parameters params[3];
struct hsw_wm_values results;
uint32_t wm[5];
- hsw_compute_wm_parameters(dev, params, wm);
- hsw_compute_wm_results(dev, params, wm, &results);
+ hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
+ hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
}
--
1.8.1.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 1/3] drm/i915: properly set HSW WM_PIPE registers
2013-05-31 13:08 ` [PATCH 1/3] " Paulo Zanoni
@ 2013-05-31 15:03 ` Ville Syrjälä
0 siblings, 0 replies; 29+ messages in thread
From: Ville Syrjälä @ 2013-05-31 15:03 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni
On Fri, May 31, 2013 at 10:08:35AM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> We were previously calling sandybridge_update_wm on HSW, but the SNB
> function didn't really match the HSW specification, so we were just
> writing the wrong values.
>
> With this patch, the haswell_update_wm function will set the correct
> values for the WM_PIPE registers, but it will still keep all the LP
> watermarks disabled.
>
> The patch may look a little bit over-complicated for now, but it's
> because much of the infrastructure for setting the LP watermarks is
> already in place, so we won't have too much code churn on the patch
> that sets the LP watermarks.
>
> v2: - Fix pixel_rate on panel fitter case (Ville)
> - Try to not overflow (Ville)
> - Remove useless variable (Ville)
> - Fix p->pri_horiz_pixels (Paulo)
> v3: - Fix rounding errors on hsw_wm_method2 (Ville)
> v4: - Fix memcmp bug (Paulo)
>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
> drivers/gpu/drm/i915/i915_reg.h | 3 +
> drivers/gpu/drm/i915/intel_pm.c | 342 +++++++++++++++++++++++++++++++++++++---
> 2 files changed, 327 insertions(+), 18 deletions(-)
>
>
> While doing some more tests I found a memcmp bug that can be reproduced with
> some 2-screen configurations. This patch fixes it.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
and shame on me for not catching it during review.
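
For anyone following the arithmetic, the two formulas in the quoted patch (hsw_wm_method1 and hsw_wm_method2) can be tried outside the kernel. The sketch below is a userspace restatement, not the driver code: the kernel's DIV_ROUND_UP and DIV_ROUND_UP_ULL helpers are replaced by a local macro, and the sketch_ names are made up for this example. The division by 10000 suggests pixel_rate is in kHz and latency in units of 0.1 us, but that reading is an assumption here.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's DIV_ROUND_UP helpers (assumption). */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Method 1: bytes fetched during the latency window, expressed in
 * 64-byte units and rounded up, plus 2. The 64-bit intermediate keeps
 * the three-way product from overflowing.
 */
uint32_t sketch_wm_method1(uint32_t pixel_rate, uint8_t bytes_per_pixel,
			   uint32_t latency)
{
	uint64_t ret = (uint64_t)pixel_rate * bytes_per_pixel * latency;

	return (uint32_t)(SKETCH_DIV_ROUND_UP(ret, (uint64_t)64 * 10000) + 2);
}

/*
 * Method 2: number of whole lines covered by the latency window
 * (rounded down, plus one), times the bytes in one plane line,
 * expressed in 64-byte units and rounded up, plus 2.
 */
uint32_t sketch_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
			   uint32_t horiz_pixels, uint8_t bytes_per_pixel,
			   uint32_t latency)
{
	uint32_t ret;

	ret = (latency * pixel_rate) / (pipe_htotal * 10000);
	ret = (ret + 1) * horiz_pixels * bytes_per_pixel;
	return SKETCH_DIV_ROUND_UP(ret, 64) + 2;
}
```

With a 138780 kHz pixel rate, 4 bytes per pixel, and hypothetical values of 20 for the latency and 2080 for htotal, method 1 gives 20 and method 2 gives 122 for a 1920-pixel-wide plane. The patch takes the minimum of the two for sprites, and uses method 1 alone for the primary plane in WM_PIPE.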
>
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index dbd9de5..5a49f8a 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -4931,6 +4931,9 @@
> #define SFUSE_STRAP_DDIC_DETECTED (1<<1)
> #define SFUSE_STRAP_DDID_DETECTED (1<<0)
>
> +#define WM_MISC 0x45260
> +#define WM_MISC_DATA_PARTITION_5_6 (1 << 0)
> +
> #define WM_DBG 0x45280
> #define WM_DBG_DISALLOW_MULTIPLE_LP (1<<0)
> #define WM_DBG_DISALLOW_MAXFIFO (1<<1)
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 9328ed9..fda7279 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2072,19 +2072,170 @@ static void ivybridge_update_wm(struct drm_device *dev)
> cursor_wm);
> }
>
> -static void
> -haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> +static uint32_t hsw_wm_get_pixel_rate(struct drm_device *dev,
> + struct drm_crtc *crtc)
> +{
> + struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + uint32_t pixel_rate, pfit_size;
> +
> + if (intel_crtc->config.pixel_target_clock)
> + pixel_rate = intel_crtc->config.pixel_target_clock;
> + else
> + pixel_rate = intel_crtc->config.adjusted_mode.clock;
> +
> + /* We only use IF-ID interlacing. If we ever use PF-ID we'll need to
> + * adjust the pixel_rate here. */
> +
> + pfit_size = intel_crtc->config.pch_pfit.size;
> + if (pfit_size) {
> + uint64_t pipe_w, pipe_h, pfit_w, pfit_h;
> +
> + pipe_w = intel_crtc->config.requested_mode.hdisplay;
> + pipe_h = intel_crtc->config.requested_mode.vdisplay;
> + pfit_w = (pfit_size >> 16) & 0xFFFF;
> + pfit_h = pfit_size & 0xFFFF;
> + if (pipe_w < pfit_w)
> + pipe_w = pfit_w;
> + if (pipe_h < pfit_h)
> + pipe_h = pfit_h;
> +
> + pixel_rate = div_u64((uint64_t) pixel_rate * pipe_w * pipe_h,
> + pfit_w * pfit_h);
> + }
> +
> + return pixel_rate;
> +}
> +
> +static uint32_t hsw_wm_method1(uint32_t pixel_rate, uint8_t bytes_per_pixel,
> + uint32_t latency)
> +{
> + uint64_t ret;
> +
> + ret = (uint64_t) pixel_rate * bytes_per_pixel * latency;
> + ret = DIV_ROUND_UP_ULL(ret, 64 * 10000) + 2;
> +
> + return ret;
> +}
> +
> +static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
> + uint32_t horiz_pixels, uint8_t bytes_per_pixel,
> + uint32_t latency)
> +{
> + uint32_t ret;
> +
> + ret = (latency * pixel_rate) / (pipe_htotal * 10000);
> + ret = (ret + 1) * horiz_pixels * bytes_per_pixel;
> + ret = DIV_ROUND_UP(ret, 64) + 2;
> + return ret;
> +}
> +
> +struct hsw_pipe_wm_parameters {
> + bool active;
> + bool sprite_enabled;
> + uint8_t pri_bytes_per_pixel;
> + uint8_t spr_bytes_per_pixel;
> + uint8_t cur_bytes_per_pixel;
> + uint32_t pri_horiz_pixels;
> + uint32_t spr_horiz_pixels;
> + uint32_t cur_horiz_pixels;
> + uint32_t pipe_htotal;
> + uint32_t pixel_rate;
> +};
> +
> +struct hsw_wm_values {
> + uint32_t wm_pipe[3];
> + uint32_t wm_lp[3];
> + uint32_t wm_lp_spr[3];
> + uint32_t wm_linetime[3];
> +};
> +
> +enum hsw_data_buf_partitioning {
> + HSW_DATA_BUF_PART_1_2,
> + HSW_DATA_BUF_PART_5_6,
> +};
> +
> +/* Only for WM_PIPE. */
> +static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
> + uint32_t mem_value)
> +{
> + /* TODO: for now, assume the primary plane is always enabled. */
> + if (!params->active)
> + return 0;
> +
> + return hsw_wm_method1(params->pixel_rate,
> + params->pri_bytes_per_pixel,
> + mem_value);
> +}
> +
> +/* For both WM_PIPE and WM_LP. */
> +static uint32_t hsw_compute_spr_wm(struct hsw_pipe_wm_parameters *params,
> + uint32_t mem_value)
> +{
> + uint32_t method1, method2;
> +
> + if (!params->active || !params->sprite_enabled)
> + return 0;
> +
> + method1 = hsw_wm_method1(params->pixel_rate,
> + params->spr_bytes_per_pixel,
> + mem_value);
> + method2 = hsw_wm_method2(params->pixel_rate,
> + params->pipe_htotal,
> + params->spr_horiz_pixels,
> + params->spr_bytes_per_pixel,
> + mem_value);
> + return min(method1, method2);
> +}
> +
> +/* For both WM_PIPE and WM_LP. */
> +static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
> + uint32_t mem_value)
> +{
> + if (!params->active)
> + return 0;
> +
> + return hsw_wm_method2(params->pixel_rate,
> + params->pipe_htotal,
> + params->cur_horiz_pixels,
> + params->cur_bytes_per_pixel,
> + mem_value);
> +}
> +
> +static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
> + uint32_t mem_value, enum pipe pipe,
> + struct hsw_pipe_wm_parameters *params)
> +{
> + uint32_t pri_val, cur_val, spr_val;
> +
> + pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
> + spr_val = hsw_compute_spr_wm(params, mem_value);
> + cur_val = hsw_compute_cur_wm(params, mem_value);
> +
> + WARN(pri_val > 127,
> + "Primary WM error, mode not supported for pipe %c\n",
> + pipe_name(pipe));
> + WARN(spr_val > 127,
> + "Sprite WM error, mode not supported for pipe %c\n",
> + pipe_name(pipe));
> + WARN(cur_val > 63,
> + "Cursor WM error, mode not supported for pipe %c\n",
> + pipe_name(pipe));
> +
> + return (pri_val << WM0_PIPE_PLANE_SHIFT) |
> + (spr_val << WM0_PIPE_SPRITE_SHIFT) |
> + cur_val;
> +}
> +
> +static uint32_t
> +hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> - enum pipe pipe = intel_crtc->pipe;
> struct drm_display_mode *mode = &intel_crtc->config.adjusted_mode;
> u32 linetime, ips_linetime;
>
> - if (!intel_crtc_active(crtc)) {
> - I915_WRITE(PIPE_WM_LINETIME(pipe), 0);
> - return;
> - }
> + if (!intel_crtc_active(crtc))
> + return 0;
>
> /* The WM are computed with base on how long it takes to fill a single
> * row at the given clock rate, multiplied by 8.
> @@ -2093,29 +2244,184 @@ haswell_update_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> ips_linetime = DIV_ROUND_CLOSEST(mode->htotal * 1000 * 8,
> intel_ddi_get_cdclk_freq(dev_priv));
>
> - I915_WRITE(PIPE_WM_LINETIME(pipe),
> - PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
> - PIPE_WM_LINETIME_TIME(linetime));
> + return PIPE_WM_LINETIME_IPS_LINETIME(ips_linetime) |
> + PIPE_WM_LINETIME_TIME(linetime);
> }
>
> -static void haswell_update_wm(struct drm_device *dev)
> +static void hsw_compute_wm_parameters(struct drm_device *dev,
> + struct hsw_pipe_wm_parameters *params,
> + uint32_t *wm)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_crtc *crtc;
> + struct drm_plane *plane;
> + uint64_t sskpd = I915_READ64(MCH_SSKPD);
> enum pipe pipe;
>
> - /* Disable the LP WMs before changine the linetime registers. This is
> - * just a temporary code that will be replaced soon. */
> - I915_WRITE(WM3_LP_ILK, 0);
> - I915_WRITE(WM2_LP_ILK, 0);
> - I915_WRITE(WM1_LP_ILK, 0);
> + if ((sskpd >> 56) & 0xFF)
> + wm[0] = (sskpd >> 56) & 0xFF;
> + else
> + wm[0] = sskpd & 0xF;
> + wm[1] = ((sskpd >> 4) & 0xFF) * 5;
> + wm[2] = ((sskpd >> 12) & 0xFF) * 5;
> + wm[3] = ((sskpd >> 20) & 0x1FF) * 5;
> + wm[4] = ((sskpd >> 32) & 0x1FF) * 5;
> +
> + list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
> + struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + struct hsw_pipe_wm_parameters *p;
> +
> + pipe = intel_crtc->pipe;
> + p = &params[pipe];
> +
> + p->active = intel_crtc_active(crtc);
> + if (!p->active)
> + continue;
> +
> + p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
> + p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
> + p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
> + p->cur_bytes_per_pixel = 4;
> + p->pri_horiz_pixels =
> + intel_crtc->config.requested_mode.hdisplay;
> + p->cur_horiz_pixels = 64;
> + }
> +
> + list_for_each_entry(plane, &dev->mode_config.plane_list, head) {
> + struct intel_plane *intel_plane = to_intel_plane(plane);
> + struct hsw_pipe_wm_parameters *p;
> +
> + pipe = intel_plane->pipe;
> + p = &params[pipe];
> +
> + p->sprite_enabled = intel_plane->wm.enable;
> + p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
> + p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
> + }
> +}
> +
> +static void hsw_compute_wm_results(struct drm_device *dev,
> + struct hsw_pipe_wm_parameters *params,
> + uint32_t *wm,
> + struct hsw_wm_values *results)
> +{
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_crtc *crtc;
> + enum pipe pipe;
> +
> + /* No support for LP WMs yet. */
> + results->wm_lp[2] = 0;
> + results->wm_lp[1] = 0;
> + results->wm_lp[0] = 0;
> + results->wm_lp_spr[2] = 0;
> + results->wm_lp_spr[1] = 0;
> + results->wm_lp_spr[0] = 0;
> +
> + for_each_pipe(pipe)
> + results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
> + pipe,
> + &params[pipe]);
>
> for_each_pipe(pipe) {
> crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> - haswell_update_linetime_wm(dev, crtc);
> + results->wm_linetime[pipe] = hsw_compute_linetime_wm(dev, crtc);
> }
> +}
> +
> +/*
> + * The spec says we shouldn't write when we don't need, because every write
> + * causes WMs to be re-evaluated, expending some power.
> + */
> +static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> + struct hsw_wm_values *results,
> + enum hsw_data_buf_partitioning partitioning)
> +{
> + struct hsw_wm_values previous;
> + uint32_t val;
> + enum hsw_data_buf_partitioning prev_partitioning;
> +
> + previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
> + previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
> + previous.wm_pipe[2] = I915_READ(WM0_PIPEC_IVB);
> + previous.wm_lp[0] = I915_READ(WM1_LP_ILK);
> + previous.wm_lp[1] = I915_READ(WM2_LP_ILK);
> + previous.wm_lp[2] = I915_READ(WM3_LP_ILK);
> + previous.wm_lp_spr[0] = I915_READ(WM1S_LP_ILK);
> + previous.wm_lp_spr[1] = I915_READ(WM2S_LP_IVB);
> + previous.wm_lp_spr[2] = I915_READ(WM3S_LP_IVB);
> + previous.wm_linetime[0] = I915_READ(PIPE_WM_LINETIME(PIPE_A));
> + previous.wm_linetime[1] = I915_READ(PIPE_WM_LINETIME(PIPE_B));
> + previous.wm_linetime[2] = I915_READ(PIPE_WM_LINETIME(PIPE_C));
> +
> + prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
> + HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
> +
> + if (memcmp(results->wm_pipe, previous.wm_pipe,
> + sizeof(results->wm_pipe)) == 0 &&
> + memcmp(results->wm_lp, previous.wm_lp,
> + sizeof(results->wm_lp)) == 0 &&
> + memcmp(results->wm_lp_spr, previous.wm_lp_spr,
> + sizeof(results->wm_lp_spr)) == 0 &&
> + memcmp(results->wm_linetime, previous.wm_linetime,
> + sizeof(results->wm_linetime)) == 0 &&
> + partitioning == prev_partitioning)
> + return;
> +
> + if (previous.wm_lp[2] != 0)
> + I915_WRITE(WM3_LP_ILK, 0);
> + if (previous.wm_lp[1] != 0)
> + I915_WRITE(WM2_LP_ILK, 0);
> + if (previous.wm_lp[0] != 0)
> + I915_WRITE(WM1_LP_ILK, 0);
> +
> + if (previous.wm_pipe[0] != results->wm_pipe[0])
> + I915_WRITE(WM0_PIPEA_ILK, results->wm_pipe[0]);
> + if (previous.wm_pipe[1] != results->wm_pipe[1])
> + I915_WRITE(WM0_PIPEB_ILK, results->wm_pipe[1]);
> + if (previous.wm_pipe[2] != results->wm_pipe[2])
> + I915_WRITE(WM0_PIPEC_IVB, results->wm_pipe[2]);
> +
> + if (previous.wm_linetime[0] != results->wm_linetime[0])
> + I915_WRITE(PIPE_WM_LINETIME(PIPE_A), results->wm_linetime[0]);
> + if (previous.wm_linetime[1] != results->wm_linetime[1])
> + I915_WRITE(PIPE_WM_LINETIME(PIPE_B), results->wm_linetime[1]);
> + if (previous.wm_linetime[2] != results->wm_linetime[2])
> + I915_WRITE(PIPE_WM_LINETIME(PIPE_C), results->wm_linetime[2]);
> +
> + if (prev_partitioning != partitioning) {
> + val = I915_READ(WM_MISC);
> + if (partitioning == HSW_DATA_BUF_PART_1_2)
> + val &= ~WM_MISC_DATA_PARTITION_5_6;
> + else
> + val |= WM_MISC_DATA_PARTITION_5_6;
> + I915_WRITE(WM_MISC, val);
> + }
> +
> + if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
> + I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
> + if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
> + I915_WRITE(WM2S_LP_IVB, results->wm_lp_spr[1]);
> + if (previous.wm_lp_spr[2] != results->wm_lp_spr[2])
> + I915_WRITE(WM3S_LP_IVB, results->wm_lp_spr[2]);
> +
> + if (results->wm_lp[0] != 0)
> + I915_WRITE(WM1_LP_ILK, results->wm_lp[0]);
> + if (results->wm_lp[1] != 0)
> + I915_WRITE(WM2_LP_ILK, results->wm_lp[1]);
> + if (results->wm_lp[2] != 0)
> + I915_WRITE(WM3_LP_ILK, results->wm_lp[2]);
> +}
> +
> +static void haswell_update_wm(struct drm_device *dev)
> +{
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + struct hsw_pipe_wm_parameters params[3];
> + struct hsw_wm_values results;
> + uint32_t wm[5];
>
> - sandybridge_update_wm(dev);
> + hsw_compute_wm_parameters(dev, params, wm);
> + hsw_compute_wm_results(dev, params, wm, &results);
> + hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
> }
>
> static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
> --
> 1.8.1.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Ville Syrjälä
Intel OTC
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/3] drm/i915: properly set HSW WM_LP watermarks
2013-05-31 14:45 ` Paulo Zanoni
@ 2013-05-31 15:05 ` Ville Syrjälä
0 siblings, 0 replies; 29+ messages in thread
From: Ville Syrjälä @ 2013-05-31 15:05 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni
On Fri, May 31, 2013 at 11:45:06AM -0300, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> We were previously only setting the WM_PIPE registers, now we are
> setting the LP watermark registers. This should allow deeper PC
> states, resulting in power savings.
>
> We're only using 1/2 data buffer partitioning for now.
>
> v2: Merge both hsw_compute_pri_wm_* functions (Ville)
> v3: - Simplify hsw_compute_wm_results (Ville)
> - Rebase due to changes on the previous patch
> v4: Unconfuse wm_lp/level (Ville)
>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Looks good.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
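
The wm_lp/level distinction from the v4 changelog can be looked at in isolation. There are three WM_LP register slots but up to four computed latency levels, so when all four levels pass the limits check the quoted code skips level 2 and programs levels 1, 3 and 4. A minimal sketch of just that mapping follows; sketch_level_for_slot is a hypothetical helper name, and returning 0 for an unused slot is a convention of this sketch (the patch breaks out of its loop instead).

```c
#include <assert.h>

/*
 * Map a WM_LP register slot (1..3) to the latency level (1..4) it is
 * programmed with, given the highest level that passed the limits
 * check. Returns 0 when the slot stays disabled.
 */
int sketch_level_for_slot(int wm_lp, int max_level)
{
	int level = (max_level == 4 && wm_lp > 1) ? wm_lp + 1 : wm_lp;

	return (level > max_level) ? 0 : level;
}
```

In the patch the chosen level also sets the latency field of the register value, which is written as level * 2.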
> ---
> drivers/gpu/drm/i915/i915_reg.h | 4 +
> drivers/gpu/drm/i915/intel_pm.c | 179 ++++++++++++++++++++++++++++++++++++----
> 2 files changed, 165 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 5a49f8a..8176ba9 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -3084,6 +3084,10 @@
> #define WM3S_LP_IVB 0x45128
> #define WM1S_LP_EN (1<<31)
>
> +#define HSW_WM_LP_VAL(lat, fbc, pri, cur) \
> + (WM3_LP_EN | ((lat) << WM1_LP_LATENCY_SHIFT) | \
> + ((fbc) << WM1_LP_FBC_SHIFT) | ((pri) << WM1_LP_SR_SHIFT) | (cur))
> +
> /* Memory latency timer register */
> #define MLTR_ILK 0x11222
> #define MLTR_WM1_SHIFT 0
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index fda7279..1373552 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2129,6 +2129,12 @@ static uint32_t hsw_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
> return ret;
> }
>
> +static uint32_t hsw_wm_fbc(uint32_t pri_val, uint32_t horiz_pixels,
> + uint8_t bytes_per_pixel)
> +{
> + return DIV_ROUND_UP(pri_val * 64, horiz_pixels * bytes_per_pixel) + 2;
> +}
> +
> struct hsw_pipe_wm_parameters {
> bool active;
> bool sprite_enabled;
> @@ -2142,11 +2148,28 @@ struct hsw_pipe_wm_parameters {
> uint32_t pixel_rate;
> };
>
> +struct hsw_wm_maximums {
> + uint16_t pri;
> + uint16_t spr;
> + uint16_t cur;
> + uint16_t fbc;
> +};
> +
> +struct hsw_lp_wm_result {
> + bool enable;
> + bool fbc_enable;
> + uint32_t pri_val;
> + uint32_t spr_val;
> + uint32_t cur_val;
> + uint32_t fbc_val;
> +};
> +
> struct hsw_wm_values {
> uint32_t wm_pipe[3];
> uint32_t wm_lp[3];
> uint32_t wm_lp_spr[3];
> uint32_t wm_linetime[3];
> + bool enable_fbc_wm;
> };
>
> enum hsw_data_buf_partitioning {
> @@ -2154,17 +2177,31 @@ enum hsw_data_buf_partitioning {
> HSW_DATA_BUF_PART_5_6,
> };
>
> -/* Only for WM_PIPE. */
> -static uint32_t hsw_compute_pri_wm_pipe(struct hsw_pipe_wm_parameters *params,
> - uint32_t mem_value)
> +/* For both WM_PIPE and WM_LP. */
> +static uint32_t hsw_compute_pri_wm(struct hsw_pipe_wm_parameters *params,
> + uint32_t mem_value,
> + bool is_lp)
> {
> + uint32_t method1, method2;
> +
> /* TODO: for now, assume the primary plane is always enabled. */
> if (!params->active)
> return 0;
>
> - return hsw_wm_method1(params->pixel_rate,
> - params->pri_bytes_per_pixel,
> - mem_value);
> + method1 = hsw_wm_method1(params->pixel_rate,
> + params->pri_bytes_per_pixel,
> + mem_value);
> +
> + if (!is_lp)
> + return method1;
> +
> + method2 = hsw_wm_method2(params->pixel_rate,
> + params->pipe_htotal,
> + params->pri_horiz_pixels,
> + params->pri_bytes_per_pixel,
> + mem_value);
> +
> + return min(method1, method2);
> }
>
> /* For both WM_PIPE and WM_LP. */
> @@ -2201,13 +2238,60 @@ static uint32_t hsw_compute_cur_wm(struct hsw_pipe_wm_parameters *params,
> mem_value);
> }
>
> +/* Only for WM_LP. */
> +static uint32_t hsw_compute_fbc_wm(struct hsw_pipe_wm_parameters *params,
> + uint32_t pri_val,
> + uint32_t mem_value)
> +{
> + if (!params->active)
> + return 0;
> +
> + return hsw_wm_fbc(pri_val,
> + params->pri_horiz_pixels,
> + params->pri_bytes_per_pixel);
> +}
> +
> +static bool hsw_compute_lp_wm(uint32_t mem_value, struct hsw_wm_maximums *max,
> + struct hsw_pipe_wm_parameters *params,
> + struct hsw_lp_wm_result *result)
> +{
> + enum pipe pipe;
> + uint32_t pri_val[3], spr_val[3], cur_val[3], fbc_val[3];
> +
> + for (pipe = PIPE_A; pipe <= PIPE_C; pipe++) {
> + struct hsw_pipe_wm_parameters *p = &params[pipe];
> +
> + pri_val[pipe] = hsw_compute_pri_wm(p, mem_value, true);
> + spr_val[pipe] = hsw_compute_spr_wm(p, mem_value);
> + cur_val[pipe] = hsw_compute_cur_wm(p, mem_value);
> + fbc_val[pipe] = hsw_compute_fbc_wm(p, pri_val[pipe], mem_value);
> + }
> +
> + result->pri_val = max3(pri_val[0], pri_val[1], pri_val[2]);
> + result->spr_val = max3(spr_val[0], spr_val[1], spr_val[2]);
> + result->cur_val = max3(cur_val[0], cur_val[1], cur_val[2]);
> + result->fbc_val = max3(fbc_val[0], fbc_val[1], fbc_val[2]);
> +
> + if (result->fbc_val > max->fbc) {
> + result->fbc_enable = false;
> + result->fbc_val = 0;
> + } else {
> + result->fbc_enable = true;
> + }
> +
> + result->enable = result->pri_val <= max->pri &&
> + result->spr_val <= max->spr &&
> + result->cur_val <= max->cur;
> + return result->enable;
> +}
> +
> static uint32_t hsw_compute_wm_pipe(struct drm_i915_private *dev_priv,
> uint32_t mem_value, enum pipe pipe,
> struct hsw_pipe_wm_parameters *params)
> {
> uint32_t pri_val, cur_val, spr_val;
>
> - pri_val = hsw_compute_pri_wm_pipe(params, mem_value);
> + pri_val = hsw_compute_pri_wm(params, mem_value, false);
> spr_val = hsw_compute_spr_wm(params, mem_value);
> cur_val = hsw_compute_cur_wm(params, mem_value);
>
> @@ -2250,13 +2334,15 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
>
> static void hsw_compute_wm_parameters(struct drm_device *dev,
> struct hsw_pipe_wm_parameters *params,
> - uint32_t *wm)
> + uint32_t *wm,
> + struct hsw_wm_maximums *lp_max_1_2)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_crtc *crtc;
> struct drm_plane *plane;
> uint64_t sskpd = I915_READ64(MCH_SSKPD);
> enum pipe pipe;
> + int pipes_active = 0, sprites_enabled = 0;
>
> if ((sskpd >> 56) & 0xFF)
> wm[0] = (sskpd >> 56) & 0xFF;
> @@ -2278,6 +2364,8 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
> if (!p->active)
> continue;
>
> + pipes_active++;
> +
> p->pipe_htotal = intel_crtc->config.adjusted_mode.htotal;
> p->pixel_rate = hsw_wm_get_pixel_rate(dev, crtc);
> p->pri_bytes_per_pixel = crtc->fb->bits_per_pixel / 8;
> @@ -2297,25 +2385,66 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
> p->sprite_enabled = intel_plane->wm.enable;
> p->spr_bytes_per_pixel = intel_plane->wm.bytes_per_pixel;
> p->spr_horiz_pixels = intel_plane->wm.horiz_pixels;
> +
> + if (p->sprite_enabled)
> + sprites_enabled++;
> + }
> +
> + if (pipes_active > 1) {
> + lp_max_1_2->pri = sprites_enabled ? 128 : 256;
> + lp_max_1_2->spr = 128;
> + lp_max_1_2->cur = 64;
> + } else {
> + lp_max_1_2->pri = sprites_enabled ? 384 : 768;
> + lp_max_1_2->spr = 384;
> + lp_max_1_2->cur = 255;
> }
> + lp_max_1_2->fbc = 15;
> }
>
> static void hsw_compute_wm_results(struct drm_device *dev,
> struct hsw_pipe_wm_parameters *params,
> uint32_t *wm,
> + struct hsw_wm_maximums *lp_maximums,
> struct hsw_wm_values *results)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_crtc *crtc;
> + struct hsw_lp_wm_result lp_results[4] = {};
> enum pipe pipe;
> + int level, max_level, wm_lp;
> +
> + for (level = 1; level <= 4; level++)
> + if (!hsw_compute_lp_wm(wm[level], lp_maximums, params,
> + &lp_results[level - 1]))
> + break;
> + max_level = level - 1;
> +
> + /* The spec says it is preferred to disable FBC WMs instead of disabling
> + * a WM level. */
> + results->enable_fbc_wm = true;
> + for (level = 1; level <= max_level; level++) {
> + if (!lp_results[level - 1].fbc_enable) {
> + results->enable_fbc_wm = false;
> + break;
> + }
> + }
> +
> + memset(results, 0, sizeof(*results));
> + for (wm_lp = 1; wm_lp <= 3; wm_lp++) {
> + const struct hsw_lp_wm_result *r;
>
> - /* No support for LP WMs yet. */
> - results->wm_lp[2] = 0;
> - results->wm_lp[1] = 0;
> - results->wm_lp[0] = 0;
> - results->wm_lp_spr[2] = 0;
> - results->wm_lp_spr[1] = 0;
> - results->wm_lp_spr[0] = 0;
> + level = (max_level == 4 && wm_lp > 1) ? wm_lp + 1 : wm_lp;
> + if (level > max_level)
> + break;
> +
> + r = &lp_results[level - 1];
> + results->wm_lp[wm_lp - 1] = HSW_WM_LP_VAL(level * 2,
> + r->fbc_val,
> + r->pri_val,
> + r->cur_val);
> + results->wm_lp_spr[wm_lp - 1] = r->spr_val;
> + }
>
> for_each_pipe(pipe)
> results->wm_pipe[pipe] = hsw_compute_wm_pipe(dev_priv, wm[0],
> @@ -2339,6 +2468,7 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> struct hsw_wm_values previous;
> uint32_t val;
> enum hsw_data_buf_partitioning prev_partitioning;
> + bool prev_enable_fbc_wm;
>
> previous.wm_pipe[0] = I915_READ(WM0_PIPEA_ILK);
> previous.wm_pipe[1] = I915_READ(WM0_PIPEB_ILK);
> @@ -2356,6 +2486,8 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> prev_partitioning = (I915_READ(WM_MISC) & WM_MISC_DATA_PARTITION_5_6) ?
> HSW_DATA_BUF_PART_5_6 : HSW_DATA_BUF_PART_1_2;
>
> + prev_enable_fbc_wm = !(I915_READ(DISP_ARB_CTL) & DISP_FBC_WM_DIS);
> +
> if (memcmp(results->wm_pipe, previous.wm_pipe,
> sizeof(results->wm_pipe)) == 0 &&
> memcmp(results->wm_lp, previous.wm_lp,
> @@ -2364,7 +2496,8 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> sizeof(results->wm_lp_spr)) == 0 &&
> memcmp(results->wm_linetime, previous.wm_linetime,
> sizeof(results->wm_linetime)) == 0 &&
> - partitioning == prev_partitioning)
> + partitioning == prev_partitioning &&
> + results->enable_fbc_wm == prev_enable_fbc_wm)
> return;
>
> if (previous.wm_lp[2] != 0)
> @@ -2397,6 +2530,15 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> I915_WRITE(WM_MISC, val);
> }
>
> + if (prev_enable_fbc_wm != results->enable_fbc_wm) {
> + val = I915_READ(DISP_ARB_CTL);
> + if (results->enable_fbc_wm)
> + val &= ~DISP_FBC_WM_DIS;
> + else
> + val |= DISP_FBC_WM_DIS;
> + I915_WRITE(DISP_ARB_CTL, val);
> + }
> +
> if (previous.wm_lp_spr[0] != results->wm_lp_spr[0])
> I915_WRITE(WM1S_LP_ILK, results->wm_lp_spr[0]);
> if (previous.wm_lp_spr[1] != results->wm_lp_spr[1])
> @@ -2415,12 +2557,13 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> static void haswell_update_wm(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> + struct hsw_wm_maximums lp_max_1_2;
> struct hsw_pipe_wm_parameters params[3];
> struct hsw_wm_values results;
> uint32_t wm[5];
>
> - hsw_compute_wm_parameters(dev, params, wm);
> - hsw_compute_wm_results(dev, params, wm, &results);
> + hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
> + hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
> hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
> }
>
> --
> 1.8.1.2
>
--
Ville Syrjälä
Intel OTC
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 3/3] drm/i915: add support for 5/6 data buffer partitioning on Haswell
2013-05-31 13:44 ` Ville Syrjälä
@ 2013-05-31 15:19 ` Daniel Vetter
0 siblings, 0 replies; 29+ messages in thread
From: Daniel Vetter @ 2013-05-31 15:19 UTC (permalink / raw)
To: Ville Syrjälä; +Cc: intel-gfx, Paulo Zanoni
On Fri, May 31, 2013 at 04:44:28PM +0300, Ville Syrjälä wrote:
> On Fri, May 31, 2013 at 10:19:21AM -0300, Paulo Zanoni wrote:
> > From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >
> > Now we compute the results for both 1/2 and 5/6 partitioning and then
> > use hsw_find_best_result to choose which one to use.
> >
> > With this patch, Haswell watermarks support should be in good shape.
> > The only improvement we're missing is the case where the primary plane
> > is disabled: we always assume it's enabled, so we take it into
> > consideration when calculating the watermarks.
> >
> > v2: - Check the latency when finding the best result
> >
> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > ---
> > drivers/gpu/drm/i915/intel_pm.c | 64 ++++++++++++++++++++++++++++++++++-------
> > 1 file changed, 53 insertions(+), 11 deletions(-)
> >
> >
> > I was going to implement Ville's review, but then I realized we don't check
> > whether we're using level 4 or level 3, so now instead of assigning "i" we
> > assign the latency, which reflects which level we're using.
>
> Makes sense. For pre-HSW I guess the same code would still work. It would just
> have the same latency value for both 1/2 and 5/6, so the code would end
> up working the same way as v1 did, which is still OK.
>
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
3 patches picked up for dinq (I hope the right ones, this thread is
massive, please check). Thanks for the patches and review.
-Daniel
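
The selection rule discussed above (compare the latency of the deepest enabled LP watermark, then the FBC watermark flag, then prefer the first candidate) can be sketched as a standalone function. This is an illustration under assumptions: the SKETCH_* bit layout (enable in bit 31, a 7-bit latency field at bit 24) is assumed for the example and should be checked against i915_reg.h, the struct is trimmed to the two fields the comparison reads, and sketch_lp is a hypothetical helper for building test values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed register layout, for illustration only. */
#define SKETCH_LP_EN            (1u << 31)
#define SKETCH_LP_LATENCY_SHIFT 24
#define SKETCH_LP_LATENCY_MASK  (0x7fu << SKETCH_LP_LATENCY_SHIFT)

struct sketch_wm_values {
	uint32_t wm_lp[3];
	bool enable_fbc_wm;
};

/* Build an enabled LP value carrying only a latency field. */
uint32_t sketch_lp(uint32_t latency)
{
	return SKETCH_LP_EN | (latency << SKETCH_LP_LATENCY_SHIFT);
}

/* Latency field of the last enabled slot, or 0 if none is enabled. */
uint32_t sketch_deepest_latency(const struct sketch_wm_values *r)
{
	uint32_t val = 0;
	int i;

	for (i = 0; i < 3; i++)
		if (r->wm_lp[i] & SKETCH_LP_EN)
			val = r->wm_lp[i] & SKETCH_LP_LATENCY_MASK;
	return val;
}

/* Higher latency wins; on a tie FBC WM wins; otherwise keep r1. */
const struct sketch_wm_values *
sketch_find_best(const struct sketch_wm_values *r1,
		 const struct sketch_wm_values *r2)
{
	uint32_t val_r1 = sketch_deepest_latency(r1);
	uint32_t val_r2 = sketch_deepest_latency(r2);

	if (val_r1 == val_r2)
		return (r2->enable_fbc_wm && !r1->enable_fbc_wm) ? r2 : r1;

	return (val_r1 > val_r2) ? r1 : r2;
}
```

Comparing the latency field rather than the slot index matters because slot 3 may hold either level 3 or level 4, so the same slot being enabled in both results does not mean they reach the same level.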
>
> >
> >
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 3ff9ff3..a6eae70 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -2335,7 +2335,8 @@ hsw_compute_linetime_wm(struct drm_device *dev, struct drm_crtc *crtc)
> > static void hsw_compute_wm_parameters(struct drm_device *dev,
> > struct hsw_pipe_wm_parameters *params,
> > uint32_t *wm,
> > - struct hsw_wm_maximums *lp_max_1_2)
> > + struct hsw_wm_maximums *lp_max_1_2,
> > + struct hsw_wm_maximums *lp_max_5_6)
> > {
> > struct drm_i915_private *dev_priv = dev->dev_private;
> > struct drm_crtc *crtc;
> > @@ -2391,15 +2392,17 @@ static void hsw_compute_wm_parameters(struct drm_device *dev,
> > }
> >
> > if (pipes_active > 1) {
> > - lp_max_1_2->pri = sprites_enabled ? 128 : 256;
> > - lp_max_1_2->spr = 128;
> > - lp_max_1_2->cur = 64;
> > + lp_max_1_2->pri = lp_max_5_6->pri = sprites_enabled ? 128 : 256;
> > + lp_max_1_2->spr = lp_max_5_6->spr = 128;
> > + lp_max_1_2->cur = lp_max_5_6->cur = 64;
> > } else {
> > lp_max_1_2->pri = sprites_enabled ? 384 : 768;
> > + lp_max_5_6->pri = sprites_enabled ? 128 : 768;
> > lp_max_1_2->spr = 384;
> > - lp_max_1_2->cur = 255;
> > + lp_max_5_6->spr = 640;
> > + lp_max_1_2->cur = lp_max_5_6->cur = 255;
> > }
> > - lp_max_1_2->fbc = 15;
> > + lp_max_1_2->fbc = lp_max_5_6->fbc = 15;
> > }
> >
> > static void hsw_compute_wm_results(struct drm_device *dev,
> > @@ -2458,6 +2461,32 @@ static void hsw_compute_wm_results(struct drm_device *dev,
> > }
> > }
> >
> > +/* Find the result with the highest level enabled. Check for enable_fbc_wm in
> > + * case both are at the same level. Prefer r1 in case they're the same. */
> > +struct hsw_wm_values *hsw_find_best_result(struct hsw_wm_values *r1,
> > + struct hsw_wm_values *r2)
> > +{
> > + int i, val_r1 = 0, val_r2 = 0;
> > +
> > + for (i = 0; i < 3; i++) {
> > + if (r1->wm_lp[i] & WM3_LP_EN)
> > + val_r1 = r1->wm_lp[i] & WM1_LP_LATENCY_MASK;
> > + if (r2->wm_lp[i] & WM3_LP_EN)
> > + val_r2 = r2->wm_lp[i] & WM1_LP_LATENCY_MASK;
> > + }
> > +
> > + if (val_r1 == val_r2) {
> > + if (r2->enable_fbc_wm && !r1->enable_fbc_wm)
> > + return r2;
> > + else
> > + return r1;
> > + } else if (val_r1 > val_r2) {
> > + return r1;
> > + } else {
> > + return r2;
> > + }
> > +}
> > +
> > /*
> > * The spec says we shouldn't write when we don't need, because every write
> > * causes WMs to be re-evaluated, expending some power.
> > @@ -2558,14 +2587,27 @@ static void hsw_write_wm_values(struct drm_i915_private *dev_priv,
> > static void haswell_update_wm(struct drm_device *dev)
> > {
> > struct drm_i915_private *dev_priv = dev->dev_private;
> > - struct hsw_wm_maximums lp_max_1_2;
> > + struct hsw_wm_maximums lp_max_1_2, lp_max_5_6;
> > struct hsw_pipe_wm_parameters params[3];
> > - struct hsw_wm_values results;
> > + struct hsw_wm_values results_1_2, results_5_6, *best_results;
> > uint32_t wm[5];
> > + enum hsw_data_buf_partitioning partitioning;
> > +
> > + hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2, &lp_max_5_6);
> > +
> > + hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results_1_2);
> > + if (lp_max_1_2.pri != lp_max_5_6.pri) {
> > + hsw_compute_wm_results(dev, params, wm, &lp_max_5_6,
> > + &results_5_6);
> > + best_results = hsw_find_best_result(&results_1_2, &results_5_6);
> > + } else {
> > + best_results = &results_1_2;
> > + }
> > +
> > + partitioning = (best_results == &results_1_2) ?
> > + HSW_DATA_BUF_PART_1_2 : HSW_DATA_BUF_PART_5_6;
> >
> > - hsw_compute_wm_parameters(dev, params, wm, &lp_max_1_2);
> > - hsw_compute_wm_results(dev, params, wm, &lp_max_1_2, &results);
> > - hsw_write_wm_values(dev_priv, &results, HSW_DATA_BUF_PART_1_2);
> > + hsw_write_wm_values(dev_priv, best_results, partitioning);
> > }
> >
> > static void haswell_update_sprite_wm(struct drm_device *dev, int pipe,
> > --
> > 1.8.1.2
> >
>
> --
> Ville Syrjälä
> Intel OTC
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
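
The selection rule described in the comment above hsw_find_best_result() in the quoted patch (prefer the result whose highest enabled LP level has the larger latency, break ties on enable_fbc_wm, otherwise keep r1) can be sketched as standalone C. This is only an illustration under stated assumptions: the sketch_* names, the struct layout, and the bit positions of the enable flag and latency field are hypothetical stand-ins, not the driver's definitions from i915_reg.h.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's register bits; the real
 * definitions live in i915_reg.h and are not reproduced here. */
#define SKETCH_LP_EN            (1u << 31)
#define SKETCH_LP_LATENCY_SHIFT 24
#define SKETCH_LP_LATENCY_MASK  (0x7fu << SKETCH_LP_LATENCY_SHIFT)

/* Hypothetical, reduced version of a watermark result set. */
struct sketch_wm_values {
	uint32_t wm_lp[3];   /* one register value per LP level */
	bool enable_fbc_wm;
};

/* Build an enabled LP register value carrying the given latency. */
static uint32_t sketch_lp(uint32_t latency)
{
	return SKETCH_LP_EN |
	       ((latency << SKETCH_LP_LATENCY_SHIFT) & SKETCH_LP_LATENCY_MASK);
}

/* Latency field of the highest-indexed enabled level, or 0 if none. */
static uint32_t sketch_top_level_latency(const struct sketch_wm_values *r)
{
	uint32_t val = 0;

	for (size_t i = 0; i < 3; i++)
		if (r->wm_lp[i] & SKETCH_LP_EN)
			val = r->wm_lp[i] & SKETCH_LP_LATENCY_MASK;
	return val;
}

/* Pick the better of two results; r1 wins when they are equivalent. */
static const struct sketch_wm_values *
sketch_find_best_result(const struct sketch_wm_values *r1,
			const struct sketch_wm_values *r2)
{
	uint32_t v1 = sketch_top_level_latency(r1);
	uint32_t v2 = sketch_top_level_latency(r2);

	if (v1 != v2)
		return v1 > v2 ? r1 : r2;
	if (r2->enable_fbc_wm && !r1->enable_fbc_wm)
		return r2;
	return r1;
}
```

In this sketch the first argument plays the role of the 1/2 partitioning result and the second the 5/6 result, so a 5/6 result is chosen only when it reaches a deeper level or, at the same level, is the only one that keeps the FBC watermark enabled.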
end of thread, other threads:[~2013-05-31 15:19 UTC | newest]
Thread overview: 29+ messages
2013-05-24 14:59 [PATCH 0/5] Haswell watermarks Paulo Zanoni
2013-05-24 14:59 ` [PATCH 1/5] drm/i915: add "enable" argument to intel_update_sprite_watermarks Paulo Zanoni
2013-05-24 16:22 ` Ville Syrjälä
2013-05-24 14:59 ` [PATCH 2/5] drm/i915: add haswell_update_sprite_wm Paulo Zanoni
2013-05-24 17:00 ` Ville Syrjälä
2013-05-24 19:35 ` Daniel Vetter
2013-05-24 14:59 ` [PATCH 3/5] drm/i915: properly set HSW WM_PIPE registers Paulo Zanoni
2013-05-24 16:07 ` Ville Syrjälä
2013-05-24 22:00 ` Paulo Zanoni
2013-05-24 22:02 ` Paulo Zanoni
2013-05-27 11:07 ` Ville Syrjälä
2013-05-27 19:21 ` Paulo Zanoni
2013-05-29 15:39 ` Ville Syrjälä
2013-05-31 13:08 ` [PATCH 1/3] " Paulo Zanoni
2013-05-31 15:03 ` Ville Syrjälä
2013-05-24 14:59 ` [PATCH 4/5] drm/i915: properly set HSW WM_LP watermarks Paulo Zanoni
2013-05-24 16:11 ` Ville Syrjälä
2013-05-24 22:05 ` Paulo Zanoni
2013-05-29 16:06 ` Ville Syrjälä
2013-05-29 16:24 ` Ville Syrjälä
2013-05-31 13:12 ` [PATCH 2/3] " Paulo Zanoni
2013-05-31 13:58 ` Ville Syrjälä
2013-05-31 14:45 ` Paulo Zanoni
2013-05-31 15:05 ` Ville Syrjälä
2013-05-24 14:59 ` [PATCH 5/5] drm/i915: add support for 5/6 data buffer partitioning on Haswell Paulo Zanoni
2013-05-29 16:17 ` Ville Syrjälä
2013-05-31 13:19 ` [PATCH 3/3] " Paulo Zanoni
2013-05-31 13:44 ` Ville Syrjälä
2013-05-31 15:19 ` Daniel Vetter