[PATCH 0/9] Fixed16.16 wrapper cleanup & wm optimization

public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed

* [PATCH 0/9] Fixed16.16 wrapper cleanup & wm optimization
@ 2017-06-13 13:11 Mahesh Kumar
  2017-06-13 13:11 ` [PATCH 1/9] drm/i915: take-out common clamping code of fixed16 wrappers Mahesh Kumar
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Mahesh Kumar @ 2017-06-13 13:11 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This series Include patches for:
	clean fixed16.16 naming & make them consistent
	reuse intel_compute_linetime_wm in hsw
	optimize wm calculation code
	enable/Implement trans wm calculation

Changes Since V1:
 - Split fixed16 cleanup code in more logical patches (Maarten)
 - make intel_compute_linetime_wm function static (Maarten)

Mahesh Kumar (9):
  drm/i915: take-out common clamping code of fixed16 wrappers
  drm/i915: Always perform internal fixed16 division in 64 bits
  drm/i915: cleanup fixed-point wrappers naming
  drm/i915: Addition wrapper for fixed16.16 operation
  drm/i915/hsw: use intel_compute_linetime_wm function for linetime wm
  drm/i915/skl+: WM calculation don't require height
  drm/i915/skl+: unify cpp value in WM calculation
  drm/i915/skl+: Optimize WM calculation
  drm/i915/gen10: Calculate and enable transition WM

 drivers/gpu/drm/i915/i915_drv.h |  83 ++++++----
 drivers/gpu/drm/i915/intel_pm.c | 325 ++++++++++++++++++++++++----------------
 2 files changed, 246 insertions(+), 162 deletions(-)

-- 
2.13.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/9] drm/i915: take-out common clamping code of fixed16 wrappers
  2017-06-13 13:11 [PATCH 0/9] Fixed16.16 wrapper cleanup & wm optimization Mahesh Kumar
@ 2017-06-13 13:11 ` Mahesh Kumar
  2017-06-13 13:11 ` [PATCH 2/9] drm/i915: Always perform internal fixed16 division in 64 bits Mahesh Kumar
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Mahesh Kumar @ 2017-06-13 13:11 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch creates a new function for clamping u64 to fixed16.
And make use of this function in other fixed16 wrappers.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 28 ++++++++++++----------------
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 602fb3324484..33b162dbee4e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -160,6 +160,14 @@ static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
 	return max;
 }
 
+static inline uint_fixed_16_16_t clamp_u64_to_fixed16(uint64_t val)
+{
+	uint_fixed_16_16_t fp;
+	WARN_ON(val >> 32);
+	fp.val = clamp_t(uint32_t, val, 0, ~0);
+	return fp;
+}
+
 static inline uint32_t div_round_up_fixed16(uint_fixed_16_16_t val,
 					    uint_fixed_16_16_t d)
 {
@@ -170,26 +178,21 @@ static inline uint32_t mul_round_up_u32_fixed16(uint32_t val,
 						uint_fixed_16_16_t mul)
 {
 	uint64_t intermediate_val;
-	uint32_t result;
 
 	intermediate_val = (uint64_t) val * mul.val;
 	intermediate_val = DIV_ROUND_UP_ULL(intermediate_val, 1 << 16);
 	WARN_ON(intermediate_val >> 32);
-	result = clamp_t(uint32_t, intermediate_val, 0, ~0);
-	return result;
+	return clamp_t(uint32_t, intermediate_val, 0, ~0);
 }
 
 static inline uint_fixed_16_16_t mul_fixed16(uint_fixed_16_16_t val,
 					     uint_fixed_16_16_t mul)
 {
 	uint64_t intermediate_val;
-	uint_fixed_16_16_t fp;
 
 	intermediate_val = (uint64_t) val.val * mul.val;
 	intermediate_val = intermediate_val >> 16;
-	WARN_ON(intermediate_val >> 32);
-	fp.val = clamp_t(uint32_t, intermediate_val, 0, ~0);
-	return fp;
+	return clamp_u64_to_fixed16(intermediate_val);
 }
 
 static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
@@ -203,15 +206,11 @@ static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
 
 static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val, uint32_t d)
 {
-	uint_fixed_16_16_t res;
 	uint64_t interm_val;
 
 	interm_val = (uint64_t)val << 16;
 	interm_val = DIV_ROUND_UP_ULL(interm_val, d);
-	WARN_ON(interm_val >> 32);
-	res.val = (uint32_t) interm_val;
-
-	return res;
+	return clamp_u64_to_fixed16(interm_val);
 }
 
 static inline uint32_t div_round_up_u32_fixed16(uint32_t val,
@@ -229,12 +228,9 @@ static inline uint_fixed_16_16_t mul_u32_fixed_16_16(uint32_t val,
 						     uint_fixed_16_16_t mul)
 {
 	uint64_t intermediate_val;
-	uint_fixed_16_16_t fp;
 
 	intermediate_val = (uint64_t) val * mul.val;
-	WARN_ON(intermediate_val >> 32);
-	fp.val = (uint32_t) intermediate_val;
-	return fp;
+	return clamp_u64_to_fixed16(intermediate_val);
 }
 
 static inline const char *yesno(bool v)
-- 
2.13.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/9] drm/i915: Always perform internal fixed16 division in 64 bits
  2017-06-13 13:11 [PATCH 0/9] Fixed16.16 wrapper cleanup & wm optimization Mahesh Kumar
  2017-06-13 13:11 ` [PATCH 1/9] drm/i915: take-out common clamping code of fixed16 wrappers Mahesh Kumar
@ 2017-06-13 13:11 ` Mahesh Kumar
  2017-06-13 13:11 ` [PATCH 3/9] drm/i915: cleanup fixed-point wrappers naming Mahesh Kumar
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Mahesh Kumar @ 2017-06-13 13:11 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch combines fixed_16_16_div & fixed_16_16_div_u64 wrappers.
And new fixed_16_16_div wrapper always performs division operation in
u64 internally, to avoid any data loss which was happening in earlier
version of wrapper.
earlier wrapper was converting u32 to fixed16 in 32 bit so we were
loosing 16-MSB data.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 9 ---------
 drivers/gpu/drm/i915/intel_pm.c | 4 ++--
 2 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 33b162dbee4e..21e77b0e5ea2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -197,15 +197,6 @@ static inline uint_fixed_16_16_t mul_fixed16(uint_fixed_16_16_t val,
 
 static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
 {
-	uint_fixed_16_16_t fp, res;
-
-	fp = u32_to_fixed_16_16(val);
-	res.val = DIV_ROUND_UP(fp.val, d);
-	return res;
-}
-
-static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val, uint32_t d)
-{
 	uint64_t interm_val;
 
 	interm_val = (uint64_t)val << 16;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0aed13dcedf0..0483e9110cf6 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4359,7 +4359,7 @@ static uint_fixed_16_16_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp,
 		return FP_16_16_MAX;
 
 	wm_intermediate_val = latency * pixel_rate * cpp;
-	ret = fixed_16_16_div_u64(wm_intermediate_val, 1000 * 512);
+	ret = fixed_16_16_div(wm_intermediate_val, 1000 * 512);
 	return ret;
 }
 
@@ -4397,7 +4397,7 @@ intel_get_linetime_us(struct intel_crtc_state *cstate)
 		return u32_to_fixed_16_16(0);
 
 	crtc_htotal = cstate->base.adjusted_mode.crtc_htotal;
-	linetime_us = fixed_16_16_div_u64(crtc_htotal * 1000, pixel_rate);
+	linetime_us = fixed_16_16_div(crtc_htotal * 1000, pixel_rate);
 
 	return linetime_us;
 }
-- 
2.13.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/9] drm/i915: cleanup fixed-point wrappers naming
  2017-06-13 13:11 [PATCH 0/9] Fixed16.16 wrapper cleanup & wm optimization Mahesh Kumar
  2017-06-13 13:11 ` [PATCH 1/9] drm/i915: take-out common clamping code of fixed16 wrappers Mahesh Kumar
  2017-06-13 13:11 ` [PATCH 2/9] drm/i915: Always perform internal fixed16 division in 64 bits Mahesh Kumar
@ 2017-06-13 13:11 ` Mahesh Kumar
  2017-06-13 13:11 ` [PATCH 4/9] drm/i915: Addition wrapper for fixed16.16 operation Mahesh Kumar
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Mahesh Kumar @ 2017-06-13 13:11 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch make naming of fixed-point wrappers consistent
operation_<any_post_operation>_<1st operand>_<2nd operand>
also shorten the name for fixed_16_16 to fixed16

s/u32_to_fixed_16_16/u32_to_fixed16
s/fixed_16_16_to_u32/fixed16_to_u32
s/fixed_16_16_to_u32_round_up/fixed16_to_u32_round_up
s/min_fixed_16_16/min_fixed16
s/max_fixed_16_16/max_fixed16
s/mul_u32_fixed_16_16/mul_u32_fixed16
s/fixed_16_16_div/div_fixed16

Changes Since V1:
 - Split the patch in more logical patches (Maarten)

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 14 +++++-----
 drivers/gpu/drm/i915/intel_pm.c | 59 ++++++++++++++++++++---------------------
 2 files changed, 36 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 21e77b0e5ea2..d0698c290cba 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -122,7 +122,7 @@ static inline bool is_fixed16_zero(uint_fixed_16_16_t val)
 	return false;
 }
 
-static inline uint_fixed_16_16_t u32_to_fixed_16_16(uint32_t val)
+static inline uint_fixed_16_16_t u32_to_fixed16(uint32_t val)
 {
 	uint_fixed_16_16_t fp;
 
@@ -132,17 +132,17 @@ static inline uint_fixed_16_16_t u32_to_fixed_16_16(uint32_t val)
 	return fp;
 }
 
-static inline uint32_t fixed_16_16_to_u32_round_up(uint_fixed_16_16_t fp)
+static inline uint32_t fixed16_to_u32_round_up(uint_fixed_16_16_t fp)
 {
 	return DIV_ROUND_UP(fp.val, 1 << 16);
 }
 
-static inline uint32_t fixed_16_16_to_u32(uint_fixed_16_16_t fp)
+static inline uint32_t fixed16_to_u32(uint_fixed_16_16_t fp)
 {
 	return fp.val >> 16;
 }
 
-static inline uint_fixed_16_16_t min_fixed_16_16(uint_fixed_16_16_t min1,
+static inline uint_fixed_16_16_t min_fixed16(uint_fixed_16_16_t min1,
 						 uint_fixed_16_16_t min2)
 {
 	uint_fixed_16_16_t min;
@@ -151,7 +151,7 @@ static inline uint_fixed_16_16_t min_fixed_16_16(uint_fixed_16_16_t min1,
 	return min;
 }
 
-static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
+static inline uint_fixed_16_16_t max_fixed16(uint_fixed_16_16_t max1,
 						 uint_fixed_16_16_t max2)
 {
 	uint_fixed_16_16_t max;
@@ -195,7 +195,7 @@ static inline uint_fixed_16_16_t mul_fixed16(uint_fixed_16_16_t val,
 	return clamp_u64_to_fixed16(intermediate_val);
 }
 
-static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
+static inline uint_fixed_16_16_t div_fixed16(uint32_t val, uint32_t d)
 {
 	uint64_t interm_val;
 
@@ -215,7 +215,7 @@ static inline uint32_t div_round_up_u32_fixed16(uint32_t val,
 	return clamp_t(uint32_t, interm_val, 0, ~0);
 }
 
-static inline uint_fixed_16_16_t mul_u32_fixed_16_16(uint32_t val,
+static inline uint_fixed_16_16_t mul_u32_fixed16(uint32_t val,
 						     uint_fixed_16_16_t mul)
 {
 	uint64_t intermediate_val;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0483e9110cf6..155f54a1f516 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3837,7 +3837,7 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 	uint_fixed_16_16_t downscale_h, downscale_w;
 
 	if (WARN_ON(!intel_wm_plane_visible(cstate, pstate)))
-		return u32_to_fixed_16_16(0);
+		return u32_to_fixed16(0);
 
 	/* n.b., src is 16.16 fixed point, dst is whole integer */
 	if (plane->id == PLANE_CURSOR) {
@@ -3861,10 +3861,10 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 		dst_h = drm_rect_height(&pstate->base.dst);
 	}
 
-	fp_w_ratio = fixed_16_16_div(src_w, dst_w);
-	fp_h_ratio = fixed_16_16_div(src_h, dst_h);
-	downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
-	downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
+	fp_w_ratio = div_fixed16(src_w, dst_w);
+	fp_h_ratio = div_fixed16(src_h, dst_h);
+	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
+	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
 
 	return mul_fixed16(downscale_w, downscale_h);
 }
@@ -3872,7 +3872,7 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 static uint_fixed_16_16_t
 skl_pipe_downscale_amount(const struct intel_crtc_state *crtc_state)
 {
-	uint_fixed_16_16_t pipe_downscale = u32_to_fixed_16_16(1);
+	uint_fixed_16_16_t pipe_downscale = u32_to_fixed16(1);
 
 	if (!crtc_state->base.enable)
 		return pipe_downscale;
@@ -3891,10 +3891,10 @@ skl_pipe_downscale_amount(const struct intel_crtc_state *crtc_state)
 		if (!dst_w || !dst_h)
 			return pipe_downscale;
 
-		fp_w_ratio = fixed_16_16_div(src_w, dst_w);
-		fp_h_ratio = fixed_16_16_div(src_h, dst_h);
-		downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
-		downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
+		fp_w_ratio = div_fixed16(src_w, dst_w);
+		fp_h_ratio = div_fixed16(src_h, dst_h);
+		downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
+		downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
 
 		pipe_downscale = mul_fixed16(downscale_w, downscale_h);
 	}
@@ -3913,14 +3913,14 @@ int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
 	int crtc_clock, dotclk;
 	uint32_t pipe_max_pixel_rate;
 	uint_fixed_16_16_t pipe_downscale;
-	uint_fixed_16_16_t max_downscale = u32_to_fixed_16_16(1);
+	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
 
 	if (!cstate->base.enable)
 		return 0;
 
 	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
 		uint_fixed_16_16_t plane_downscale;
-		uint_fixed_16_16_t fp_9_div_8 = fixed_16_16_div(9, 8);
+		uint_fixed_16_16_t fp_9_div_8 = div_fixed16(9, 8);
 		int bpp;
 
 		if (!intel_wm_plane_visible(cstate,
@@ -3938,7 +3938,7 @@ int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
 			plane_downscale = mul_fixed16(plane_downscale,
 						      fp_9_div_8);
 
-		max_downscale = max_fixed_16_16(plane_downscale, max_downscale);
+		max_downscale = max_fixed16(plane_downscale, max_downscale);
 	}
 	pipe_downscale = skl_pipe_downscale_amount(cstate);
 
@@ -4359,7 +4359,7 @@ static uint_fixed_16_16_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp,
 		return FP_16_16_MAX;
 
 	wm_intermediate_val = latency * pixel_rate * cpp;
-	ret = fixed_16_16_div(wm_intermediate_val, 1000 * 512);
+	ret = div_fixed16(wm_intermediate_val, 1000 * 512);
 	return ret;
 }
 
@@ -4377,7 +4377,7 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
 	wm_intermediate_val = latency * pixel_rate;
 	wm_intermediate_val = DIV_ROUND_UP(wm_intermediate_val,
 					   pipe_htotal * 1000);
-	ret = mul_u32_fixed_16_16(wm_intermediate_val, plane_blocks_per_line);
+	ret = mul_u32_fixed16(wm_intermediate_val, plane_blocks_per_line);
 	return ret;
 }
 
@@ -4389,15 +4389,15 @@ intel_get_linetime_us(struct intel_crtc_state *cstate)
 	uint_fixed_16_16_t linetime_us;
 
 	if (!cstate->base.active)
-		return u32_to_fixed_16_16(0);
+		return u32_to_fixed16(0);
 
 	pixel_rate = cstate->pixel_rate;
 
 	if (WARN_ON(pixel_rate == 0))
-		return u32_to_fixed_16_16(0);
+		return u32_to_fixed16(0);
 
 	crtc_htotal = cstate->base.adjusted_mode.crtc_htotal;
-	linetime_us = fixed_16_16_div(crtc_htotal * 1000, pixel_rate);
+	linetime_us = div_fixed16(crtc_htotal * 1000, pixel_rate);
 
 	return linetime_us;
 }
@@ -4513,14 +4513,14 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	if (y_tiled) {
 		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line *
 					   y_min_scanlines, 512);
-		plane_blocks_per_line = fixed_16_16_div(interm_pbpl,
+		plane_blocks_per_line = div_fixed16(interm_pbpl,
 							y_min_scanlines);
 	} else if (x_tiled) {
 		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512);
-		plane_blocks_per_line = u32_to_fixed_16_16(interm_pbpl);
+		plane_blocks_per_line = u32_to_fixed16(interm_pbpl);
 	} else {
 		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512) + 1;
-		plane_blocks_per_line = u32_to_fixed_16_16(interm_pbpl);
+		plane_blocks_per_line = u32_to_fixed16(interm_pbpl);
 	}
 
 	method1 = skl_wm_method1(plane_pixel_rate, cpp, latency);
@@ -4529,32 +4529,32 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				 latency,
 				 plane_blocks_per_line);
 
-	y_tile_minimum = mul_u32_fixed_16_16(y_min_scanlines,
-					     plane_blocks_per_line);
+	y_tile_minimum = mul_u32_fixed16(y_min_scanlines,
+					 plane_blocks_per_line);
 
 	if (y_tiled) {
-		selected_result = max_fixed_16_16(method2, y_tile_minimum);
+		selected_result = max_fixed16(method2, y_tile_minimum);
 	} else {
 		uint32_t linetime_us;
 
-		linetime_us = fixed_16_16_to_u32_round_up(
+		linetime_us = fixed16_to_u32_round_up(
 				intel_get_linetime_us(cstate));
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
 		    (plane_bytes_per_line / 512 < 1))
 			selected_result = method2;
 		else if (latency >= linetime_us)
-			selected_result = min_fixed_16_16(method1, method2);
+			selected_result = min_fixed16(method1, method2);
 		else
 			selected_result = method1;
 	}
 
-	res_blocks = fixed_16_16_to_u32_round_up(selected_result) + 1;
+	res_blocks = fixed16_to_u32_round_up(selected_result) + 1;
 	res_lines = div_round_up_fixed16(selected_result,
 					 plane_blocks_per_line);
 
 	if (level >= 1 && level <= 7) {
 		if (y_tiled) {
-			res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
+			res_blocks += fixed16_to_u32_round_up(y_tile_minimum);
 			res_lines += y_min_scanlines;
 		} else {
 			res_blocks++;
@@ -4617,8 +4617,7 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 	if (is_fixed16_zero(linetime_us))
 		return 0;
 
-	linetime_wm = fixed_16_16_to_u32_round_up(mul_u32_fixed_16_16(8,
-				linetime_us));
+	linetime_wm = fixed16_to_u32_round_up(mul_u32_fixed16(8, linetime_us));
 
 	/* Display WA #1135: bxt. */
 	if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled)
-- 
2.13.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/9] drm/i915: Addition wrapper for fixed16.16 operation
  2017-06-13 13:11 [PATCH 0/9] Fixed16.16 wrapper cleanup & wm optimization Mahesh Kumar
                   ` (2 preceding siblings ...)
  2017-06-13 13:11 ` [PATCH 3/9] drm/i915: cleanup fixed-point wrappers naming Mahesh Kumar
@ 2017-06-13 13:11 ` Mahesh Kumar
  2017-06-13 13:11 ` [PATCH 5/9] drm/i915/hsw: use intel_compute_linetime_wm function for linetime wm Mahesh Kumar
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Mahesh Kumar @ 2017-06-13 13:11 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch introduce addition wrapper for fixed point 16.16 operations.
Which will be used by later patches to avoid direct member variables
access of fixed_16_16_t structure.

add_fixed16 : takes 2 fixed_16_16_t variable & returns fixed_16_16_t
add_fixed16_u32 : takes fixed_16_16_t & u32 variable & returns fixed_16_16_t

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d0698c290cba..7b95dec9529b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -224,6 +224,25 @@ static inline uint_fixed_16_16_t mul_u32_fixed16(uint32_t val,
 	return clamp_u64_to_fixed16(intermediate_val);
 }
 
+static inline uint_fixed_16_16_t add_fixed16(uint_fixed_16_16_t add1,
+					     uint_fixed_16_16_t add2)
+{
+	uint64_t interm_sum;
+
+	interm_sum = (uint64_t) add1.val + add2.val;
+	return clamp_u64_to_fixed16(interm_sum);
+}
+
+static inline uint_fixed_16_16_t add_fixed16_u32(uint_fixed_16_16_t add1,
+						 uint32_t add2)
+{
+	uint64_t interm_sum;
+	uint_fixed_16_16_t interm_add2 = u32_to_fixed16(add2);
+
+	interm_sum = (uint64_t) add1.val + interm_add2.val;
+	return clamp_u64_to_fixed16(interm_sum);
+}
+
 static inline const char *yesno(bool v)
 {
 	return v ? "yes" : "no";
-- 
2.13.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 5/9] drm/i915/hsw: use intel_compute_linetime_wm function for linetime wm
  2017-06-13 13:11 [PATCH 0/9] Fixed16.16 wrapper cleanup & wm optimization Mahesh Kumar
                   ` (3 preceding siblings ...)
  2017-06-13 13:11 ` [PATCH 4/9] drm/i915: Addition wrapper for fixed16.16 operation Mahesh Kumar
@ 2017-06-13 13:11 ` Mahesh Kumar
  2017-06-13 13:11 ` [PATCH 6/9] drm/i915/skl+: WM calculation don't require height Mahesh Kumar
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Mahesh Kumar @ 2017-06-13 13:11 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

linetime wm is time taken to fill a single display line with given clock
rate, multiplied by 8.
This patch reuses the common code of hsw_compute_linetime_wm &
skl_compute_linetime_wm.

Changes since V1:
 - don't expose intel_compute_linetime_wm out of intel_pm.c (Maarten)

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 66 ++++++++++++++++++++++-------------------
 1 file changed, 36 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 155f54a1f516..be65c0b01469 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2727,6 +2727,40 @@ static void ilk_compute_wm_level(const struct drm_i915_private *dev_priv,
 	result->enable = true;
 }
 
+static uint_fixed_16_16_t
+intel_get_linetime_us(const struct intel_crtc_state *cstate)
+{
+	uint32_t pixel_rate;
+	uint32_t crtc_htotal;
+	uint_fixed_16_16_t linetime_us;
+
+	if (!cstate->base.active)
+		return u32_to_fixed16(0);
+
+	pixel_rate = cstate->pixel_rate;
+
+	if (WARN_ON(pixel_rate == 0))
+		return u32_to_fixed16(0);
+
+	crtc_htotal = cstate->base.adjusted_mode.crtc_htotal;
+	linetime_us = div_fixed16(crtc_htotal * 1000, pixel_rate);
+
+	return linetime_us;
+}
+
+static uint32_t
+intel_compute_linetime_wm(const struct intel_crtc_state *cstate)
+{
+	uint_fixed_16_16_t linetime_us;
+
+	linetime_us = intel_get_linetime_us(cstate);
+
+	if (is_fixed16_zero(linetime_us))
+		return 0;
+
+	return fixed16_to_u32_round_up(mul_u32_fixed16(8, linetime_us));
+}
+
 static uint32_t
 hsw_compute_linetime_wm(const struct intel_crtc_state *cstate)
 {
@@ -2746,8 +2780,7 @@ hsw_compute_linetime_wm(const struct intel_crtc_state *cstate)
 	/* The WM are computed with base on how long it takes to fill a single
 	 * row at the given clock rate, multiplied by 8.
 	 * */
-	linetime = DIV_ROUND_CLOSEST(adjusted_mode->crtc_htotal * 1000 * 8,
-				     adjusted_mode->crtc_clock);
+	linetime = intel_compute_linetime_wm(cstate);
 	ips_linetime = DIV_ROUND_CLOSEST(adjusted_mode->crtc_htotal * 1000 * 8,
 					 intel_state->cdclk.logical.cdclk);
 
@@ -4381,27 +4414,6 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
 	return ret;
 }
 
-static uint_fixed_16_16_t
-intel_get_linetime_us(struct intel_crtc_state *cstate)
-{
-	uint32_t pixel_rate;
-	uint32_t crtc_htotal;
-	uint_fixed_16_16_t linetime_us;
-
-	if (!cstate->base.active)
-		return u32_to_fixed16(0);
-
-	pixel_rate = cstate->pixel_rate;
-
-	if (WARN_ON(pixel_rate == 0))
-		return u32_to_fixed16(0);
-
-	crtc_htotal = cstate->base.adjusted_mode.crtc_htotal;
-	linetime_us = div_fixed16(crtc_htotal * 1000, pixel_rate);
-
-	return linetime_us;
-}
-
 static uint32_t
 skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
 			      const struct intel_plane_state *pstate)
@@ -4609,15 +4621,9 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 {
 	struct drm_atomic_state *state = cstate->base.state;
 	struct drm_i915_private *dev_priv = to_i915(state->dev);
-	uint_fixed_16_16_t linetime_us;
 	uint32_t linetime_wm;
 
-	linetime_us = intel_get_linetime_us(cstate);
-
-	if (is_fixed16_zero(linetime_us))
-		return 0;
-
-	linetime_wm = fixed16_to_u32_round_up(mul_u32_fixed16(8, linetime_us));
+	linetime_wm = intel_compute_linetime_wm(cstate);
 
 	/* Display WA #1135: bxt. */
 	if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled)
-- 
2.13.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 6/9] drm/i915/skl+: WM calculation don't require height
  2017-06-13 13:11 [PATCH 0/9] Fixed16.16 wrapper cleanup & wm optimization Mahesh Kumar
                   ` (4 preceding siblings ...)
  2017-06-13 13:11 ` [PATCH 5/9] drm/i915/hsw: use intel_compute_linetime_wm function for linetime wm Mahesh Kumar
@ 2017-06-13 13:11 ` Mahesh Kumar
  2017-06-13 13:12 ` [PATCH 7/9] drm/i915/skl+: unify cpp value in WM calculation Mahesh Kumar
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Mahesh Kumar @ 2017-06-13 13:11 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

height of plane was require to swap width/height in case of 90/270
rotation. Now src structure contains already swapped values, So we
don't have to calculate height of the plane.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index be65c0b01469..84d3e66bb0f3 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4454,7 +4454,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	uint32_t plane_bytes_per_line;
 	uint32_t res_blocks, res_lines;
 	uint8_t cpp;
-	uint32_t width = 0, height = 0;
+	uint32_t width = 0;
 	uint32_t plane_pixel_rate;
 	uint_fixed_16_16_t y_tile_minimum;
 	uint32_t y_min_scanlines;
@@ -4481,7 +4481,6 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 
 	if (plane->id == PLANE_CURSOR) {
 		width = intel_pstate->base.crtc_w;
-		height = intel_pstate->base.crtc_h;
 	} else {
 		/*
 		 * Src coordinates are already rotated by 270 degrees for
@@ -4489,7 +4488,6 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		 * GTT mapping), hence no need to account for rotation here.
 		 */
 		width = drm_rect_width(&intel_pstate->base.src) >> 16;
-		height = drm_rect_height(&intel_pstate->base.src) >> 16;
 	}
 
 	cpp = fb->format->cpp[0];
-- 
2.13.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 7/9] drm/i915/skl+: unify cpp value in WM calculation
  2017-06-13 13:11 [PATCH 0/9] Fixed16.16 wrapper cleanup & wm optimization Mahesh Kumar
                   ` (5 preceding siblings ...)
  2017-06-13 13:11 ` [PATCH 6/9] drm/i915/skl+: WM calculation don't require height Mahesh Kumar
@ 2017-06-13 13:12 ` Mahesh Kumar
  2017-06-13 13:12 ` [PATCH 8/9] drm/i915/skl+: Optimize " Mahesh Kumar
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Mahesh Kumar @ 2017-06-13 13:12 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

use same cpp value in different phase of plane WM caluclation.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 84d3e66bb0f3..3616d07a5500 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4490,13 +4490,11 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		width = drm_rect_width(&intel_pstate->base.src) >> 16;
 	}
 
-	cpp = fb->format->cpp[0];
+	cpp = (fb->format->format == DRM_FORMAT_NV12) ? fb->format->cpp[1] :
+							fb->format->cpp[0];
 	plane_pixel_rate = skl_adjusted_plane_pixel_rate(cstate, intel_pstate);
 
 	if (drm_rotation_90_or_270(pstate->rotation)) {
-		int cpp = (fb->format->format == DRM_FORMAT_NV12) ?
-			fb->format->cpp[1] :
-			fb->format->cpp[0];
 
 		switch (cpp) {
 		case 1:
-- 
2.13.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 8/9] drm/i915/skl+: Optimize WM calculation
  2017-06-13 13:11 [PATCH 0/9] Fixed16.16 wrapper cleanup & wm optimization Mahesh Kumar
                   ` (6 preceding siblings ...)
  2017-06-13 13:12 ` [PATCH 7/9] drm/i915/skl+: unify cpp value in WM calculation Mahesh Kumar
@ 2017-06-13 13:12 ` Mahesh Kumar
  2017-06-13 13:12 ` [PATCH 9/9] drm/i915/gen10: Calculate and enable transition WM Mahesh Kumar
  2017-06-13 13:24 ` ✓ Fi.CI.BAT: success for Fixed16.16 wrapper cleanup & wm optimization (rev2) Patchwork
  9 siblings, 0 replies; 11+ messages in thread
From: Mahesh Kumar @ 2017-06-13 13:12 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

Plane configuration parameters doesn't change for each WM-level
calculation. Currently we compute same parameters 8 times for each
wm-level.
This patch optimizes it by calculating these parameters in beginning
& reuse during each level-wm calculation.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |  13 ++++
 drivers/gpu/drm/i915/intel_pm.c | 169 ++++++++++++++++++++++------------------
 2 files changed, 107 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7b95dec9529b..2ec6b17e54c5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1776,6 +1776,19 @@ struct skl_wm_level {
 	uint8_t plane_res_l;
 };
 
+/* Stores plane specific WM parameters */
+struct skl_wm_params{
+	bool x_tiled, y_tiled;
+	uint32_t width;
+	uint8_t cpp;
+	uint32_t plane_pixel_rate;
+	uint32_t y_min_scanlines;
+	uint32_t plane_bytes_per_line;
+	uint_fixed_16_16_t plane_blocks_per_line;
+	uint_fixed_16_16_t y_tile_minimum;
+	uint32_t linetime_us;
+};
+
 /*
  * This struct helps tracking the state needed for runtime PM, which puts the
  * device in PCI D3 state. Notice that when this happens, nothing on the
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 3616d07a5500..55f9e1ae0a94 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4436,121 +4436,129 @@ skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
 					    downscale_amount);
 }
 
-static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
-				struct intel_crtc_state *cstate,
-				const struct intel_plane_state *intel_pstate,
-				int level,
-				uint16_t *out_blocks, /* out */
-				uint8_t *out_lines /* out */)
+static int
+skl_compute_plane_wm_params(const struct drm_i915_private *dev_priv,
+			    struct intel_crtc_state *cstate,
+			    const struct intel_plane_state *intel_pstate,
+			    struct skl_wm_params *wp)
 {
 	struct intel_plane *plane = to_intel_plane(intel_pstate->base.plane);
 	const struct drm_plane_state *pstate = &intel_pstate->base;
 	const struct drm_framebuffer *fb = pstate->fb;
-	uint32_t latency = dev_priv->wm.skl_latency[level];
-	uint_fixed_16_16_t method1, method2;
-	uint_fixed_16_16_t plane_blocks_per_line;
-	uint_fixed_16_16_t selected_result;
 	uint32_t interm_pbpl;
-	uint32_t plane_bytes_per_line;
-	uint32_t res_blocks, res_lines;
-	uint8_t cpp;
-	uint32_t width = 0;
-	uint32_t plane_pixel_rate;
-	uint_fixed_16_16_t y_tile_minimum;
-	uint32_t y_min_scanlines;
 	struct intel_atomic_state *state =
 		to_intel_atomic_state(cstate->base.state);
 	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
-	bool y_tiled, x_tiled;
 
-	if (latency == 0 ||
-	    !intel_wm_plane_visible(cstate, intel_pstate))
+	if (!intel_wm_plane_visible(cstate, intel_pstate))
 		return 0;
 
-	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
-		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
-	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
-
-	/* Display WA #1141: kbl,cfl */
-	if ((IS_KABYLAKE(dev_priv) || IS_COFFEELAKE(dev_priv)) &&
-	    dev_priv->ipc_enabled)
-		latency += 4;
-
-	if (apply_memory_bw_wa && x_tiled)
-		latency += 15;
+	wp->y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
+		      fb->modifier == I915_FORMAT_MOD_Yf_TILED;
+	wp->x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
 
 	if (plane->id == PLANE_CURSOR) {
-		width = intel_pstate->base.crtc_w;
+		wp->width = intel_pstate->base.crtc_w;
 	} else {
 		/*
 		 * Src coordinates are already rotated by 270 degrees for
 		 * the 90/270 degree plane rotation cases (to match the
 		 * GTT mapping), hence no need to account for rotation here.
 		 */
-		width = drm_rect_width(&intel_pstate->base.src) >> 16;
+		wp->width = drm_rect_width(&intel_pstate->base.src) >> 16;
 	}
 
-	cpp = (fb->format->format == DRM_FORMAT_NV12) ? fb->format->cpp[1] :
-							fb->format->cpp[0];
-	plane_pixel_rate = skl_adjusted_plane_pixel_rate(cstate, intel_pstate);
+	wp->cpp = (fb->format->format == DRM_FORMAT_NV12) ? fb->format->cpp[1] :
+							    fb->format->cpp[0];
+	wp->plane_pixel_rate = skl_adjusted_plane_pixel_rate(cstate,
+							     intel_pstate);
 
 	if (drm_rotation_90_or_270(pstate->rotation)) {
 
-		switch (cpp) {
+		switch (wp->cpp) {
 		case 1:
-			y_min_scanlines = 16;
+			wp->y_min_scanlines = 16;
 			break;
 		case 2:
-			y_min_scanlines = 8;
+			wp->y_min_scanlines = 8;
 			break;
 		case 4:
-			y_min_scanlines = 4;
+			wp->y_min_scanlines = 4;
 			break;
 		default:
-			MISSING_CASE(cpp);
+			MISSING_CASE(wp->cpp);
 			return -EINVAL;
 		}
 	} else {
-		y_min_scanlines = 4;
+		wp->y_min_scanlines = 4;
 	}
 
 	if (apply_memory_bw_wa)
-		y_min_scanlines *= 2;
-
-	plane_bytes_per_line = width * cpp;
-	if (y_tiled) {
-		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line *
-					   y_min_scanlines, 512);
-		plane_blocks_per_line = div_fixed16(interm_pbpl,
-							y_min_scanlines);
-	} else if (x_tiled) {
-		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512);
-		plane_blocks_per_line = u32_to_fixed16(interm_pbpl);
+		wp->y_min_scanlines *= 2;
+
+	wp->plane_bytes_per_line = wp->width * wp->cpp;
+	if (wp->y_tiled) {
+		interm_pbpl = DIV_ROUND_UP(wp->plane_bytes_per_line *
+					   wp->y_min_scanlines, 512);
+		wp->plane_blocks_per_line = div_fixed16(interm_pbpl,
+							wp->y_min_scanlines);
+	} else if (wp->x_tiled) {
+		interm_pbpl = DIV_ROUND_UP(wp->plane_bytes_per_line, 512);
+		wp->plane_blocks_per_line = u32_to_fixed16(interm_pbpl);
 	} else {
-		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512) + 1;
-		plane_blocks_per_line = u32_to_fixed16(interm_pbpl);
+		interm_pbpl = DIV_ROUND_UP(wp->plane_bytes_per_line, 512) + 1;
+		wp->plane_blocks_per_line = u32_to_fixed16(interm_pbpl);
 	}
 
-	method1 = skl_wm_method1(plane_pixel_rate, cpp, latency);
-	method2 = skl_wm_method2(plane_pixel_rate,
+	wp->y_tile_minimum = mul_u32_fixed16(wp->y_min_scanlines,
+					     wp->plane_blocks_per_line);
+	wp->linetime_us = fixed16_to_u32_round_up(
+					intel_get_linetime_us(cstate));
+	return 0;
+}
+
+static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
+				struct intel_crtc_state *cstate,
+				const struct intel_plane_state *intel_pstate,
+				int level,
+				const struct skl_wm_params *wp,
+				uint16_t *out_blocks, /* out */
+				uint8_t *out_lines /* out */)
+{
+	const struct drm_plane_state *pstate = &intel_pstate->base;
+	uint32_t latency = dev_priv->wm.skl_latency[level];
+	uint_fixed_16_16_t method1, method2;
+	uint_fixed_16_16_t selected_result;
+	uint32_t res_blocks, res_lines;
+	struct intel_atomic_state *state =
+		to_intel_atomic_state(cstate->base.state);
+	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
+
+	if (latency == 0 ||
+	    !intel_wm_plane_visible(cstate, intel_pstate))
+		return 0;
+
+	/* Display WA #1141: kbl,cfl */
+	if ((IS_KABYLAKE(dev_priv) || IS_COFFEELAKE(dev_priv)) &&
+	    dev_priv->ipc_enabled)
+		latency += 4;
+
+	if (apply_memory_bw_wa && wp->x_tiled)
+		latency += 15;
+
+	method1 = skl_wm_method1(wp->plane_pixel_rate, wp->cpp, latency);
+	method2 = skl_wm_method2(wp->plane_pixel_rate,
 				 cstate->base.adjusted_mode.crtc_htotal,
 				 latency,
-				 plane_blocks_per_line);
+				 wp->plane_blocks_per_line);
 
-	y_tile_minimum = mul_u32_fixed16(y_min_scanlines,
-					 plane_blocks_per_line);
-
-	if (y_tiled) {
-		selected_result = max_fixed16(method2, y_tile_minimum);
+	if (wp->y_tiled) {
+		selected_result = max_fixed16(method2, wp->y_tile_minimum);
 	} else {
-		uint32_t linetime_us;
-
-		linetime_us = fixed16_to_u32_round_up(
-				intel_get_linetime_us(cstate));
-		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
-		    (plane_bytes_per_line / 512 < 1))
+		if ((wp->cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1)
+		     && (wp->plane_bytes_per_line / 512 < 1))
 			selected_result = method2;
-		else if (latency >= linetime_us)
+		else if (latency >= wp->linetime_us)
 			selected_result = min_fixed16(method1, method2);
 		else
 			selected_result = method1;
@@ -4558,12 +4566,13 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 
 	res_blocks = fixed16_to_u32_round_up(selected_result) + 1;
 	res_lines = div_round_up_fixed16(selected_result,
-					 plane_blocks_per_line);
+					 wp->plane_blocks_per_line);
 
 	if (level >= 1 && level <= 7) {
-		if (y_tiled) {
-			res_blocks += fixed16_to_u32_round_up(y_tile_minimum);
-			res_lines += y_min_scanlines;
+		if (wp->y_tiled) {
+			res_blocks += fixed16_to_u32_round_up(
+							wp->y_tile_minimum);
+			res_lines += wp->y_min_scanlines;
 		} else {
 			res_blocks++;
 		}
@@ -4588,6 +4597,7 @@ static int
 skl_compute_wm_levels(const struct drm_i915_private *dev_priv,
 		      struct intel_crtc_state *cstate,
 		      const struct intel_plane_state *intel_pstate,
+		      const struct skl_wm_params *wm_params,
 		      struct skl_plane_wm *wm)
 {
 	int level, max_level = ilk_wm_max_level(dev_priv);
@@ -4603,6 +4613,7 @@ skl_compute_wm_levels(const struct drm_i915_private *dev_priv,
 					   cstate,
 					   intel_pstate,
 					   level,
+					   wm_params,
 					   &result->plane_res_b,
 					   &result->plane_res_l);
 		if (ret)
@@ -4660,10 +4671,18 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 		const struct intel_plane_state *intel_pstate =
 						to_intel_plane_state(pstate);
 		enum plane_id plane_id = to_intel_plane(plane)->id;
+		struct skl_wm_params wm_params;
 
 		wm = &pipe_wm->planes[plane_id];
+		memset(&wm_params, 0, sizeof(struct skl_wm_params));
+
+		ret = skl_compute_plane_wm_params(dev_priv, cstate,
+						  intel_pstate, &wm_params);
+		if (ret)
+			return ret;
 
-		ret = skl_compute_wm_levels(dev_priv, cstate, intel_pstate, wm);
+		ret = skl_compute_wm_levels(dev_priv, cstate, intel_pstate,
+					    &wm_params, wm);
 		if (ret)
 			return ret;
 		skl_compute_transition_wm(cstate, &wm->trans_wm);
-- 
2.13.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 9/9] drm/i915/gen10: Calculate and enable transition WM
  2017-06-13 13:11 [PATCH 0/9] Fixed16.16 wrapper cleanup & wm optimization Mahesh Kumar
                   ` (7 preceding siblings ...)
  2017-06-13 13:12 ` [PATCH 8/9] drm/i915/skl+: Optimize " Mahesh Kumar
@ 2017-06-13 13:12 ` Mahesh Kumar
  2017-06-13 13:24 ` ✓ Fi.CI.BAT: success for Fixed16.16 wrapper cleanup & wm optimization (rev2) Patchwork
  9 siblings, 0 replies; 11+ messages in thread
From: Mahesh Kumar @ 2017-06-13 13:12 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

GEN > 9 require transition WM to be programmed if IPC is enabled.
This patch calculates & enable transition WM for supported platforms.
If transition WM is enabled, Plane read requests are sent at high
priority until filling above the transition watermark, then the
requests are sent at lower priority until dropping below the level-0 WM.
The lower priority requests allow other memory clients to have better
memory access.

transition minimum is the minimum amount needed for trans_wm to work to
ensure  the demote does not happen before enough data has been read to
meet the level 0 watermark requirements.

transition amount is configurable value. Higher values will
tend to cause longer periods of high priority reads followed by longer
periods of lower priority reads. Tuning to lower values will tend to
cause shorter periods of high and lower priority reads.

Keeping transition amount to 0 in this patch.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 51 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 48 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 55f9e1ae0a94..4bbe242643cf 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4197,6 +4197,15 @@ skl_enable_plane_wm_levels(const struct drm_i915_private *dev_priv,
 			level_wm->plane_en = true;
 		}
 	}
+
+	/*
+	 * Unsupported GEN will have plane_res_b = 0 & transition WM for
+	 * them will get disabled here.
+	 */
+	if (wm->trans_wm.plane_res_b && wm->trans_wm.plane_res_b < plane_ddb)
+		wm->trans_wm.plane_en = true;
+	else
+		wm->trans_wm.plane_en = false;
 }
 
 static int
@@ -4640,13 +4649,48 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 }
 
 static void skl_compute_transition_wm(struct intel_crtc_state *cstate,
+				      struct skl_wm_params *wp,
+				      struct skl_wm_level *wm_l0,
 				      struct skl_wm_level *trans_wm /* out */)
 {
+	struct drm_device *dev = cstate->base.crtc->dev;
+	const struct drm_i915_private *dev_priv = to_i915(dev);
+	uint16_t trans_min, trans_y_tile_min;
+	uint16_t trans_amount = 0; /* This is configurable amount */
+	uint16_t trans_offset_b, res_blocks;
+
 	if (!cstate->base.active)
 		return;
 
-	/* Until we know more, just disable transition WMs */
-	trans_wm->plane_en = false;
+	/* Transition WM are not recommended by HW team for GEN9 */
+	if (INTEL_GEN(dev_priv) <= 9)
+		return;
+
+	/* Transition WM don't have any impact if ipc is disabled */
+	if (!dev_priv->ipc_enabled)
+		return;
+
+	if (INTEL_GEN(dev_priv) >= 10)
+		trans_min = 4;
+
+	trans_offset_b = trans_min + trans_amount;
+	trans_y_tile_min = (uint16_t) mul_round_up_u32_fixed16(2,
+							wp->y_tile_minimum);
+
+	if (wp->y_tiled) {
+		res_blocks = max(wm_l0->plane_res_b, trans_y_tile_min) +
+				trans_offset_b;
+	} else {
+		res_blocks = wm_l0->plane_res_b + trans_offset_b;
+	}
+
+	res_blocks += 1;
+
+	/* WA BUG:1938466 add one block for non y-tile planes */
+	if (!wp->y_tiled && IS_CNL_REVID(dev_priv, CNL_REVID_A0, CNL_REVID_A0))
+		res_blocks += 1;
+
+	trans_wm->plane_res_b = res_blocks;
 }
 
 static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
@@ -4685,7 +4729,8 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 					    &wm_params, wm);
 		if (ret)
 			return ret;
-		skl_compute_transition_wm(cstate, &wm->trans_wm);
+		skl_compute_transition_wm(cstate, &wm_params, &wm->wm[0],
+					  &wm->trans_wm);
 	}
 	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
 
-- 
2.13.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* ✓ Fi.CI.BAT: success for Fixed16.16 wrapper cleanup & wm optimization (rev2)
  2017-06-13 13:11 [PATCH 0/9] Fixed16.16 wrapper cleanup & wm optimization Mahesh Kumar
                   ` (8 preceding siblings ...)
  2017-06-13 13:12 ` [PATCH 9/9] drm/i915/gen10: Calculate and enable transition WM Mahesh Kumar
@ 2017-06-13 13:24 ` Patchwork
  9 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2017-06-13 13:24 UTC (permalink / raw)
  To: Kumar, Mahesh; +Cc: intel-gfx

== Series Details ==

Series: Fixed16.16 wrapper cleanup & wm optimization (rev2)
URL   : https://patchwork.freedesktop.org/series/25692/
State : success

== Summary ==

Series 25692v2 Fixed16.16 wrapper cleanup & wm optimization
https://patchwork.freedesktop.org/api/1.0/series/25692/revisions/2/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                dmesg-warn -> PASS       (fi-kbl-7560u) fdo#100125 +1
Test kms_busy:
        Subgroup basic-flip-default-b:
                dmesg-warn -> FAIL       (fi-skl-6700hq) fdo#101144
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (fi-skl-6700hq) fdo#100461

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125
fdo#101144 https://bugs.freedesktop.org/show_bug.cgi?id=101144
fdo#100461 https://bugs.freedesktop.org/show_bug.cgi?id=100461

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:447s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:430s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:590s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:518s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:488s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:480s
fi-glk-2a        total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:595s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:439s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:415s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:420s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:491s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:469s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:472s
fi-kbl-7560u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:566s
fi-kbl-r         total:278  pass:259  dwarn:1   dfail:0   fail:0   skip:18  time:577s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:461s
fi-skl-6700hq    total:278  pass:227  dwarn:2   dfail:0   fail:27  skip:22  time:403s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:462s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:474s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:439s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:536s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:404s

63f3be17429f13293c42de2c6e8ed6f0fc425145 drm-tip: 2017y-06m-13d-08h-33m-44s UTC integration manifest
0ceb4d4 drm/i915/gen10: Calculate and enable transition WM
b8a767f drm/i915/skl+: Optimize WM calculation
95a47716 drm/i915/skl+: unify cpp value in WM calculation
7e1daf1 drm/i915/skl+: WM calculation don't require height
6c078ac drm/i915/hsw: use intel_compute_linetime_wm function for linetime wm
07465c9 drm/i915: Addition wrapper for fixed16.16 operation
741891d drm/i915: cleanup fixed-point wrappers naming
71b1d26 drm/i915: Always perform internal fixed16 division in 64 bits
8d7c80b drm/i915: take-out common clamping code of fixed16 wrappers

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4940/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-06-13 13:24 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-13 13:11 [PATCH 0/9] Fixed16.16 wrapper cleanup & wm optimization Mahesh Kumar
2017-06-13 13:11 ` [PATCH 1/9] drm/i915: take-out common clamping code of fixed16 wrappers Mahesh Kumar
2017-06-13 13:11 ` [PATCH 2/9] drm/i915: Always perform internal fixed16 division in 64 bits Mahesh Kumar
2017-06-13 13:11 ` [PATCH 3/9] drm/i915: cleanup fixed-point wrappers naming Mahesh Kumar
2017-06-13 13:11 ` [PATCH 4/9] drm/i915: Addition wrapper for fixed16.16 operation Mahesh Kumar
2017-06-13 13:11 ` [PATCH 5/9] drm/i915/hsw: use intel_compute_linetime_wm function for linetime wm Mahesh Kumar
2017-06-13 13:11 ` [PATCH 6/9] drm/i915/skl+: WM calculation don't require height Mahesh Kumar
2017-06-13 13:12 ` [PATCH 7/9] drm/i915/skl+: unify cpp value in WM calculation Mahesh Kumar
2017-06-13 13:12 ` [PATCH 8/9] drm/i915/skl+: Optimize " Mahesh Kumar
2017-06-13 13:12 ` [PATCH 9/9] drm/i915/gen10: Calculate and enable transition WM Mahesh Kumar
2017-06-13 13:24 ` ✓ Fi.CI.BAT: success for Fixed16.16 wrapper cleanup & wm optimization (rev2) Patchwork

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox