public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC
@ 2016-12-01 15:49 Mahesh Kumar
  2016-12-01 15:49 ` [PATCH v7 1/8] drm/i915/skl: Add variables to check x_tile and y_tile Mahesh Kumar
                   ` (9 more replies)
  0 siblings, 10 replies; 18+ messages in thread
From: Mahesh Kumar @ 2016-12-01 15:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This series implements following set of functionality
	Implement IPC WA's for Broxton/KBL
	Enable IPC in supported platforms
	Convert WM calculation to fixed point calculation
	Calculation of System memory Bandwidth for SKL/KBL/BXT
	Implementation of Arbitrated memory Bandwidth related WM WA's


Mahesh Kumar (8):
  drm/i915/skl: Add variables to check x_tile and y_tile
  drm/i915/bxt: IPC WA for Broxton
  drm/i915/kbl: IPC workaround for kabylake
  drm/i915/bxt: Enable IPC support
  drm/i915/skl+: change WM calc to fixed point 16.16
  drm/i915: Add intel_atomic_get_existing_crtc_state function
  drm/i915: Decode system memory bandwidth
  drm/i915/gen9: WM memory bandwidth related workaround

 drivers/gpu/drm/i915/i915_drv.c      | 175 ++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h      | 109 ++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h      |  38 +++++
 drivers/gpu/drm/i915/intel_display.c |  24 ++++
 drivers/gpu/drm/i915/intel_drv.h     |  15 ++
 drivers/gpu/drm/i915/intel_pm.c      | 272 +++++++++++++++++++++++++++++------
 6 files changed, 586 insertions(+), 47 deletions(-)

-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v7 1/8] drm/i915/skl: Add variables to check x_tile and y_tile
  2016-12-01 15:49 [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
@ 2016-12-01 15:49 ` Mahesh Kumar
  2016-12-01 15:49 ` [PATCH v7 2/8] drm/i915/bxt: IPC WA for Broxton Mahesh Kumar
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Mahesh Kumar @ 2016-12-01 15:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch adds variable to check for X_tiled & y_tiled planes, instead
of always checking against framebuffer-modifiers.

Changes:
 - Created separate patch as per Paulo's comment
 - Added x_tiled variable as well
Changes since V2:
 - Incorporate Paulo's comments
 - Rebase

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 29b6653..a467ffe 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3579,13 +3579,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	struct intel_atomic_state *state =
 		to_intel_atomic_state(cstate->base.state);
 	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
+	bool y_tiled, x_tiled;
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
 		*enabled = false;
 		return 0;
 	}
 
-	if (apply_memory_bw_wa && fb->modifier == I915_FORMAT_MOD_X_TILED)
+	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
+				fb->modifier == I915_FORMAT_MOD_Yf_TILED;
+	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
+
+	if (apply_memory_bw_wa && x_tiled)
 		latency += 15;
 
 	width = drm_rect_width(&intel_pstate->base.src) >> 16;
@@ -3624,16 +3629,15 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		y_min_scanlines *= 2;
 
 	plane_bytes_per_line = width * cpp;
-	if (fb->modifier == I915_FORMAT_MOD_Y_TILED ||
-	    fb->modifier == I915_FORMAT_MOD_Yf_TILED) {
+	if (y_tiled) {
 		plane_blocks_per_line =
 		      DIV_ROUND_UP(plane_bytes_per_line * y_min_scanlines, 512);
 		plane_blocks_per_line /= y_min_scanlines;
-	} else if (fb->modifier == DRM_FORMAT_MOD_NONE) {
+	} else if (x_tiled) {
+		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512);
+	} else {
 		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512)
 					+ 1;
-	} else {
-		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512);
 	}
 
 	method1 = skl_wm_method1(plane_pixel_rate, cpp, latency);
@@ -3644,8 +3648,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 
 	y_tile_minimum = plane_blocks_per_line * y_min_scanlines;
 
-	if (fb->modifier == I915_FORMAT_MOD_Y_TILED ||
-	    fb->modifier == I915_FORMAT_MOD_Yf_TILED) {
+	if (y_tiled) {
 		selected_result = max(method2, y_tile_minimum);
 	} else {
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
@@ -3661,8 +3664,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	res_lines = DIV_ROUND_UP(selected_result, plane_blocks_per_line);
 
 	if (level >= 1 && level <= 7) {
-		if (fb->modifier == I915_FORMAT_MOD_Y_TILED ||
-		    fb->modifier == I915_FORMAT_MOD_Yf_TILED) {
+		if (y_tiled) {
 			res_blocks += y_tile_minimum;
 			res_lines += y_min_scanlines;
 		} else {
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v7 2/8] drm/i915/bxt: IPC WA for Broxton
  2016-12-01 15:49 [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
  2016-12-01 15:49 ` [PATCH v7 1/8] drm/i915/skl: Add variables to check x_tile and y_tile Mahesh Kumar
@ 2016-12-01 15:49 ` Mahesh Kumar
  2016-12-01 15:49 ` [PATCH v7 3/8] drm/i915/kbl: IPC workaround for kabylake Mahesh Kumar
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Mahesh Kumar @ 2016-12-01 15:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

Display Workarounds #1135
If IPC is enabled in BXT, display underruns are observed.
WA: The Line Time programmed in the WM_LINETIME register should be
half of the actual calculated Line Time.

Programmed Line Time = 1/2*Calculated Line Time

Changes since V1:
 - Add Workaround number in commit & code

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c |  2 ++
 drivers/gpu/drm/i915/i915_drv.h |  2 ++
 drivers/gpu/drm/i915/intel_pm.c | 12 ++++++++++--
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index b893e67..f27e4bd 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1251,6 +1251,8 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	intel_runtime_pm_enable(dev_priv);
 
+	dev_priv->ipc_enabled = false;
+
 	/* Everything is in place, we can now relax! */
 	DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
 		 driver.name, driver.major, driver.minor, driver.patchlevel,
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1ec9619..64b0c90 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2306,6 +2306,8 @@ struct drm_i915_private {
 	/* perform PHY state sanity checks? */
 	bool chv_phy_assert[2];
 
+	bool ipc_enabled;
+
 	/* Used to save the pipe-to-encoder mapping for audio */
 	struct intel_encoder *av_enc_map[I915_MAX_PIPES];
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index a467ffe..89d8a28 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3755,7 +3755,10 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 static uint32_t
 skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 {
+	struct drm_atomic_state *state = cstate->base.state;
+	struct drm_i915_private *dev_priv = to_i915(state->dev);
 	uint32_t pixel_rate;
+	uint32_t linetime_wm;
 
 	if (!cstate->base.active)
 		return 0;
@@ -3765,8 +3768,13 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 	if (WARN_ON(pixel_rate == 0))
 		return 0;
 
-	return DIV_ROUND_UP(8 * cstate->base.adjusted_mode.crtc_htotal * 1000,
-			    pixel_rate);
+	linetime_wm = DIV_ROUND_UP(8 * cstate->base.adjusted_mode.crtc_htotal *
+						1000, pixel_rate);
+	/* Display WA #1135 for BXT:All */
+	if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled)
+		linetime_wm = DIV_ROUND_UP(linetime_wm, 2);
+
+	return linetime_wm;
 }
 
 static void skl_compute_transition_wm(struct intel_crtc_state *cstate,
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v7 3/8] drm/i915/kbl: IPC workaround for kabylake
  2016-12-01 15:49 [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
  2016-12-01 15:49 ` [PATCH v7 1/8] drm/i915/skl: Add variables to check x_tile and y_tile Mahesh Kumar
  2016-12-01 15:49 ` [PATCH v7 2/8] drm/i915/bxt: IPC WA for Broxton Mahesh Kumar
@ 2016-12-01 15:49 ` Mahesh Kumar
  2016-12-01 15:49 ` [PATCH v7 4/8] drm/i915/bxt: Enable IPC support Mahesh Kumar
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Mahesh Kumar @ 2016-12-01 15:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

Display Workarounds #1141
IPC (Isoch Priority Control) may cause underflows.

KBL WA: When IPC is enabled, watermark latency values must be increased
by 4us across all levels. This brings level 0 up to 6us.

Changes since V1:
 - Add Workaround number in commit & code

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 89d8a28..3650c74 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3590,6 +3590,10 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				fb->modifier == I915_FORMAT_MOD_Yf_TILED;
 	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
 
+	/* IPC WA for kabylake Display WA #1141 */
+	if (IS_KABYLAKE(dev_priv) && dev_priv->ipc_enabled)
+		latency += 4;
+
 	if (apply_memory_bw_wa && x_tiled)
 		latency += 15;
 
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v7 4/8] drm/i915/bxt: Enable IPC support
  2016-12-01 15:49 [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
                   ` (2 preceding siblings ...)
  2016-12-01 15:49 ` [PATCH v7 3/8] drm/i915/kbl: IPC workaround for kabylake Mahesh Kumar
@ 2016-12-01 15:49 ` Mahesh Kumar
  2016-12-01 15:49 ` [PATCH v7 5/8] drm/i915/skl+: change WM calc to fixed point 16.16 Mahesh Kumar
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Mahesh Kumar @ 2016-12-01 15:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch adds IPC support for platforms. This patch enables IPC
only for BXT/KBL platform as for SKL recommendation is to keep is disabled.
IPC (Isochronous Priority Control) is the hardware feature, which
dynamically controles the memory read priority of Display.

When IPC is enabled, plane read requests are sent at high priority until
filling above the transition watermark, then the requests are sent at
lower priority until dropping below the level 0 watermark.
The lower priority requests allow other memory clients to have better
memory access. When IPC is disabled, all plane read requests are sent at
high priority.

Changes since V1:
 - Remove commandline parameter to disable ipc
 - Address Paulo's comments
Changes since V2:
 - Address review comments
 - Set ipc_enabled flag
Changes since V3:
 - move ipc_enabled flag assignment inside intel_ipc_enable function

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c  |  2 +-
 drivers/gpu/drm/i915/i915_reg.h  |  1 +
 drivers/gpu/drm/i915/intel_drv.h |  1 +
 drivers/gpu/drm/i915/intel_pm.c  | 16 ++++++++++++++++
 4 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f27e4bd..1c689b6 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1251,7 +1251,7 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	intel_runtime_pm_enable(dev_priv);
 
-	dev_priv->ipc_enabled = false;
+	intel_enable_ipc(dev_priv);
 
 	/* Everything is in place, we can now relax! */
 	DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 6747d68..649319d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6418,6 +6418,7 @@ enum {
 #define  DISP_FBC_WM_DIS		(1<<15)
 #define DISP_ARB_CTL2	_MMIO(0x45004)
 #define  DISP_DATA_PARTITION_5_6	(1<<6)
+#define  DISP_IPC_ENABLE		(1<<3)
 #define DBUF_CTL	_MMIO(0x45008)
 #define  DBUF_POWER_REQUEST		(1<<31)
 #define  DBUF_POWER_STATE		(1<<30)
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 9bbe5c5..3069512 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1755,6 +1755,7 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
 uint32_t ilk_pipe_pixel_rate(const struct intel_crtc_state *pipe_config);
 bool ilk_disable_lp_wm(struct drm_device *dev);
 int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
+void intel_enable_ipc(struct drm_i915_private *dev_priv);
 static inline int intel_enable_rc6(void)
 {
 	return i915.enable_rc6;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 3650c74..8f0e6f5 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4655,6 +4655,22 @@ void intel_update_watermarks(struct intel_crtc *crtc)
 		dev_priv->display.update_wm(crtc);
 }
 
+void intel_enable_ipc(struct drm_i915_private *dev_priv)
+{
+	u32 val;
+
+	dev_priv->ipc_enabled = false;
+	if (!(IS_BROXTON(dev_priv) || IS_KABYLAKE(dev_priv)))
+		return;
+
+	val = I915_READ(DISP_ARB_CTL2);
+
+	val |= DISP_IPC_ENABLE;
+
+	I915_WRITE(DISP_ARB_CTL2, val);
+	dev_priv->ipc_enabled = true;
+}
+
 /*
  * Lock protecting IPS related data structures
  */
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v7 5/8] drm/i915/skl+: change WM calc to fixed point 16.16
  2016-12-01 15:49 [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
                   ` (3 preceding siblings ...)
  2016-12-01 15:49 ` [PATCH v7 4/8] drm/i915/bxt: Enable IPC support Mahesh Kumar
@ 2016-12-01 15:49 ` Mahesh Kumar
  2016-12-01 15:49 ` [PATCH v7 6/8] drm/i915: Add intel_atomic_get_existing_crtc_state function Mahesh Kumar
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Mahesh Kumar @ 2016-12-01 15:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch changes Watermak calculation to fixed point calculation.
Problem with current calculation is during plane_blocks_per_line
calculation we divide intermediate blocks with min_scanlines and
takes floor of the result because of integer operation.
hence we end-up assigning less blocks than required. Which leads to
flickers.

Changes since V1:
 - Add fixed point data type as per Paulo's review
Changes since V2:
 - use fixed_point instead of fp_16_16
Changes since V3:
 - rebase

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 84 +++++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_pm.c | 69 +++++++++++++++++++--------------
 2 files changed, 124 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 64b0c90..b78dc9a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -119,6 +119,90 @@ bool __i915_inject_load_failure(const char *func, int line);
 #define i915_inject_load_failure() \
 	__i915_inject_load_failure(__func__, __LINE__)
 
+typedef struct {
+	uint32_t val;
+} uint_fixed_16_16_t;
+
+#define FP_16_16_MAX ({ \
+		uint_fixed_16_16_t fp; \
+		fp.val = UINT_MAX; \
+		fp; \
+})
+
+static inline uint_fixed_16_16_t u32_to_fixed_point(uint32_t val)
+{
+	uint_fixed_16_16_t fp;
+
+	WARN_ON(val >> 16);
+
+	fp.val = val << 16;
+	return fp;
+}
+
+static inline uint32_t fixed_point_to_u32_round_up(uint_fixed_16_16_t fp)
+{
+	return DIV_ROUND_UP(fp.val, 1 << 16);
+}
+
+static inline uint32_t fixed_point_to_u32(uint_fixed_16_16_t fp)
+{
+	return (fp.val / (1 << 16));
+}
+
+static inline uint_fixed_16_16_t min_fixed_point(uint_fixed_16_16_t min1,
+					uint_fixed_16_16_t min2)
+{
+	uint_fixed_16_16_t min;
+
+	min.val = min(min1.val, min2.val);
+	return min;
+}
+
+static inline uint_fixed_16_16_t max_fixed_point(uint_fixed_16_16_t max1,
+					uint_fixed_16_16_t max2)
+{
+	uint_fixed_16_16_t max;
+
+	max.val = max(max1.val, max2.val);
+	return max;
+}
+
+static inline uint_fixed_16_16_t fixed_point_div_round_up(uint32_t val,
+					uint32_t d)
+{
+	uint_fixed_16_16_t fp, res;
+
+	fp = u32_to_fixed_point(val);
+	res.val = DIV_ROUND_UP(fp.val, d);
+	return res;
+}
+
+static inline uint_fixed_16_16_t fixed_point_div_round_up_u64(uint32_t val,
+								uint32_t d)
+{
+	uint_fixed_16_16_t res;
+	uint64_t interm_val;
+
+	interm_val = (uint64_t)val << 16;
+	interm_val = DIV_ROUND_UP_ULL(interm_val, d);
+	WARN_ON(interm_val >> 32);
+	res.val = (uint32_t) interm_val;
+
+	return res;
+}
+
+static inline uint_fixed_16_16_t mul_fixed_point_u32(uint32_t val,
+						uint_fixed_16_16_t mul)
+{
+	uint64_t intermediate_val;
+	uint_fixed_16_16_t fp;
+
+	intermediate_val = (uint64_t) val * mul.val;
+	WARN_ON(intermediate_val >> 32);
+	fp.val = (uint32_t) intermediate_val;
+	return fp;
+}
+
 static inline const char *yesno(bool v)
 {
 	return v ? "yes" : "no";
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 8f0e6f5..cc8fc84 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3503,32 +3503,35 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
  * should allow pixel_rate up to ~2 GHz which seems sufficient since max
  * 2xcdclk is 1350 MHz and the pixel rate should never exceed that.
 */
-static uint32_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp, uint32_t latency)
+static uint_fixed_16_16_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp,
+					 uint32_t latency)
 {
-	uint32_t wm_intermediate_val, ret;
+	uint32_t wm_intermediate_val;
+	uint_fixed_16_16_t ret;
 
 	if (latency == 0)
-		return UINT_MAX;
-
-	wm_intermediate_val = latency * pixel_rate * cpp / 512;
-	ret = DIV_ROUND_UP(wm_intermediate_val, 1000);
+		return FP_16_16_MAX;
 
+	wm_intermediate_val = latency * pixel_rate * cpp;
+	ret = fixed_point_div_round_up_u64(wm_intermediate_val, 1000 * 512);
 	return ret;
 }
 
-static uint32_t skl_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
-			       uint32_t latency, uint32_t plane_blocks_per_line)
+static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
+			uint32_t pipe_htotal,
+			uint32_t latency,
+			uint_fixed_16_16_t plane_blocks_per_line)
 {
-	uint32_t ret;
 	uint32_t wm_intermediate_val;
+	uint_fixed_16_16_t ret;
 
 	if (latency == 0)
-		return UINT_MAX;
+		return FP_16_16_MAX;
 
 	wm_intermediate_val = latency * pixel_rate;
-	ret = DIV_ROUND_UP(wm_intermediate_val, pipe_htotal * 1000) *
-				plane_blocks_per_line;
-
+	wm_intermediate_val = DIV_ROUND_UP(wm_intermediate_val,
+							pipe_htotal * 1000);
+	ret = mul_fixed_point_u32(wm_intermediate_val, plane_blocks_per_line);
 	return ret;
 }
 
@@ -3568,14 +3571,17 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	struct drm_plane_state *pstate = &intel_pstate->base;
 	struct drm_framebuffer *fb = pstate->fb;
 	uint32_t latency = dev_priv->wm.skl_latency[level];
-	uint32_t method1, method2;
-	uint32_t plane_bytes_per_line, plane_blocks_per_line;
+	uint_fixed_16_16_t method1, method2;
+	uint_fixed_16_16_t plane_blocks_per_line;
+	uint_fixed_16_16_t selected_result;
+	uint32_t interm_pbpl;
+	uint32_t plane_bytes_per_line;
 	uint32_t res_blocks, res_lines;
-	uint32_t selected_result;
 	uint8_t cpp;
 	uint32_t width = 0, height = 0;
 	uint32_t plane_pixel_rate;
-	uint32_t y_tile_minimum, y_min_scanlines;
+	uint_fixed_16_16_t y_tile_minimum;
+	uint32_t y_min_scanlines;
 	struct intel_atomic_state *state =
 		to_intel_atomic_state(cstate->base.state);
 	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
@@ -3634,14 +3640,16 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 
 	plane_bytes_per_line = width * cpp;
 	if (y_tiled) {
+		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line *
+						y_min_scanlines, 512);
 		plane_blocks_per_line =
-		      DIV_ROUND_UP(plane_bytes_per_line * y_min_scanlines, 512);
-		plane_blocks_per_line /= y_min_scanlines;
+		      fixed_point_div_round_up(interm_pbpl, y_min_scanlines);
 	} else if (x_tiled) {
-		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512);
+		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512);
+		plane_blocks_per_line = u32_to_fixed_point(interm_pbpl);
 	} else {
-		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512)
-					+ 1;
+		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512) + 1;
+		plane_blocks_per_line = u32_to_fixed_point(interm_pbpl);
 	}
 
 	method1 = skl_wm_method1(plane_pixel_rate, cpp, latency);
@@ -3650,26 +3658,29 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				 latency,
 				 plane_blocks_per_line);
 
-	y_tile_minimum = plane_blocks_per_line * y_min_scanlines;
+	y_tile_minimum = mul_fixed_point_u32(y_min_scanlines,
+						plane_blocks_per_line);
 
 	if (y_tiled) {
-		selected_result = max(method2, y_tile_minimum);
+		selected_result = max_fixed_point(method2, y_tile_minimum);
 	} else {
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
 		    (plane_bytes_per_line / 512 < 1))
 			selected_result = method2;
-		else if ((ddb_allocation / plane_blocks_per_line) >= 1)
-			selected_result = min(method1, method2);
+		else if ((ddb_allocation /
+			fixed_point_to_u32_round_up(plane_blocks_per_line)) >= 1)
+			selected_result = min_fixed_point(method1, method2);
 		else
 			selected_result = method1;
 	}
 
-	res_blocks = selected_result + 1;
-	res_lines = DIV_ROUND_UP(selected_result, plane_blocks_per_line);
+	res_blocks = fixed_point_to_u32_round_up(selected_result) + 1;
+	res_lines = DIV_ROUND_UP(selected_result.val,
+					plane_blocks_per_line.val);
 
 	if (level >= 1 && level <= 7) {
 		if (y_tiled) {
-			res_blocks += y_tile_minimum;
+			res_blocks += fixed_point_to_u32_round_up(y_tile_minimum);
 			res_lines += y_min_scanlines;
 		} else {
 			res_blocks++;
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v7 6/8] drm/i915: Add intel_atomic_get_existing_crtc_state function
  2016-12-01 15:49 [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
                   ` (4 preceding siblings ...)
  2016-12-01 15:49 ` [PATCH v7 5/8] drm/i915/skl+: change WM calc to fixed point 16.16 Mahesh Kumar
@ 2016-12-01 15:49 ` Mahesh Kumar
  2016-12-01 15:49 ` [PATCH v7 7/8] drm/i915: Decode system memory bandwidth Mahesh Kumar
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Mahesh Kumar @ 2016-12-01 15:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch Adds a function to extract intel_crtc_state from the
atomic_state, if not available it returns NULL.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_drv.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 3069512..dbee392 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1804,6 +1804,20 @@ intel_atomic_get_crtc_state(struct drm_atomic_state *state,
 	return to_intel_crtc_state(crtc_state);
 }
 
+static inline struct intel_crtc_state *
+intel_atomic_get_existing_crtc_state(struct drm_atomic_state *state,
+				      struct intel_crtc *crtc)
+{
+	struct drm_crtc_state *crtc_state;
+
+	crtc_state = drm_atomic_get_existing_crtc_state(state, &crtc->base);
+
+	if (crtc_state)
+		return to_intel_crtc_state(crtc_state);
+	else
+		return NULL;
+}
+
 static inline struct intel_plane_state *
 intel_atomic_get_existing_plane_state(struct drm_atomic_state *state,
 				      struct intel_plane *plane)
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v7 7/8] drm/i915: Decode system memory bandwidth
  2016-12-01 15:49 [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
                   ` (5 preceding siblings ...)
  2016-12-01 15:49 ` [PATCH v7 6/8] drm/i915: Add intel_atomic_get_existing_crtc_state function Mahesh Kumar
@ 2016-12-01 15:49 ` Mahesh Kumar
  2016-12-08 23:55   ` Paulo Zanoni
  2016-12-01 15:49 ` [PATCH v7 8/8] drm/i915/gen9: WM memory bandwidth related workaround Mahesh Kumar
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Mahesh Kumar @ 2016-12-01 15:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch adds support to decode system memory bandwidth
which will be used for arbitrated display memory percentage
calculation in GEN9 based system.

Changes from v1:
 - Address comments from Paulo
 - implement decode function for SKL/KBL also
Changes from v2:
 - Rewrite the code as per HW team inputs
 - Addresses review comments
Changes from v3:
 - Fix compilation warning

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 173 ++++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h |  12 +++
 drivers/gpu/drm/i915/i915_reg.h |  37 +++++++++
 3 files changed, 222 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 1c689b6..0ac7122 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -979,6 +979,173 @@ static void intel_sanitize_options(struct drm_i915_private *dev_priv)
 	DRM_DEBUG_DRIVER("use GPU sempahores? %s\n", yesno(i915.semaphores));
 }
 
+static inline enum rank skl_memdev_get_channel_rank(uint32_t val)
+{
+	uint8_t l_rank, s_rank;
+	uint8_t l_size, s_size;
+	enum rank ch_rank = DRAM_RANK_SINGLE;
+
+	l_size = (val >> SKL_DRAM_SIZE_L_SHIFT) & SKL_DRAM_SIZE_MASK;
+	s_size = (val >> SKL_DRAM_SIZE_S_SHIFT) & SKL_DRAM_SIZE_MASK;
+	l_rank = (val >> SKL_DRAM_RANK_L_SHIFT) & SKL_DRAM_RANK_MASK;
+	s_rank = (val >> SKL_DRAM_RANK_S_SHIFT) & SKL_DRAM_RANK_MASK;
+
+	/*
+	 * If any of the slot has dual rank memory consider
+	 * dual rank memory channel
+	 */
+	if (l_rank == SKL_DRAM_RANK_DUAL || s_rank == SKL_DRAM_RANK_DUAL)
+		ch_rank = DRAM_RANK_DUAL;
+
+	/*
+	 * If both the slot has single rank memory then configuration
+	 * is dual rank memory
+	 */
+	if ((l_size && l_rank == SKL_DRAM_RANK_SINGLE) &&
+		(s_size && s_rank == SKL_DRAM_RANK_SINGLE))
+		ch_rank = DRAM_RANK_DUAL;
+	return ch_rank;
+}
+
+static int
+skl_get_memdev_info(struct drm_i915_private *dev_priv)
+{
+	struct memdev_info *memdev_info = &dev_priv->memdev_info;
+	uint32_t mem_freq_khz;
+	uint32_t val;
+	enum rank ch0_rank = DRAM_RANK_INVALID, ch1_rank = DRAM_RANK_INVALID;
+
+	val = I915_READ(SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
+	mem_freq_khz = (val & SKL_REQ_DATA_MASK) *
+				SKL_MEMORY_FREQ_MULTIPLIER_KHZ;
+
+	val = I915_READ(SKL_MAD_DIMM_CH0_0_0_0_MCHBAR_MCMAIN);
+	if (val != 0x0) {
+		memdev_info->num_channels++;
+		ch0_rank = skl_memdev_get_channel_rank(val);
+	}
+
+	val = I915_READ(SKL_MAD_DIMM_CH1_0_0_0_MCHBAR_MCMAIN);
+	if (val != 0x0) {
+		memdev_info->num_channels++;
+		ch1_rank = skl_memdev_get_channel_rank(val);
+	}
+
+	if (memdev_info->num_channels == 0) {
+		DRM_ERROR("Number of mem channels are zero\n");
+		return -EINVAL;
+	}
+
+	memdev_info->bandwidth_kbps = (memdev_info->num_channels *
+							mem_freq_khz * 8);
+
+	if (memdev_info->bandwidth_kbps == 0) {
+		DRM_ERROR("Couldn't get system memory bandwidth\n");
+		return -EINVAL;
+	}
+	memdev_info->valid = true;
+
+	/*
+	 * If any of channel is single rank channel,
+	 * consider single rank memory
+	 */
+	if (ch0_rank == DRAM_RANK_SINGLE || ch1_rank == DRAM_RANK_SINGLE)
+		memdev_info->rank = DRAM_RANK_SINGLE;
+	else
+		memdev_info->rank = max(ch0_rank, ch1_rank);
+
+	return 0;
+}
+
+static int
+bxt_get_memdev_info(struct drm_i915_private *dev_priv)
+{
+	struct memdev_info *memdev_info = &dev_priv->memdev_info;
+	uint32_t dram_channels;
+	uint32_t mem_freq_khz, val;
+	uint8_t num_active_channels;
+	int i;
+
+	val = I915_READ(BXT_P_CR_MC_BIOS_REQ_0_0_0);
+	mem_freq_khz = ((val & BXT_REQ_DATA_MASK) *
+				BXT_MEMORY_FREQ_MULTIPLIER_KHZ);
+
+	dram_channels = (val >> BXT_DRAM_CHANNEL_ACTIVE_SHIFT) &
+					BXT_DRAM_CHANNEL_ACTIVE_MASK;
+	num_active_channels = hweight32(dram_channels);
+
+	memdev_info->bandwidth_kbps = (mem_freq_khz * num_active_channels * 4);
+
+	if (memdev_info->bandwidth_kbps == 0) {
+		DRM_ERROR("Couldn't get system memory bandwidth\n");
+		return -EINVAL;
+	}
+	memdev_info->valid = true;
+
+	/*
+	 * Now read each DUNIT8/9/10/11 to check the rank of each dimms.
+	 */
+	for (i = 0; i < BXT_D_CR_DRP0_DUNIT_MAX; i++) {
+		val = I915_READ(BXT_D_CR_DRP0_DUNIT(i));
+		if (val != 0xFFFFFFFF) {
+			uint8_t rank;
+			enum rank ch_rank;
+
+			memdev_info->num_channels++;
+			rank = val & BXT_DRAM_RANK_MASK;
+			if (rank == BXT_DRAM_RANK_SINGLE)
+				ch_rank = DRAM_RANK_SINGLE;
+			else if (rank == BXT_DRAM_RANK_DUAL)
+				ch_rank = DRAM_RANK_DUAL;
+			else
+				ch_rank = DRAM_RANK_INVALID;
+
+			/*
+			 * If any of channel is having single rank memory
+			 * consider memory as single rank
+			 */
+			if (memdev_info->rank == DRAM_RANK_INVALID)
+				memdev_info->rank = ch_rank;
+			else if (ch_rank == DRAM_RANK_SINGLE)
+				memdev_info->rank = DRAM_RANK_SINGLE;
+		}
+	}
+	return 0;
+}
+
+static void
+intel_get_memdev_info(struct drm_i915_private *dev_priv)
+{
+	struct memdev_info *memdev_info = &dev_priv->memdev_info;
+	int ret;
+
+	memdev_info->valid = false;
+	memdev_info->rank = DRAM_RANK_INVALID;
+	memdev_info->num_channels = 0;
+
+	if (!IS_GEN9(dev_priv))
+		return;
+
+	if (IS_BROXTON(dev_priv))
+		ret = bxt_get_memdev_info(dev_priv);
+	else
+		ret = skl_get_memdev_info(dev_priv);
+	if (ret)
+		return;
+
+	DRM_DEBUG_DRIVER("DRAM bandwidth: %u KBps total-channels: %u\n",
+				memdev_info->bandwidth_kbps,
+				memdev_info->num_channels);
+	if (memdev_info->rank == DRAM_RANK_INVALID)
+		DRM_INFO("Counld not get memory rank info\n");
+	else {
+		DRM_DEBUG_DRIVER("DRAM rank: %s rank\n",
+				(memdev_info->rank == DRAM_RANK_DUAL) ?
+						"dual" : "single");
+	}
+}
+
+
 /**
  * i915_driver_init_hw - setup state requiring device access
  * @dev_priv: device private
@@ -1081,6 +1248,12 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
 			DRM_DEBUG_DRIVER("can't enable MSI");
 	}
 
+	/*
+	 * Fill the memdev structure to get the system raw bandwidth
+	 * This will be used by WM algorithm, to implement GEN9 based WA
+	 */
+	intel_get_memdev_info(dev_priv);
+
 	return 0;
 
 out_ggtt:
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b78dc9a..69213a4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2297,6 +2297,18 @@ struct drm_i915_private {
 		bool distrust_bios_wm;
 	} wm;
 
+	struct memdev_info {
+		bool valid;
+		uint32_t bandwidth_kbps;
+		uint8_t num_channels;
+		enum rank {
+			DRAM_RANK_INVALID = 0,
+			DRAM_RANK_SINGLE,
+			DRAM_RANK_DUAL
+		} rank;
+	} memdev_info;
+
+
 	struct i915_runtime_pm pm;
 
 	struct {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 649319d..e7efdd0 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8015,6 +8015,43 @@ enum {
 #define  DC_STATE_DEBUG_MASK_CORES	(1<<0)
 #define  DC_STATE_DEBUG_MASK_MEMORY_UP	(1<<1)
 
+#define BXT_P_CR_MC_BIOS_REQ_0_0_0	_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x7114)
+#define BXT_REQ_DATA_MASK			0x3F
+#define BXT_DRAM_ACTIVE_CHANNEL_SHIFT		12
+#define BXT_DRAM_ACTIVE_CHANNEL_MASK		0xF
+/*
+ * BIOS programs this field of REQ_DATA [5:0] in integer
+ * multiple of 133333 KHz (133.33MHz)
+ */
+#define	BXT_MEMORY_FREQ_MULTIPLIER_KHZ		133333
+#define BXT_D_CR_DRP0_DUNIT8			0x1000
+#define BXT_D_CR_DRP0_DUNIT9			0x1200
+#define BXT_D_CR_DRP0_DUNIT_MAX			4
+#define _MMIO_MCHBAR_DUNIT(x, a, b) _MMIO(MCHBAR_MIRROR_BASE_SNB + (a) + (x)*((b)-(a)))
+#define BXT_D_CR_DRP0_DUNIT(x)	_MMIO_MCHBAR_DUNIT(x, BXT_D_CR_DRP0_DUNIT8, BXT_D_CR_DRP0_DUNIT9)
+#define BXT_DRAM_CHANNEL_ACTIVE_SHIFT		12
+#define BXT_DRAM_CHANNEL_ACTIVE_MASK		0xF
+#define BXT_DRAM_RANK_MASK			0x3
+#define BXT_DRAM_RANK_SINGLE			0x1
+#define BXT_DRAM_RANK_DUAL			0x3
+
+/*
+ * SKL memory frequeny multiplier is 266667 KHz (266.67 MHz)
+ */
+#define	SKL_MEMORY_FREQ_MULTIPLIER_KHZ		266667
+#define SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU	_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5E04)
+#define SKL_REQ_DATA_MASK			(0xF << 0)
+#define SKL_MAD_DIMM_CH0_0_0_0_MCHBAR_MCMAIN	_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x500C)
+#define SKL_MAD_DIMM_CH1_0_0_0_MCHBAR_MCMAIN	_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5010)
+#define SKL_DRAM_SIZE_MASK			0x1F
+#define SKL_DRAM_SIZE_L_SHIFT			0
+#define SKL_DRAM_SIZE_S_SHIFT			16
+#define SKL_DRAM_RANK_MASK			0x1
+#define SKL_DRAM_RANK_L_SHIFT			10
+#define SKL_DRAM_RANK_S_SHIFT			26
+#define SKL_DRAM_RANK_SINGLE			0x0
+#define SKL_DRAM_RANK_DUAL			0x1
+
 /* Please see hsw_read_dcomp() and hsw_write_dcomp() before using this register,
  * since on HSW we can't write to it using I915_WRITE. */
 #define D_COMP_HSW			_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5F0C)
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v7 8/8] drm/i915/gen9: WM memory bandwidth related workaround
  2016-12-01 15:49 [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
                   ` (6 preceding siblings ...)
  2016-12-01 15:49 ` [PATCH v7 7/8] drm/i915: Decode system memory bandwidth Mahesh Kumar
@ 2016-12-01 15:49 ` Mahesh Kumar
  2016-12-15 17:07   ` Paulo Zanoni
  2016-12-01 16:15 ` ✓ Fi.CI.BAT: success for GEN-9 Arbitrated Bandwidth WM WA's & IPC (rev3) Patchwork
  2016-12-07 19:35 ` [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Paulo Zanoni
  9 siblings, 1 reply; 18+ messages in thread
From: Mahesh Kumar @ 2016-12-01 15:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch implemnets Workariunds related to display arbitrated memory
bandwidth. These WA are applicabe for all gen-9 based platforms.

Changes since v1:
 - Rebase on top of Paulo's patch series
Changes since v2:
 - Address review comments
 - Rebase/rework as per other patch changes in series
Changes since v3:
 - Rework based on Maarten's comments

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h      |  11 +++
 drivers/gpu/drm/i915/intel_display.c |  24 ++++++
 drivers/gpu/drm/i915/intel_pm.c      | 155 +++++++++++++++++++++++++++++++++--
 3 files changed, 181 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 69213a4..3126259 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1216,6 +1216,13 @@ enum intel_sbi_destination {
 	SBI_MPHY,
 };
 
+/* SKL+ Watermark arbitrated display bandwidth Workarounds */
+enum watermark_memory_wa {
+	WATERMARK_WA_NONE,
+	WATERMARK_WA_X_TILED,
+	WATERMARK_WA_Y_TILED,
+};
+
 #define QUIRK_PIPEA_FORCE (1<<0)
 #define QUIRK_LVDS_SSC_DISABLE (1<<1)
 #define QUIRK_INVERT_BRIGHTNESS (1<<2)
@@ -1788,6 +1795,10 @@ struct skl_ddb_allocation {
 
 struct skl_wm_values {
 	unsigned dirty_pipes;
+	/* any WaterMark memory workaround Required */
+	enum watermark_memory_wa mem_wa;
+	uint32_t pipe_bw_kbps[I915_MAX_PIPES];
+	bool pipe_ytiled[I915_MAX_PIPES];
 	struct skl_ddb_allocation ddb;
 };
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 5d11002..f8da488 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -14574,6 +14574,28 @@ static void intel_atomic_track_fbs(struct drm_atomic_state *state)
 				  to_intel_plane(plane)->frontbuffer_bit);
 }
 
+static void
+intel_update_wm_bw_wa(struct drm_atomic_state *state)
+{
+	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	struct drm_i915_private *dev_priv = to_i915(state->dev);
+	const struct drm_crtc *crtc;
+	const struct drm_crtc_state *cstate;
+	struct skl_wm_values *results = &intel_state->wm_results;
+	struct skl_wm_values *hw_vals = &dev_priv->wm.skl_hw;
+	int i;
+
+	if (!IS_GEN9(dev_priv))
+		return;
+
+	for_each_crtc_in_state(state, crtc, cstate, i) {
+		hw_vals->pipe_bw_kbps[i] = results->pipe_bw_kbps[i];
+		hw_vals->pipe_ytiled[i] = results->pipe_ytiled[i];
+	}
+
+	hw_vals->mem_wa = results->mem_wa;
+}
+
 /**
  * intel_atomic_commit - commit validated state object
  * @dev: DRM device
@@ -14614,6 +14636,8 @@ static int intel_atomic_commit(struct drm_device *dev,
 	intel_shared_dpll_commit(state);
 	intel_atomic_track_fbs(state);
 
+	intel_update_wm_bw_wa(state);
+
 	if (intel_state->modeset) {
 		memcpy(dev_priv->min_pixclk, intel_state->min_pixclk,
 		       sizeof(intel_state->min_pixclk));
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index cc8fc84..fda6e1e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2878,11 +2878,7 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
 
 #define SKL_SAGV_BLOCK_TIME	30 /* µs */
 
-/*
- * FIXME: We still don't have the proper code detect if we need to apply the WA,
- * so assume we'll always need it in order to avoid underruns.
- */
-static bool skl_needs_memory_bw_wa(struct intel_atomic_state *state)
+static bool intel_needs_memory_bw_wa(struct intel_atomic_state *state)
 {
 	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
 
@@ -3056,7 +3052,7 @@ bool intel_can_enable_sagv(struct drm_atomic_state *state)
 
 		latency = dev_priv->wm.skl_latency[level];
 
-		if (skl_needs_memory_bw_wa(intel_state) &&
+		if (intel_needs_memory_bw_wa(intel_state) &&
 		    plane->base.state->fb->modifier ==
 		    I915_FORMAT_MOD_X_TILED)
 			latency += 15;
@@ -3584,7 +3580,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	uint32_t y_min_scanlines;
 	struct intel_atomic_state *state =
 		to_intel_atomic_state(cstate->base.state);
-	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
+	enum watermark_memory_wa mem_wa;
 	bool y_tiled, x_tiled;
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
@@ -3600,7 +3596,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	if (IS_KABYLAKE(dev_priv) && dev_priv->ipc_enabled)
 		latency += 4;
 
-	if (apply_memory_bw_wa && x_tiled)
+	mem_wa = state->wm_results.mem_wa;
+	if (mem_wa != WATERMARK_WA_NONE && x_tiled)
 		latency += 15;
 
 	width = drm_rect_width(&intel_pstate->base.src) >> 16;
@@ -3635,7 +3632,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		y_min_scanlines = 4;
 	}
 
-	if (apply_memory_bw_wa)
+	if (mem_wa == WATERMARK_WA_Y_TILED)
 		y_min_scanlines *= 2;
 
 	plane_bytes_per_line = width * cpp;
@@ -4063,6 +4060,15 @@ skl_compute_ddb(struct drm_atomic_state *state)
 	}
 
 	/*
+	 * If Watermark workaround is changed we need to recalculate
+	 * watermark values for all active pipes
+	 */
+	if (intel_state->wm_results.mem_wa != dev_priv->wm.skl_hw.mem_wa) {
+		realloc_pipes = ~0;
+		intel_state->wm_results.dirty_pipes = ~0;
+	}
+
+	/*
 	 * We're not recomputing for the pipes not included in the commit, so
 	 * make sure we start with the current state.
 	 */
@@ -4087,6 +4093,133 @@ skl_compute_ddb(struct drm_atomic_state *state)
 	return 0;
 }
 
+static int
+skl_compute_memory_bandwidth_wm_wa(struct drm_atomic_state *state)
+{
+	struct drm_device *dev = state->dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	struct memdev_info *memdev_info = &dev_priv->memdev_info;
+	struct skl_wm_values *results = &intel_state->wm_results;
+	struct drm_crtc *crtc;
+	struct drm_crtc_state *cstate;
+	int active_pipes = 0;
+	uint32_t max_pipe_bw_kbps = 0, total_pipe_bw_kbps;
+	int display_bw_percentage;
+	bool y_tile_enabled = false;
+	int ret, i;
+
+	if (!intel_needs_memory_bw_wa(intel_state)) {
+		results->mem_wa = WATERMARK_WA_NONE;
+		return 0;
+	}
+
+	if (!memdev_info->valid)
+		goto exit;
+
+	ret = drm_modeset_lock(&dev->mode_config.connection_mutex,
+			state->acquire_ctx);
+	if (ret)
+		return ret;
+
+	memcpy(results->pipe_bw_kbps, dev_priv->wm.skl_hw.pipe_bw_kbps,
+			sizeof(results->pipe_bw_kbps));
+	memcpy(results->pipe_ytiled, dev_priv->wm.skl_hw.pipe_ytiled,
+			sizeof(results->pipe_ytiled));
+
+	for_each_crtc_in_state(state, crtc, cstate, i) {
+		struct drm_plane *plane;
+		const struct drm_plane_state *pstate;
+		int active_planes = 0;
+		uint32_t max_plane_bw_kbps = 0;
+
+		results->pipe_ytiled[i] = false;
+
+		if (!cstate->active) {
+			results->pipe_bw_kbps[i] = 0;
+			continue;
+		}
+
+		drm_atomic_crtc_state_for_each_plane_state(plane, pstate,
+								cstate) {
+			struct drm_framebuffer *fb;
+			uint32_t plane_bw_kbps;
+			enum plane_id id = to_intel_plane(plane)->id;
+
+			if (!pstate->visible)
+				continue;
+
+			if (WARN_ON(!pstate->fb))
+				continue;
+
+			if (id == PLANE_CURSOR)
+				continue;
+
+			fb = pstate->fb;
+			if ((fb->modifier == I915_FORMAT_MOD_Y_TILED ||
+				fb->modifier == I915_FORMAT_MOD_Yf_TILED))
+				results->pipe_ytiled[i] = true;
+
+			plane_bw_kbps = skl_adjusted_plane_pixel_rate(
+						to_intel_crtc_state(cstate),
+						to_intel_plane_state(pstate));
+			max_plane_bw_kbps = max(plane_bw_kbps,
+							max_plane_bw_kbps);
+			active_planes++;
+		}
+		results->pipe_bw_kbps[i] = max_plane_bw_kbps * active_planes;
+	}
+
+	for_each_pipe(dev_priv, i) {
+		if (results->pipe_bw_kbps[i]) {
+			max_pipe_bw_kbps = max(max_pipe_bw_kbps,
+					results->pipe_bw_kbps[i]);
+			active_pipes++;
+		}
+		if (results->pipe_ytiled[i])
+			y_tile_enabled = true;
+	}
+
+	total_pipe_bw_kbps = max_pipe_bw_kbps * active_pipes;
+	display_bw_percentage = DIV_ROUND_UP_ULL(total_pipe_bw_kbps * 100,
+						memdev_info->bandwidth_kbps);
+
+	/*
+	 * If there is any Ytile plane enabled and arbitrated display
+	 * bandwidth > 20% of raw system memory bandwidth
+	 * Enale Y-tile related WA
+	 *
+	 * If memory is dual channel single rank, Xtile limit = 35%, else Xtile
+	 * limit = 60%
+	 * If there is no Ytile plane enabled and
+	 * arbitrated display bandwidth > Xtile limit
+	 * Enable X-tile realated WA
+	 */
+	if (y_tile_enabled && (display_bw_percentage > 20))
+		results->mem_wa = WATERMARK_WA_Y_TILED;
+	else {
+		int x_tile_percentage = 60;
+		enum rank rank = DRAM_RANK_SINGLE;
+
+		if (memdev_info->rank != DRAM_RANK_INVALID)
+			rank = memdev_info->rank;
+
+		if ((rank == DRAM_RANK_SINGLE) &&
+					(memdev_info->num_channels == 2))
+			x_tile_percentage = 35;
+
+		if (display_bw_percentage > x_tile_percentage)
+			results->mem_wa = WATERMARK_WA_X_TILED;
+	}
+
+	return 0;
+
+exit:
+	results->mem_wa = WATERMARK_WA_Y_TILED;
+	return 0;
+}
+
+
 static void
 skl_copy_wm_for_pipe(struct skl_wm_values *dst,
 		     struct skl_wm_values *src,
@@ -4162,6 +4295,10 @@ skl_compute_wm(struct drm_atomic_state *state)
 	/* Clear all dirty flags */
 	results->dirty_pipes = 0;
 
+	ret = skl_compute_memory_bandwidth_wm_wa(state);
+	if (ret)
+		return ret;
+
 	ret = skl_compute_ddb(state);
 	if (ret)
 		return ret;
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* ✓ Fi.CI.BAT: success for GEN-9 Arbitrated Bandwidth WM WA's & IPC (rev3)
  2016-12-01 15:49 [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
                   ` (7 preceding siblings ...)
  2016-12-01 15:49 ` [PATCH v7 8/8] drm/i915/gen9: WM memory bandwidth related workaround Mahesh Kumar
@ 2016-12-01 16:15 ` Patchwork
  2016-12-07 19:35 ` [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Paulo Zanoni
  9 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2016-12-01 16:15 UTC (permalink / raw)
  To: Kumar, Mahesh; +Cc: intel-gfx

== Series Details ==

Series: GEN-9 Arbitrated Bandwidth WM WA's & IPC (rev3)
URL   : https://patchwork.freedesktop.org/series/15562/
State : success

== Summary ==

Series 15562v3 GEN-9 Arbitrated Bandwidth WM WA's & IPC
https://patchwork.freedesktop.org/api/1.0/series/15562/revisions/3/mbox/


fi-bdw-5557u     total:245  pass:226  dwarn:0   dfail:0   fail:4   skip:15 
fi-bsw-n3050     total:245  pass:203  dwarn:0   dfail:0   fail:2   skip:40 
fi-bxt-t5700     total:245  pass:213  dwarn:0   dfail:0   fail:4   skip:28 
fi-byt-j1900     total:245  pass:214  dwarn:0   dfail:0   fail:3   skip:28 
fi-byt-n2820     total:245  pass:210  dwarn:0   dfail:0   fail:3   skip:32 
fi-hsw-4770      total:245  pass:221  dwarn:0   dfail:0   fail:4   skip:20 
fi-hsw-4770r     total:245  pass:221  dwarn:0   dfail:0   fail:4   skip:20 
fi-ilk-650       total:245  pass:189  dwarn:0   dfail:0   fail:3   skip:53 
fi-ivb-3520m     total:245  pass:219  dwarn:0   dfail:0   fail:4   skip:22 
fi-ivb-3770      total:245  pass:219  dwarn:0   dfail:0   fail:4   skip:22 
fi-kbl-7500u     total:245  pass:219  dwarn:0   dfail:0   fail:4   skip:22 
fi-skl-6260u     total:245  pass:227  dwarn:0   dfail:0   fail:4   skip:14 
fi-skl-6700hq    total:245  pass:220  dwarn:0   dfail:0   fail:4   skip:21 
fi-skl-6700k     total:245  pass:219  dwarn:1   dfail:0   fail:4   skip:21 
fi-skl-6770hq    total:245  pass:227  dwarn:0   dfail:0   fail:4   skip:14 
fi-snb-2520m     total:245  pass:210  dwarn:0   dfail:0   fail:3   skip:32 
fi-snb-2600      total:245  pass:209  dwarn:0   dfail:0   fail:3   skip:33 

15f4397c9b50b0212dcb99fb2b5b8a3867133e81 drm-tip: 2016y-12m-01d-13h-26m-26s UTC integration manifest
b0ac9a9 drm/i915/gen9: WM memory bandwidth related workaround
f84d26c drm/i915: Decode system memory bandwidth
338d933 drm/i915: Add intel_atomic_get_existing_crtc_state function
2b11985b drm/i915/skl+: change WM calc to fixed point 16.16
8e23bea drm/i915/bxt: Enable IPC support
fcc35ec drm/i915/kbl: IPC workaround for kabylake
31150d7 drm/i915/bxt: IPC WA for Broxton
c11e0c0 drm/i915/skl: Add variables to check x_tile and y_tile

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3163/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC
  2016-12-01 15:49 [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
                   ` (8 preceding siblings ...)
  2016-12-01 16:15 ` ✓ Fi.CI.BAT: success for GEN-9 Arbitrated Bandwidth WM WA's & IPC (rev3) Patchwork
@ 2016-12-07 19:35 ` Paulo Zanoni
  2016-12-08 16:00   ` Daniel Vetter
  9 siblings, 1 reply; 18+ messages in thread
From: Paulo Zanoni @ 2016-12-07 19:35 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx; +Cc: maarten.lankhorst

Em Qui, 2016-12-01 às 21:19 +0530, Mahesh Kumar escreveu:
> This series implements following set of functionality
> 	Implement IPC WA's for Broxton/KBL
> 	Enable IPC in supported platforms
> 	Convert WM calculation to fixed point calculation
> 	Calculation of System memory Bandwidth for SKL/KBL/BXT
> 	Implementation of Arbitrated memory Bandwidth related WM WA's

Pushed patches 1, 2, 3, 5 and 6, with small bikesheds applied. Thanks
for the patches!

(actually I had a little problem with dim, so right now only dinq has
the patches, drm-tip doesn't, soon they will appear)

Let's resume the discussion on the remaining patches. For a possible
next series, make sure to rebase and send only what's not applied.

> 
> 
> Mahesh Kumar (8):
>   drm/i915/skl: Add variables to check x_tile and y_tile
>   drm/i915/bxt: IPC WA for Broxton
>   drm/i915/kbl: IPC workaround for kabylake
>   drm/i915/bxt: Enable IPC support
>   drm/i915/skl+: change WM calc to fixed point 16.16
>   drm/i915: Add intel_atomic_get_existing_crtc_state function
>   drm/i915: Decode system memory bandwidth
>   drm/i915/gen9: WM memory bandwidth related workaround
> 
>  drivers/gpu/drm/i915/i915_drv.c      | 175 ++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_drv.h      | 109 ++++++++++++++
>  drivers/gpu/drm/i915/i915_reg.h      |  38 +++++
>  drivers/gpu/drm/i915/intel_display.c |  24 ++++
>  drivers/gpu/drm/i915/intel_drv.h     |  15 ++
>  drivers/gpu/drm/i915/intel_pm.c      | 272
> +++++++++++++++++++++++++++++------
>  6 files changed, 586 insertions(+), 47 deletions(-)
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC
  2016-12-07 19:35 ` [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Paulo Zanoni
@ 2016-12-08 16:00   ` Daniel Vetter
  2016-12-08 16:12     ` Zanoni, Paulo R
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Vetter @ 2016-12-08 16:00 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, maarten.lankhorst

On Wed, Dec 07, 2016 at 05:35:09PM -0200, Paulo Zanoni wrote:
> Em Qui, 2016-12-01 às 21:19 +0530, Mahesh Kumar escreveu:
> > This series implements following set of functionality
> > 	Implement IPC WA's for Broxton/KBL
> > 	Enable IPC in supported platforms
> > 	Convert WM calculation to fixed point calculation
> > 	Calculation of System memory Bandwidth for SKL/KBL/BXT
> > 	Implementation of Arbitrated memory Bandwidth related WM WA's
> 
> Pushed patches 1, 2, 3, 5 and 6, with small bikesheds applied. Thanks
> for the patches!
> 
> (actually I had a little problem with dim, so right now only dinq has
> the patches, drm-tip doesn't, soon they will appear)

Hm, what did go boom with dim? If the pushing worked, then you can re-run
just the drm-tip rebuilding with

$ dim rebuild-tip

Might also be worth it to upgrade to latest dim, we've dropped a few
bugfixes in the branch push/pull logic in a few places since the big
conversion, you might hit them.

If all this doesn't help, pls ping me on irc with a pastebin of what's
going on. For testing you can always run

$ dim pq

To repush dinq - dim will complain if you'll do a non-ff push and abort,
so it's safe.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC
  2016-12-08 16:00   ` Daniel Vetter
@ 2016-12-08 16:12     ` Zanoni, Paulo R
  2016-12-08 16:18       ` Daniel Vetter
  0 siblings, 1 reply; 18+ messages in thread
From: Zanoni, Paulo R @ 2016-12-08 16:12 UTC (permalink / raw)
  To: daniel@ffwll.ch; +Cc: intel-gfx@lists.freedesktop.org, Lankhorst, Maarten

Em Qui, 2016-12-08 às 17:00 +0100, Daniel Vetter escreveu:
> On Wed, Dec 07, 2016 at 05:35:09PM -0200, Paulo Zanoni wrote:
> > 
> > Em Qui, 2016-12-01 às 21:19 +0530, Mahesh Kumar escreveu:
> > > 
> > > This series implements following set of functionality
> > > 	Implement IPC WA's for Broxton/KBL
> > > 	Enable IPC in supported platforms
> > > 	Convert WM calculation to fixed point calculation
> > > 	Calculation of System memory Bandwidth for SKL/KBL/BXT
> > > 	Implementation of Arbitrated memory Bandwidth related WM WA's
> > 
> > Pushed patches 1, 2, 3, 5 and 6, with small bikesheds applied.
> > Thanks
> > for the patches!
> > 
> > (actually I had a little problem with dim, so right now only dinq
> > has
> > the patches, drm-tip doesn't, soon they will appear)
> 
> Hm, what did go boom with dim?

This was my first commit/push since the change. I had followed the
instructions on your email (including the dim setup stage), and then
when I did "dim push-queued" it complained that I didn't have the audio
remotes in my tree. After the error message I was not sure what was the
correct way to proceed.

I suppose we could change dim setup so that it really adds every single
remote that's needed? Also, why doesn't it just add the remotes itself?

>  If the pushing worked, then you can re-run
> just the drm-tip rebuilding with
> 
> $ dim rebuild-tip

Good to know, thanks for that.

> 
> Might also be worth it to upgrade to latest dim, we've dropped a few
> bugfixes in the branch push/pull logic in a few places since the big
> conversion, you might hit them.

I always upgrade to the latest version before using it.

> 
> If all this doesn't help, pls ping me on irc with a pastebin of
> what's
> going on. For testing you can always run

I sent you an email yesterday with the error message pasted... Maybe
you'll find it soon :).


> 
> $ dim pq
> 
> To repush dinq - dim will complain if you'll do a non-ff push and
> abort,
> so it's safe.
> -Daniel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC
  2016-12-08 16:12     ` Zanoni, Paulo R
@ 2016-12-08 16:18       ` Daniel Vetter
  0 siblings, 0 replies; 18+ messages in thread
From: Daniel Vetter @ 2016-12-08 16:18 UTC (permalink / raw)
  To: Zanoni, Paulo R; +Cc: intel-gfx@lists.freedesktop.org, Lankhorst, Maarten

On Thu, Dec 08, 2016 at 04:12:37PM +0000, Zanoni, Paulo R wrote:
> Em Qui, 2016-12-08 às 17:00 +0100, Daniel Vetter escreveu:
> > On Wed, Dec 07, 2016 at 05:35:09PM -0200, Paulo Zanoni wrote:
> > > 
> > > Em Qui, 2016-12-01 às 21:19 +0530, Mahesh Kumar escreveu:
> > > > 
> > > > This series implements following set of functionality
> > > > 	Implement IPC WA's for Broxton/KBL
> > > > 	Enable IPC in supported platforms
> > > > 	Convert WM calculation to fixed point calculation
> > > > 	Calculation of System memory Bandwidth for SKL/KBL/BXT
> > > > 	Implementation of Arbitrated memory Bandwidth related WM WA's
> > > 
> > > Pushed patches 1, 2, 3, 5 and 6, with small bikesheds applied.
> > > Thanks
> > > for the patches!
> > > 
> > > (actually I had a little problem with dim, so right now only dinq
> > > has
> > > the patches, drm-tip doesn't, soon they will appear)
> > 
> > Hm, what did go boom with dim?
> 
> This was my first commit/push since the change. I had followed the
> instructions on your email (including the dim setup stage), and then
> when I did "dim push-queued" it complained that I didn't have the audio
> remotes in my tree. After the error message I was not sure what was the
> correct way to proceed.
> 
> I suppose we could change dim setup so that it really adds every single
> remote that's needed? Also, why doesn't it just add the remotes itself?

dim doesn't require that you have a separate kernel repo for it, so for
creating somewhat invasive stuff like naming remotes it defers to the
user. It should print the exact git cmdline you need to run, only thing to
fill in would be the name you want to give that remote.

I guess we could switch over to just auto-creating remotes. If you want to
do that, the function you need to change is url_to_remote. Instead of the
error output simply create the remote, and then set the local remote
variable to the remote name. I'd be happy to ack such a patch for dim,
since you're not the only one who got confused by this. Seems better to
just create the remote ...

> >  If the pushing worked, then you can re-run
> > just the drm-tip rebuilding with
> > 
> > $ dim rebuild-tip
> 
> Good to know, thanks for that.
> 
> > 
> > Might also be worth it to upgrade to latest dim, we've dropped a few
> > bugfixes in the branch push/pull logic in a few places since the big
> > conversion, you might hit them.
> 
> I always upgrade to the latest version before using it.
> 
> > 
> > If all this doesn't help, pls ping me on irc with a pastebin of
> > what's
> > going on. For testing you can always run
> 
> I sent you an email yesterday with the error message pasted... Maybe
> you'll find it soon :).

I random walk through my various inboxes ;-)
-Daniel

> 
> 
> > 
> > $ dim pq
> > 
> > To repush dinq - dim will complain if you'll do a non-ff push and
> > abort,
> > so it's safe.
> > -Daniel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v7 7/8] drm/i915: Decode system memory bandwidth
  2016-12-01 15:49 ` [PATCH v7 7/8] drm/i915: Decode system memory bandwidth Mahesh Kumar
@ 2016-12-08 23:55   ` Paulo Zanoni
  2017-02-15 15:00     ` Mahesh Kumar
  0 siblings, 1 reply; 18+ messages in thread
From: Paulo Zanoni @ 2016-12-08 23:55 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx; +Cc: maarten.lankhorst

Em Qui, 2016-12-01 às 21:19 +0530, Mahesh Kumar escreveu:
> This patch adds support to decode system memory bandwidth
> which will be used for arbitrated display memory percentage
> calculation in GEN9 based system.
> 
> Changes from v1:
>  - Address comments from Paulo
>  - implement decode function for SKL/KBL also
> Changes from v2:
>  - Rewrite the code as per HW team inputs
>  - Addresses review comments
> Changes from v3:
>  - Fix compilation warning
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>

As a general comment, indentation is weird on all multi-line
statements. Also, most comments are missing periods and are not 80-
column aligned. And a lot of the comments just say what the code
already does instead of explaining why the code does what it does, so
perhaps we could just remove them. Comments should explain why the code
is written the way it is, not translate from C to English.


> ---
>  drivers/gpu/drm/i915/i915_drv.c | 173
> ++++++++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_drv.h |  12 +++
>  drivers/gpu/drm/i915/i915_reg.h |  37 +++++++++
>  3 files changed, 222 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c
> b/drivers/gpu/drm/i915/i915_drv.c
> index 1c689b6..0ac7122 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -979,6 +979,173 @@ static void intel_sanitize_options(struct
> drm_i915_private *dev_priv)
>  	DRM_DEBUG_DRIVER("use GPU sempahores? %s\n",
> yesno(i915.semaphores));
>  }
>  
> +static inline enum rank skl_memdev_get_channel_rank(uint32_t val)

Why inline? This is already static, leave it up to the compiler to
decide if it's better to inline or not.


> +{
> +	uint8_t l_rank, s_rank;
> +	uint8_t l_size, s_size;
> +	enum rank ch_rank = DRAM_RANK_SINGLE;

Optional suggestion: get rid of this variable, just return the
appropriate values once we discover them.


> +
> +	l_size = (val >> SKL_DRAM_SIZE_L_SHIFT) &
> SKL_DRAM_SIZE_MASK;
> +	s_size = (val >> SKL_DRAM_SIZE_S_SHIFT) &
> SKL_DRAM_SIZE_MASK;
> +	l_rank = (val >> SKL_DRAM_RANK_L_SHIFT) &
> SKL_DRAM_RANK_MASK;
> +	s_rank = (val >> SKL_DRAM_RANK_S_SHIFT) &
> SKL_DRAM_RANK_MASK;

Validate our assumptions:

WARN_ON(l_size == 0 && s_size == 0);

Or we could even do the appropriate check and return DRAM_RANK_INVALID,
but then we'd have to restructure how our caller calls us, maybe moving
some code to this function.


> +
> +	/*
> +	 * If any of the slot has dual rank memory consider
> +	 * dual rank memory channel
> +	 */

The comment says just what the code already says. What would be
interesting to see in the comment is an explanation of why we do it the
way we do.


> +	if (l_rank == SKL_DRAM_RANK_DUAL || s_rank ==
> SKL_DRAM_RANK_DUAL)
> +		ch_rank = DRAM_RANK_DUAL;
> 
> +
> +	/*
> +	 * If both the slot has single rank memory then
> configuration
> +	 * is dual rank memory
> +	 */

The comment says just what the code already says. What would be
interesting to see in the comment is an explanation of why we do it the
way we do.



> +	if ((l_size && l_rank == SKL_DRAM_RANK_SINGLE) &&
> +		(s_size && s_rank == SKL_DRAM_RANK_SINGLE))
> +		ch_rank = DRAM_RANK_DUAL;
> +	return ch_rank;
> +}
> +
> +static int
> +skl_get_memdev_info(struct drm_i915_private *dev_priv)
> +{
> +	struct memdev_info *memdev_info = &dev_priv->memdev_info;
> +	uint32_t mem_freq_khz;
> +	uint32_t val;
> +	enum rank ch0_rank = DRAM_RANK_INVALID, ch1_rank =
> DRAM_RANK_INVALID;
> +
> +	val = I915_READ(SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
> +	mem_freq_khz = (val & SKL_REQ_DATA_MASK) *
> +				SKL_MEMORY_FREQ_MULTIPLIER_KHZ;

Optional suggestion: perhaps extend the memory_freq_multiplier to HZ
and then later cut the last 3 digits so the rounding errors get
restricted to the units we'll cut? Also, if we do the math using the "4
* num / 3" we can try to reduce the rounding errors even more.

I get annoyed that it says my system bandwidth is 25600032, while the
correct number is exactly 25600000.

This applies to the BXT code too.

> +
> +	val = I915_READ(SKL_MAD_DIMM_CH0_0_0_0_MCHBAR_MCMAIN);
> +	if (val != 0x0) {
> +		memdev_info->num_channels++;
> +		ch0_rank = skl_memdev_get_channel_rank(val);
> +	}

Optional suggestion:

Here's how I would have implemented this:

ch0_rank = skl_memdev_get_channel_rank(0);
if (ch0_rank != DRAM_RANK_INVALID)
	memdev_info->num_channels++;

This way we'd move all the logic around this register to
skl_memdev_get_channel_rank(), and we'd also be able to get rid of the
initializations here.

But this is just a suggestion in case you agree with me. Feel free to
leave the code the way it is.


> +
> +	val = I915_READ(SKL_MAD_DIMM_CH1_0_0_0_MCHBAR_MCMAIN);
> +	if (val != 0x0) {
> +		memdev_info->num_channels++;
> +		ch1_rank = skl_memdev_get_channel_rank(val);
> +	}
> +
> +	if (memdev_info->num_channels == 0) {
> +		DRM_ERROR("Number of mem channels are zero\n");

I'm not a native English speaker, but I would suppose that s/are/is/
would make the sentence correct (the number is zero). While at it, why
not s/mem/memory/ too to make the code sound a little more formal?


> +		return -EINVAL;
> +	}
> +
> +	memdev_info->bandwidth_kbps = (memdev_info->num_channels *
> +							mem_freq_khz
> * 8);
> +
> +	if (memdev_info->bandwidth_kbps == 0) {
> +		DRM_ERROR("Couldn't get system memory bandwidth\n");
> +		return -EINVAL;
> +	}
> +	memdev_info->valid = true;

My previous comment about ownership of this variable still applies.


> +
> +	/*
> +	 * If any of channel is single rank channel,

Again, not a native English speaker, but "any of channel" sounds
incorrect to me. But anyway, the comment only says what the code
already says. Why do we do it like this?


> +	 * consider single rank memory
> +	 */
> +	if (ch0_rank == DRAM_RANK_SINGLE || ch1_rank ==
> DRAM_RANK_SINGLE)
> +		memdev_info->rank = DRAM_RANK_SINGLE;
> +	else
> +		memdev_info->rank = max(ch0_rank, ch1_rank);

I can't think of any situation where this wouldn't end with
DRAM_RANK_DUAL, so we might just assign it to the variable.


> +
> +	return 0;
> +}
> +
> +static int
> +bxt_get_memdev_info(struct drm_i915_private *dev_priv)
> +{
> +	struct memdev_info *memdev_info = &dev_priv->memdev_info;
> +	uint32_t dram_channels;
> +	uint32_t mem_freq_khz, val;
> +	uint8_t num_active_channels;
> +	int i;
> +
> +	val = I915_READ(BXT_P_CR_MC_BIOS_REQ_0_0_0);
> +	mem_freq_khz = ((val & BXT_REQ_DATA_MASK) *
> +				BXT_MEMORY_FREQ_MULTIPLIER_KHZ);
> +
> +	dram_channels = (val >> BXT_DRAM_CHANNEL_ACTIVE_SHIFT) &
> +					BXT_DRAM_CHANNEL_ACTIVE_MASK
> ;
> +	num_active_channels = hweight32(dram_channels);
> +
> +	memdev_info->bandwidth_kbps = (mem_freq_khz *
> num_active_channels * 4);
> +
> +	if (memdev_info->bandwidth_kbps == 0) {
> +		DRM_ERROR("Couldn't get system memory bandwidth\n");
> +		return -EINVAL;
> +	}
> +	memdev_info->valid = true;
> +
> +	/*
> +	 * Now read each DUNIT8/9/10/11 to check the rank of each
> dimms.
> +	 */
> +	for (i = 0; i < BXT_D_CR_DRP0_DUNIT_MAX; i++) {
> +		val = I915_READ(BXT_D_CR_DRP0_DUNIT(i));
> +		if (val != 0xFFFFFFFF) {
> +			uint8_t rank;
> +			enum rank ch_rank;
> +
> +			memdev_info->num_channels++;
> +			rank = val & BXT_DRAM_RANK_MASK;

Shouldn't the code here be: "if one of these two bits are enabled then
we're RANK_SINGLE, if both bits are enabled then we're RANK_DUAL,
otherwise invalid"? If not, why?

In other words: is rank=0x2 really invalid? Isn't it just single rank?


> +			if (rank == BXT_DRAM_RANK_SINGLE)
> +				ch_rank = DRAM_RANK_SINGLE;
> +			else if (rank == BXT_DRAM_RANK_DUAL)
> +				ch_rank = DRAM_RANK_DUAL;
> +			else
> +				ch_rank = DRAM_RANK_INVALID;
> +
> +			/*
> +			 * If any of channel is having single rank
> memory

Again, "any of channel".

> +			 * consider memory as single rank
> +			 */
> +			if (memdev_info->rank == DRAM_RANK_INVALID)
> +				memdev_info->rank = ch_rank;
> +			else if (ch_rank == DRAM_RANK_SINGLE)
> +				memdev_info->rank =
> DRAM_RANK_SINGLE;
> +		}
> +	}

SKL returns -EINVAL if we end up with DRAM_RANK_INVALID in memdev_info-
>rank. I think we could do that in BXT too: either we have all the
information to not apply the workaround, or we give up.


> +	return 0;
> +}
> +
> +static void
> +intel_get_memdev_info(struct drm_i915_private *dev_priv)
> +{
> +	struct memdev_info *memdev_info = &dev_priv->memdev_info;
> +	int ret;
> +
> +	memdev_info->valid = false;
> +	memdev_info->rank = DRAM_RANK_INVALID;
> +	memdev_info->num_channels = 0;
> +
> +	if (!IS_GEN9(dev_priv))
> +		return;
> +
> +	if (IS_BROXTON(dev_priv))
> +		ret = bxt_get_memdev_info(dev_priv);
> +	else
> +		ret = skl_get_memdev_info(dev_priv);
> +	if (ret)
> +		return;
> +
> +	DRM_DEBUG_DRIVER("DRAM bandwidth: %u KBps total-channels:
> %u\n",
> +				memdev_info->bandwidth_kbps,
> +				memdev_info->num_channels);
> +	if (memdev_info->rank == DRAM_RANK_INVALID)
> +		DRM_INFO("Counld not get memory rank info\n");

Spelling mistake here. And since you use "Couldn't" instead of "Could
not" in all the other messages, I'd keep the same pattern.

But if you accept my suggestion above, we'll never print this message
anyway.


> +	else {
> +		DRM_DEBUG_DRIVER("DRAM rank: %s rank\n",
> +				(memdev_info->rank ==
> DRAM_RANK_DUAL) ?
> +						"dual" : "single");
> +	}
> +}
> +
> +

Two white spaces here.


>  /**
>   * i915_driver_init_hw - setup state requiring device access
>   * @dev_priv: device private
> @@ -1081,6 +1248,12 @@ static int i915_driver_init_hw(struct
> drm_i915_private *dev_priv)
>  			DRM_DEBUG_DRIVER("can't enable MSI");
>  	}
>  
> +	/*
> +	 * Fill the memdev structure to get the system raw bandwidth
> +	 * This will be used by WM algorithm, to implement GEN9
> based WA
> +	 */
> +	intel_get_memdev_info(dev_priv);
> +
>  	return 0;
>  
>  out_ggtt:
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h
> index b78dc9a..69213a4 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2297,6 +2297,18 @@ struct drm_i915_private {
>  		bool distrust_bios_wm;
>  	} wm;
>  
> +	struct memdev_info {
> +		bool valid;
> +		uint32_t bandwidth_kbps;
> +		uint8_t num_channels;
> +		enum rank {
> +			DRAM_RANK_INVALID = 0,
> +			DRAM_RANK_SINGLE,
> +			DRAM_RANK_DUAL
> +		} rank;

This was previously an anonymous enum (enum { ... } rank;), but now
it's "enum rank", which is too generic for a .h file. Either use a more
specific name (enum memdev_rank?) or try to make it anonymous again.


> +	} memdev_info;
> +
> +
>  	struct i915_runtime_pm pm;
>  
>  	struct {
> diff --git a/drivers/gpu/drm/i915/i915_reg.h
> b/drivers/gpu/drm/i915/i915_reg.h
> index 649319d..e7efdd0 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -8015,6 +8015,43 @@ enum {
>  #define  DC_STATE_DEBUG_MASK_CORES	(1<<0)
>  #define  DC_STATE_DEBUG_MASK_MEMORY_UP	(1<<1)
>  
> +#define BXT_P_CR_MC_BIOS_REQ_0_0_0	_MMIO(MCHBAR_MIRROR_BASE_S
> NB + 0x7114)
> +#define BXT_REQ_DATA_MASK			0x3F
> +#define BXT_DRAM_ACTIVE_CHANNEL_SHIFT		12
> +#define BXT_DRAM_ACTIVE_CHANNEL_MASK		0xF

We're not using these two definitions anywhere.


> +/*
> + * BIOS programs this field of REQ_DATA [5:0] in integer
> + * multiple of 133333 KHz (133.33MHz)
> + */

Now that we're using decimal on the definition, I don't think this
comment is useful.


> +#define	BXT_MEMORY_FREQ_MULTIPLIER_KHZ		133333

We don't use tabs like that in our definitions.


> +#define BXT_D_CR_DRP0_DUNIT8			0x1000
> +#define BXT_D_CR_DRP0_DUNIT9			0x1200
> +#define BXT_D_CR_DRP0_DUNIT_MAX			4
> +#define _MMIO_MCHBAR_DUNIT(x, a, b) _MMIO(MCHBAR_MIRROR_BASE_SNB +
> (a) + (x)*((b)-(a)))
> +#define BXT_D_CR_DRP0_DUNIT(x)	_MMIO_MCHBAR_DUNIT(x,
> BXT_D_CR_DRP0_DUNIT8, BXT_D_CR_DRP0_DUNIT9)

But then BXT_D_CR_DRP0_DUNIT(1) means DUNIT9 instead of DUNIT1...
That's very unintuitive.


> +#define BXT_DRAM_CHANNEL_ACTIVE_SHIFT		12
> +#define BXT_DRAM_CHANNEL_ACTIVE_MASK		0xF

These two definitions don't belong to the register immediately above.


> +#define BXT_DRAM_RANK_MASK			0x3
> +#define BXT_DRAM_RANK_SINGLE			0x1
> +#define BXT_DRAM_RANK_DUAL			0x3
> +
> +/*
> + * SKL memory frequeny multiplier is 266667 KHz (266.67 MHz)

s/frequeny/frequency/

Also, this is not how we do one-line comments.

And now that the number is in decimal we probably don't even need the
comment here since the definition is a little more obvious.

> + */
> +#define	SKL_MEMORY_FREQ_MULTIPLIER_KHZ		266667

We don't use tabs like that in our definitions.


> +#define SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU	_MMIO(MCHBAR_MIRROR
> _BASE_SNB + 0x5E04)
> +#define SKL_REQ_DATA_MASK			(0xF << 0)

Blank lines separating different registers would be good.


> +#define SKL_MAD_DIMM_CH0_0_0_0_MCHBAR_MCMAIN	_MMIO(MCHBAR_MIR
> ROR_BASE_SNB + 0x500C)
> +#define SKL_MAD_DIMM_CH1_0_0_0_MCHBAR_MCMAIN	_MMIO(MCHBAR_MIR
> ROR_BASE_SNB + 0x5010)
> +#define SKL_DRAM_SIZE_MASK			0x1F

0x3F


> +#define SKL_DRAM_SIZE_L_SHIFT			0
> +#define SKL_DRAM_SIZE_S_SHIFT			16
> +#define SKL_DRAM_RANK_MASK			0x1
> +#define SKL_DRAM_RANK_L_SHIFT			10
> +#define SKL_DRAM_RANK_S_SHIFT			26
> +#define SKL_DRAM_RANK_SINGLE			0x0
> +#define SKL_DRAM_RANK_DUAL			0x1
> +
>  /* Please see hsw_read_dcomp() and hsw_write_dcomp() before using
> this register,
>   * since on HSW we can't write to it using I915_WRITE. */
>  #define D_COMP_HSW			_MMIO(MCHBAR_MIRROR_BASE_S
> NB + 0x5F0C)
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v7 8/8] drm/i915/gen9: WM memory bandwidth related workaround
  2016-12-01 15:49 ` [PATCH v7 8/8] drm/i915/gen9: WM memory bandwidth related workaround Mahesh Kumar
@ 2016-12-15 17:07   ` Paulo Zanoni
  2017-02-15 15:00     ` Mahesh Kumar
  0 siblings, 1 reply; 18+ messages in thread
From: Paulo Zanoni @ 2016-12-15 17:07 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx; +Cc: maarten.lankhorst

Em Qui, 2016-12-01 às 21:19 +0530, Mahesh Kumar escreveu:
> This patch implemnets Workariunds related to display arbitrated
> memory
> bandwidth. These WA are applicabe for all gen-9 based platforms.

3 typos above.

The WA is already implemented. What the patch does is that it opts-out
of the WA in case we conclude it's safe. Please say this in the commit
message.

> 
> Changes since v1:
>  - Rebase on top of Paulo's patch series
> Changes since v2:
>  - Address review comments
>  - Rebase/rework as per other patch changes in series
> Changes since v3:
>  - Rework based on Maarten's comments
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h      |  11 +++
>  drivers/gpu/drm/i915/intel_display.c |  24 ++++++
>  drivers/gpu/drm/i915/intel_pm.c      | 155
> +++++++++++++++++++++++++++++++++--
>  3 files changed, 181 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h
> index 69213a4..3126259 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1216,6 +1216,13 @@ enum intel_sbi_destination {
>  	SBI_MPHY,
>  };
>  
> +/* SKL+ Watermark arbitrated display bandwidth Workarounds */
> +enum watermark_memory_wa {
> +	WATERMARK_WA_NONE,
> +	WATERMARK_WA_X_TILED,
> +	WATERMARK_WA_Y_TILED,
> +};
> +
>  #define QUIRK_PIPEA_FORCE (1<<0)
>  #define QUIRK_LVDS_SSC_DISABLE (1<<1)
>  #define QUIRK_INVERT_BRIGHTNESS (1<<2)
> @@ -1788,6 +1795,10 @@ struct skl_ddb_allocation {
>  
>  struct skl_wm_values {
>  	unsigned dirty_pipes;
> +	/* any WaterMark memory workaround Required */

CapitaliZation is weird Here. Besides, no need for the comment.


> +	enum watermark_memory_wa mem_wa;
> +	uint32_t pipe_bw_kbps[I915_MAX_PIPES];
> +	bool pipe_ytiled[I915_MAX_PIPES];
>  	struct skl_ddb_allocation ddb;
>  };
>  
> diff --git a/drivers/gpu/drm/i915/intel_display.c
> b/drivers/gpu/drm/i915/intel_display.c
> index 5d11002..f8da488 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -14574,6 +14574,28 @@ static void intel_atomic_track_fbs(struct
> drm_atomic_state *state)
>  				  to_intel_plane(plane)-
> >frontbuffer_bit);
>  }
>  
> +static void
> +intel_update_wm_bw_wa(struct drm_atomic_state *state)
> +{
> +	struct intel_atomic_state *intel_state =
> to_intel_atomic_state(state);
> +	struct drm_i915_private *dev_priv = to_i915(state->dev);
> +	const struct drm_crtc *crtc;
> +	const struct drm_crtc_state *cstate;
> +	struct skl_wm_values *results = &intel_state->wm_results;
> +	struct skl_wm_values *hw_vals = &dev_priv->wm.skl_hw;
> +	int i;
> +
> +	if (!IS_GEN9(dev_priv))
> +		return;
> +
> +	for_each_crtc_in_state(state, crtc, cstate, i) {
> +		hw_vals->pipe_bw_kbps[i] = results->pipe_bw_kbps[i];
> +		hw_vals->pipe_ytiled[i] = results->pipe_ytiled[i];
> +	}
> +
> +	hw_vals->mem_wa = results->mem_wa;
> +}

I think we can just patch skl_copy_wm_for_pipe() to also copy the
fields we need instead of adding this function. Whouldn't that be much
simpler? Anyway, this one looks correct, so no need to change if you
don't want.


> +
>  /**
>   * intel_atomic_commit - commit validated state object
>   * @dev: DRM device
> @@ -14614,6 +14636,8 @@ static int intel_atomic_commit(struct
> drm_device *dev,
>  	intel_shared_dpll_commit(state);
>  	intel_atomic_track_fbs(state);
>  
> +	intel_update_wm_bw_wa(state);
> +
>  	if (intel_state->modeset) {
>  		memcpy(dev_priv->min_pixclk, intel_state-
> >min_pixclk,
>  		       sizeof(intel_state->min_pixclk));
> diff --git a/drivers/gpu/drm/i915/intel_pm.c
> b/drivers/gpu/drm/i915/intel_pm.c
> index cc8fc84..fda6e1e 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2878,11 +2878,7 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
>  
>  #define SKL_SAGV_BLOCK_TIME	30 /* µs */
>  
> -/*
> - * FIXME: We still don't have the proper code detect if we need to
> apply the WA,
> - * so assume we'll always need it in order to avoid underruns.
> - */
> -static bool skl_needs_memory_bw_wa(struct intel_atomic_state *state)
> +static bool intel_needs_memory_bw_wa(struct intel_atomic_state
> *state)

Why are we renaming this function?

Also, in my last review I suggested that we kill this function. I don't
think it makes sense to have this function anymore.

Besides, function intel_can_enable_sagv() needs to look at the value we
assigned to enum watermark_memory_wa, not get a true/false based on
platform version.


>  {
>  	struct drm_i915_private *dev_priv = to_i915(state-
> >base.dev);
>  
> @@ -3056,7 +3052,7 @@ bool intel_can_enable_sagv(struct
> drm_atomic_state *state)
>  
>  		latency = dev_priv->wm.skl_latency[level];
>  
> -		if (skl_needs_memory_bw_wa(intel_state) &&
> +		if (intel_needs_memory_bw_wa(intel_state) &&
>  		    plane->base.state->fb->modifier ==
>  		    I915_FORMAT_MOD_X_TILED)
>  			latency += 15;
> @@ -3584,7 +3580,7 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  	uint32_t y_min_scanlines;
>  	struct intel_atomic_state *state =
>  		to_intel_atomic_state(cstate->base.state);
> -	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
> +	enum watermark_memory_wa mem_wa;
>  	bool y_tiled, x_tiled;
>  
>  	if (latency == 0 || !cstate->base.active || !intel_pstate-
> >base.visible) {
> @@ -3600,7 +3596,8 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  	if (IS_KABYLAKE(dev_priv) && dev_priv->ipc_enabled)
>  		latency += 4;
>  
> -	if (apply_memory_bw_wa && x_tiled)
> +	mem_wa = state->wm_results.mem_wa;

This assignment could have been done during declaration.


> +	if (mem_wa != WATERMARK_WA_NONE && x_tiled)
>  		latency += 15;
>  
>  	width = drm_rect_width(&intel_pstate->base.src) >> 16;
> @@ -3635,7 +3632,7 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  		y_min_scanlines = 4;
>  	}
>  
> -	if (apply_memory_bw_wa)
> +	if (mem_wa == WATERMARK_WA_Y_TILED)
>  		y_min_scanlines *= 2;
>  
>  	plane_bytes_per_line = width * cpp;
> @@ -4063,6 +4060,15 @@ skl_compute_ddb(struct drm_atomic_state
> *state)
>  	}
>  
>  	/*
> +	 * If Watermark workaround is changed we need to recalculate
> +	 * watermark values for all active pipes
> +	 */
> +	if (intel_state->wm_results.mem_wa != dev_priv-
> >wm.skl_hw.mem_wa) {
> +		realloc_pipes = ~0;
> +		intel_state->wm_results.dirty_pipes = ~0;
> +	}

But then skl_ddb_add_affected_planes() only actually adds to the commit
the planes that have different DDB partitioning. It seems to me that it
may be possible to have a case where the WA changes and the DDB stays
the same, so we need to find a way to include every CRTC there. Maybe
we could just go and call the function that adds every CRTC here.


> +
> +	/*
>  	 * We're not recomputing for the pipes not included in the
> commit, so
>  	 * make sure we start with the current state.
>  	 */
> @@ -4087,6 +4093,133 @@ skl_compute_ddb(struct drm_atomic_state
> *state)
>  	return 0;
>  }
>  
> +static int
> +skl_compute_memory_bandwidth_wm_wa(struct drm_atomic_state *state)
> +{
> +	struct drm_device *dev = state->dev;
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	struct intel_atomic_state *intel_state =
> to_intel_atomic_state(state);
> +	struct memdev_info *memdev_info = &dev_priv->memdev_info;
> +	struct skl_wm_values *results = &intel_state->wm_results;
> +	struct drm_crtc *crtc;
> +	struct drm_crtc_state *cstate;
> +	int active_pipes = 0;
> +	uint32_t max_pipe_bw_kbps = 0, total_pipe_bw_kbps;
> +	int display_bw_percentage;
> +	bool y_tile_enabled = false;
> +	int ret, i;
> +
> +	if (!intel_needs_memory_bw_wa(intel_state)) {
> +		results->mem_wa = WATERMARK_WA_NONE;
> +		return 0;
> +	}
> +
> +	if (!memdev_info->valid)
> +		goto exit;

There's no reason to use a label if it's only called here. Just move
all that code to the if statement.


> +
> +	ret = drm_modeset_lock(&dev->mode_config.connection_mutex,
> +			state->acquire_ctx);
> +	if (ret)
> +		return ret;

Why are we getting this here? Why this lock specifically? What's it
protecting?

Doc says:
- ww mutex protecting connector state and routing
- protects connector->encoder and encoder->crtc links 

We're not touching any of that here.


> +
> +	memcpy(results->pipe_bw_kbps, dev_priv-
> >wm.skl_hw.pipe_bw_kbps,
> +			sizeof(results->pipe_bw_kbps));
> +	memcpy(results->pipe_ytiled, dev_priv-
> >wm.skl_hw.pipe_ytiled,
> +			sizeof(results->pipe_ytiled));
> +
> +	for_each_crtc_in_state(state, crtc, cstate, i) {
> +		struct drm_plane *plane;
> +		const struct drm_plane_state *pstate;
> +		int active_planes = 0;
> +		uint32_t max_plane_bw_kbps = 0;
> +
> +		results->pipe_ytiled[i] = false;
> +
> +		if (!cstate->active) {
> +			results->pipe_bw_kbps[i] = 0;
> +			continue;
> +		}
> +
> +		drm_atomic_crtc_state_for_each_plane_state(plane,
> pstate,
> +								csta
> te) {
> +			struct drm_framebuffer *fb;
> +			uint32_t plane_bw_kbps;
> +			enum plane_id id = to_intel_plane(plane)-
> >id;
> +
> +			if (!pstate->visible)
> +				continue;
> +
> +			if (WARN_ON(!pstate->fb))
> +				continue;
> +
> +			if (id == PLANE_CURSOR)
> +				continue;
> +
> +			fb = pstate->fb;
> +			if ((fb->modifier == I915_FORMAT_MOD_Y_TILED
> ||
> +				fb->modifier ==
> I915_FORMAT_MOD_Yf_TILED))
> +				results->pipe_ytiled[i] = true;
> +
> +			plane_bw_kbps =
> skl_adjusted_plane_pixel_rate(
> +						to_intel_crtc_state(
> cstate),
> +						to_intel_plane_state
> (pstate));
> +			max_plane_bw_kbps = max(plane_bw_kbps,
> +							max_plane_bw
> _kbps);
> +			active_planes++;
> +		}
> +		results->pipe_bw_kbps[i] = max_plane_bw_kbps *
> active_planes;
> +	}
> +
> +	for_each_pipe(dev_priv, i) {
> +		if (results->pipe_bw_kbps[i]) {
> +			max_pipe_bw_kbps = max(max_pipe_bw_kbps,
> +					results->pipe_bw_kbps[i]);
> +			active_pipes++;
> +		}
> +		if (results->pipe_ytiled[i])
> +			y_tile_enabled = true;
> +	}
> +
> +	total_pipe_bw_kbps = max_pipe_bw_kbps * active_pipes;
> +	display_bw_percentage = DIV_ROUND_UP_ULL(total_pipe_bw_kbps
> * 100,
> +						memdev_info-
> >bandwidth_kbps);

Shouldn't this be just DIV_ROUND_UP()? I don't see 64 bit variables
here, nor the possibility of overflows.


> +
> +	/*
> +	 * If there is any Ytile plane enabled and arbitrated
> display
> +	 * bandwidth > 20% of raw system memory bandwidth
> +	 * Enale Y-tile related WA
> +	 *
> +	 * If memory is dual channel single rank, Xtile limit = 35%,
> else Xtile
> +	 * limit = 60%
> +	 * If there is no Ytile plane enabled and
> +	 * arbitrated display bandwidth > Xtile limit
> +	 * Enable X-tile realated WA
> +	 */
> +	if (y_tile_enabled && (display_bw_percentage > 20))
> +		results->mem_wa = WATERMARK_WA_Y_TILED;
> +	else {
> +		int x_tile_percentage = 60;
> +		enum rank rank = DRAM_RANK_SINGLE;
> +
> +		if (memdev_info->rank != DRAM_RANK_INVALID)
> +			rank = memdev_info->rank;

I really think that the previous patch should prevent this. If
memdev_info->rank is invalid then memdev_info->valid should also be
false and we'd have returned at the beginning of the function. With
that, this part of the code could be simplified a little bit since then
we'd have no need to choose a default rank.



> +
> +		if ((rank == DRAM_RANK_SINGLE) &&
> +					(memdev_info->num_channels
> == 2))
> +			x_tile_percentage = 35;
> +
> +		if (display_bw_percentage > x_tile_percentage)
> +			results->mem_wa = WATERMARK_WA_X_TILED;
> +	}
> +
> +	return 0;
> +
> +exit:
> +	results->mem_wa = WATERMARK_WA_Y_TILED;
> +	return 0;
> +}
> +
> +

Two blank lines here.


>  static void
>  skl_copy_wm_for_pipe(struct skl_wm_values *dst,
>  		     struct skl_wm_values *src,
> @@ -4162,6 +4295,10 @@ skl_compute_wm(struct drm_atomic_state *state)
>  	/* Clear all dirty flags */
>  	results->dirty_pipes = 0;
>  
> +	ret = skl_compute_memory_bandwidth_wm_wa(state);
> +	if (ret)
> +		return ret;
> +
>  	ret = skl_compute_ddb(state);
>  	if (ret)
>  		return ret;
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v7 7/8] drm/i915: Decode system memory bandwidth
  2016-12-08 23:55   ` Paulo Zanoni
@ 2017-02-15 15:00     ` Mahesh Kumar
  0 siblings, 0 replies; 18+ messages in thread
From: Mahesh Kumar @ 2017-02-15 15:00 UTC (permalink / raw)
  To: Paulo Zanoni, intel-gfx; +Cc: maarten.lankhorst

Hi,


On Friday 09 December 2016 05:25 AM, Paulo Zanoni wrote:
> Em Qui, 2016-12-01 às 21:19 +0530, Mahesh Kumar escreveu:
>> This patch adds support to decode system memory bandwidth
>> which will be used for arbitrated display memory percentage
>> calculation in GEN9 based system.
>>
>> Changes from v1:
>>   - Address comments from Paulo
>>   - implement decode function for SKL/KBL also
>> Changes from v2:
>>   - Rewrite the code as per HW team inputs
>>   - Addresses review comments
>> Changes from v3:
>>   - Fix compilation warning
>>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> As a general comment, indentation is weird on all multi-line
> statements. Also, most comments are missing periods and are not 80-
> column aligned. And a lot of the comments just say what the code
> already does instead of explaining why the code does what it does, so
> perhaps we could just remove them. Comments should explain why the code
> is written the way it is, not translate from C to English.
>
>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.c | 173
>> ++++++++++++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/i915_drv.h |  12 +++
>>   drivers/gpu/drm/i915/i915_reg.h |  37 +++++++++
>>   3 files changed, 222 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>> b/drivers/gpu/drm/i915/i915_drv.c
>> index 1c689b6..0ac7122 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -979,6 +979,173 @@ static void intel_sanitize_options(struct
>> drm_i915_private *dev_priv)
>>   	DRM_DEBUG_DRIVER("use GPU sempahores? %s\n",
>> yesno(i915.semaphores));
>>   }
>>   
>> +static inline enum rank skl_memdev_get_channel_rank(uint32_t val)
> Why inline? This is already static, leave it up to the compiler to
> decide if it's better to inline or not.
removed inline in new series
>
>> +{
>> +	uint8_t l_rank, s_rank;
>> +	uint8_t l_size, s_size;
>> +	enum rank ch_rank = DRAM_RANK_SINGLE;
> Optional suggestion: get rid of this variable, just return the
> appropriate values once we discover them.
done
>
>> +
>> +	l_size = (val >> SKL_DRAM_SIZE_L_SHIFT) &
>> SKL_DRAM_SIZE_MASK;
>> +	s_size = (val >> SKL_DRAM_SIZE_S_SHIFT) &
>> SKL_DRAM_SIZE_MASK;
>> +	l_rank = (val >> SKL_DRAM_RANK_L_SHIFT) &
>> SKL_DRAM_RANK_MASK;
>> +	s_rank = (val >> SKL_DRAM_RANK_S_SHIFT) &
>> SKL_DRAM_RANK_MASK;
> Validate our assumptions:
>
> WARN_ON(l_size == 0 && s_size == 0);
>
> Or we could even do the appropriate check and return DRAM_RANK_INVALID,
> but then we'd have to restructure how our caller calls us, maybe moving
> some code to this function.
made the adjustment
>
>> +
>> +	/*
>> +	 * If any of the slot has dual rank memory consider
>> +	 * dual rank memory channel
>> +	 */
> The comment says just what the code already says. What would be
> interesting to see in the comment is an explanation of why we do it the
> way we do.
updated
>
>> +	if (l_rank == SKL_DRAM_RANK_DUAL || s_rank ==
>> SKL_DRAM_RANK_DUAL)
>> +		ch_rank = DRAM_RANK_DUAL;
>>
>> +
>> +	/*
>> +	 * If both the slot has single rank memory then
>> configuration
>> +	 * is dual rank memory
>> +	 */
> The comment says just what the code already says. What would be
> interesting to see in the comment is an explanation of why we do it the
> way we do.
>
>
>> +	if ((l_size && l_rank == SKL_DRAM_RANK_SINGLE) &&
>> +		(s_size && s_rank == SKL_DRAM_RANK_SINGLE))
>> +		ch_rank = DRAM_RANK_DUAL;
>> +	return ch_rank;
>> +}
>> +
>> +static int
>> +skl_get_memdev_info(struct drm_i915_private *dev_priv)
>> +{
>> +	struct memdev_info *memdev_info = &dev_priv->memdev_info;
>> +	uint32_t mem_freq_khz;
>> +	uint32_t val;
>> +	enum rank ch0_rank = DRAM_RANK_INVALID, ch1_rank =
>> DRAM_RANK_INVALID;
>> +
>> +	val = I915_READ(SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
>> +	mem_freq_khz = (val & SKL_REQ_DATA_MASK) *
>> +				SKL_MEMORY_FREQ_MULTIPLIER_KHZ;
> Optional suggestion: perhaps extend the memory_freq_multiplier to HZ
> and then later cut the last 3 digits so the rounding errors get
> restricted to the units we'll cut? Also, if we do the math using the "4
> * num / 3" we can try to reduce the rounding errors even more.
>
> I get annoyed that it says my system bandwidth is 25600032, while the
> correct number is exactly 25600000.
>
> This applies to the BXT code too.
Modified to work on Hz & then convert to KHz
>
>> +
>> +	val = I915_READ(SKL_MAD_DIMM_CH0_0_0_0_MCHBAR_MCMAIN);
>> +	if (val != 0x0) {
>> +		memdev_info->num_channels++;
>> +		ch0_rank = skl_memdev_get_channel_rank(val);
>> +	}
> Optional suggestion:
>
> Here's how I would have implemented this:
>
> ch0_rank = skl_memdev_get_channel_rank(0);
> if (ch0_rank != DRAM_RANK_INVALID)
> 	memdev_info->num_channels++;
>
> This way we'd move all the logic around this register to
> skl_memdev_get_channel_rank(), and we'd also be able to get rid of the
> initializations here.
>
> But this is just a suggestion in case you agree with me. Feel free to
> leave the code the way it is.
suggestion taken :)
>
>> +
>> +	val = I915_READ(SKL_MAD_DIMM_CH1_0_0_0_MCHBAR_MCMAIN);
>> +	if (val != 0x0) {
>> +		memdev_info->num_channels++;
>> +		ch1_rank = skl_memdev_get_channel_rank(val);
>> +	}
>> +
>> +	if (memdev_info->num_channels == 0) {
>> +		DRM_ERROR("Number of mem channels are zero\n");
> I'm not a native English speaker, but I would suppose that s/are/is/
> would make the sentence correct (the number is zero). While at it, why
> not s/mem/memory/ too to make the code sound a little more formal?
>
>
>> +		return -EINVAL;
>> +	}
>> +
>> +	memdev_info->bandwidth_kbps = (memdev_info->num_channels *
>> +							mem_freq_khz
>> * 8);
>> +
>> +	if (memdev_info->bandwidth_kbps == 0) {
>> +		DRM_ERROR("Couldn't get system memory bandwidth\n");
>> +		return -EINVAL;
>> +	}
>> +	memdev_info->valid = true;
> My previous comment about ownership of this variable still applies.
>
Now assigning memdev_info-> valid to true, only if bandwidth & rank both 
the information are valid
>> +
>> +	/*
>> +	 * If any of channel is single rank channel,
> Again, not a native English speaker, but "any of channel" sounds
> incorrect to me. But anyway, the comment only says what the code
> already says. Why do we do it like this?
>
>
>> +	 * consider single rank memory
>> +	 */
>> +	if (ch0_rank == DRAM_RANK_SINGLE || ch1_rank ==
>> DRAM_RANK_SINGLE)
>> +		memdev_info->rank = DRAM_RANK_SINGLE;
>> +	else
>> +		memdev_info->rank = max(ch0_rank, ch1_rank);
> I can't think of any situation where this wouldn't end with
> DRAM_RANK_DUAL, so we might just assign it to the variable.
If there is only one dual-rank DRAM installed out of 4-dimms, then 
either ch0_rank or ch1_rank will be I915_DRAM_DUAL_RANK & other will be 
INVALID,
in that case we'll have rank as DRAM_RANK_DUAL.
>
>> +
>> +	return 0;
>> +}
>> +
>> +static int
>> +bxt_get_memdev_info(struct drm_i915_private *dev_priv)
>> +{
>> +	struct memdev_info *memdev_info = &dev_priv->memdev_info;
>> +	uint32_t dram_channels;
>> +	uint32_t mem_freq_khz, val;
>> +	uint8_t num_active_channels;
>> +	int i;
>> +
>> +	val = I915_READ(BXT_P_CR_MC_BIOS_REQ_0_0_0);
>> +	mem_freq_khz = ((val & BXT_REQ_DATA_MASK) *
>> +				BXT_MEMORY_FREQ_MULTIPLIER_KHZ);
>> +
>> +	dram_channels = (val >> BXT_DRAM_CHANNEL_ACTIVE_SHIFT) &
>> +					BXT_DRAM_CHANNEL_ACTIVE_MASK
>> ;
>> +	num_active_channels = hweight32(dram_channels);
>> +
>> +	memdev_info->bandwidth_kbps = (mem_freq_khz *
>> num_active_channels * 4);
>> +
>> +	if (memdev_info->bandwidth_kbps == 0) {
>> +		DRM_ERROR("Couldn't get system memory bandwidth\n");
>> +		return -EINVAL;
>> +	}
>> +	memdev_info->valid = true;
>> +
>> +	/*
>> +	 * Now read each DUNIT8/9/10/11 to check the rank of each
>> dimms.
>> +	 */
>> +	for (i = 0; i < BXT_D_CR_DRP0_DUNIT_MAX; i++) {
>> +		val = I915_READ(BXT_D_CR_DRP0_DUNIT(i));
>> +		if (val != 0xFFFFFFFF) {
>> +			uint8_t rank;
>> +			enum rank ch_rank;
>> +
>> +			memdev_info->num_channels++;
>> +			rank = val & BXT_DRAM_RANK_MASK;
> Shouldn't the code here be: "if one of these two bits are enabled then
> we're RANK_SINGLE, if both bits are enabled then we're RANK_DUAL,
> otherwise invalid"? If not, why?
>
> In other words: is rank=0x2 really invalid? Isn't it just single rank?
rank=0x2 is an invalid entry. "Enable Rank 0: Must be set to 1 to enable 
use of this rank. Note: Setting this bit to 0 is not a functional mode"
"Enable Rank 0" bit can't be a 0 anyhow.
>
>
>> +			if (rank == BXT_DRAM_RANK_SINGLE)
>> +				ch_rank = DRAM_RANK_SINGLE;
>> +			else if (rank == BXT_DRAM_RANK_DUAL)
>> +				ch_rank = DRAM_RANK_DUAL;
>> +			else
>> +				ch_rank = DRAM_RANK_INVALID;
>> +
>> +			/*
>> +			 * If any of channel is having single rank
>> memory
> Again, "any of channel".
>
>> +			 * consider memory as single rank
>> +			 */
>> +			if (memdev_info->rank == DRAM_RANK_INVALID)
>> +				memdev_info->rank = ch_rank;
>> +			else if (ch_rank == DRAM_RANK_SINGLE)
>> +				memdev_info->rank =
>> DRAM_RANK_SINGLE;
>> +		}
>> +	}
> SKL returns -EINVAL if we end up with DRAM_RANK_INVALID in memdev_info-
>> rank. I think we could do that in BXT too: either we have all the
> information to not apply the workaround, or we give up.
done, returning invalid if rank information is not available.
>
>
>> +	return 0;
>> +}
>> +
>> +static void
>> +intel_get_memdev_info(struct drm_i915_private *dev_priv)
>> +{
>> +	struct memdev_info *memdev_info = &dev_priv->memdev_info;
>> +	int ret;
>> +
>> +	memdev_info->valid = false;
>> +	memdev_info->rank = DRAM_RANK_INVALID;
>> +	memdev_info->num_channels = 0;
>> +
>> +	if (!IS_GEN9(dev_priv))
>> +		return;
>> +
>> +	if (IS_BROXTON(dev_priv))
>> +		ret = bxt_get_memdev_info(dev_priv);
>> +	else
>> +		ret = skl_get_memdev_info(dev_priv);
>> +	if (ret)
>> +		return;
>> +
>> +	DRM_DEBUG_DRIVER("DRAM bandwidth: %u KBps total-channels:
>> %u\n",
>> +				memdev_info->bandwidth_kbps,
>> +				memdev_info->num_channels);
>> +	if (memdev_info->rank == DRAM_RANK_INVALID)
>> +		DRM_INFO("Counld not get memory rank info\n");
> Spelling mistake here. And since you use "Couldn't" instead of "Could
> not" in all the other messages, I'd keep the same pattern.
>
> But if you accept my suggestion above, we'll never print this message
> anyway.
>
>
>> +	else {
>> +		DRM_DEBUG_DRIVER("DRAM rank: %s rank\n",
>> +				(memdev_info->rank ==
>> DRAM_RANK_DUAL) ?
>> +						"dual" : "single");
>> +	}
>> +}
>> +
>> +
> Two white spaces here.
>
>
>>   /**
>>    * i915_driver_init_hw - setup state requiring device access
>>    * @dev_priv: device private
>> @@ -1081,6 +1248,12 @@ static int i915_driver_init_hw(struct
>> drm_i915_private *dev_priv)
>>   			DRM_DEBUG_DRIVER("can't enable MSI");
>>   	}
>>   
>> +	/*
>> +	 * Fill the memdev structure to get the system raw bandwidth
>> +	 * This will be used by WM algorithm, to implement GEN9
>> based WA
>> +	 */
>> +	intel_get_memdev_info(dev_priv);
>> +
>>   	return 0;
>>   
>>   out_ggtt:
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index b78dc9a..69213a4 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2297,6 +2297,18 @@ struct drm_i915_private {
>>   		bool distrust_bios_wm;
>>   	} wm;
>>   
>> +	struct memdev_info {
>> +		bool valid;
>> +		uint32_t bandwidth_kbps;
>> +		uint8_t num_channels;
>> +		enum rank {
>> +			DRAM_RANK_INVALID = 0,
>> +			DRAM_RANK_SINGLE,
>> +			DRAM_RANK_DUAL
>> +		} rank;
> This was previously an anonymous enum (enum { ... } rank;), but now
> it's "enum rank", which is too generic for a .h file. Either use a more
> specific name (enum memdev_rank?) or try to make it anonymous again.
changed to memdev_rank.
>
>> +	} memdev_info;
>> +
>> +
>>   	struct i915_runtime_pm pm;
>>   
>>   	struct {
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h
>> b/drivers/gpu/drm/i915/i915_reg.h
>> index 649319d..e7efdd0 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -8015,6 +8015,43 @@ enum {
>>   #define  DC_STATE_DEBUG_MASK_CORES	(1<<0)
>>   #define  DC_STATE_DEBUG_MASK_MEMORY_UP	(1<<1)
>>   
>> +#define BXT_P_CR_MC_BIOS_REQ_0_0_0	_MMIO(MCHBAR_MIRROR_BASE_S
>> NB + 0x7114)
>> +#define BXT_REQ_DATA_MASK			0x3F
>> +#define BXT_DRAM_ACTIVE_CHANNEL_SHIFT		12
>> +#define BXT_DRAM_ACTIVE_CHANNEL_MASK		0xF
> We're not using these two definitions anywhere.
These are duplicate for definition below, my mistake, removing them.
>
>> +/*
>> + * BIOS programs this field of REQ_DATA [5:0] in integer
>> + * multiple of 133333 KHz (133.33MHz)
>> + */
> Now that we're using decimal on the definition, I don't think this
> comment is useful.
>
>
>> +#define	BXT_MEMORY_FREQ_MULTIPLIER_KHZ		133333
> We don't use tabs like that in our definitions.
>
>
>> +#define BXT_D_CR_DRP0_DUNIT8			0x1000
>> +#define BXT_D_CR_DRP0_DUNIT9			0x1200
>> +#define BXT_D_CR_DRP0_DUNIT_MAX			4
>> +#define _MMIO_MCHBAR_DUNIT(x, a, b) _MMIO(MCHBAR_MIRROR_BASE_SNB +
>> (a) + (x)*((b)-(a)))
>> +#define BXT_D_CR_DRP0_DUNIT(x)	_MMIO_MCHBAR_DUNIT(x,
>> BXT_D_CR_DRP0_DUNIT8, BXT_D_CR_DRP0_DUNIT9)
> But then BXT_D_CR_DRP0_DUNIT(1) means DUNIT9 instead of DUNIT1...
> That's very unintuitive.
now changed it to pass 8-11 to this macro, (from start to end of dunit)
>
>
>> +#define BXT_DRAM_CHANNEL_ACTIVE_SHIFT		12
>> +#define BXT_DRAM_CHANNEL_ACTIVE_MASK		0xF
> These two definitions don't belong to the register immediately above.
>
>
>> +#define BXT_DRAM_RANK_MASK			0x3
>> +#define BXT_DRAM_RANK_SINGLE			0x1
>> +#define BXT_DRAM_RANK_DUAL			0x3
>> +
>> +/*
>> + * SKL memory frequeny multiplier is 266667 KHz (266.67 MHz)
> s/frequeny/frequency/
>
> Also, this is not how we do one-line comments.
>
> And now that the number is in decimal we probably don't even need the
> comment here since the definition is a little more obvious.
>
>> + */
>> +#define	SKL_MEMORY_FREQ_MULTIPLIER_KHZ		266667
> We don't use tabs like that in our definitions.
>
>
>> +#define SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU	_MMIO(MCHBAR_MIRROR
>> _BASE_SNB + 0x5E04)
>> +#define SKL_REQ_DATA_MASK			(0xF << 0)
> Blank lines separating different registers would be good.
>
>
>> +#define SKL_MAD_DIMM_CH0_0_0_0_MCHBAR_MCMAIN	_MMIO(MCHBAR_MIR
>> ROR_BASE_SNB + 0x500C)
>> +#define SKL_MAD_DIMM_CH1_0_0_0_MCHBAR_MCMAIN	_MMIO(MCHBAR_MIR
>> ROR_BASE_SNB + 0x5010)
>> +#define SKL_DRAM_SIZE_MASK			0x1F
> 0x3F
fixed.
patch is pushed as a new series @ 
https://patchwork.freedesktop.org/series/18842/

Regards,
-Mahesh

>
>
>> +#define SKL_DRAM_SIZE_L_SHIFT			0
>> +#define SKL_DRAM_SIZE_S_SHIFT			16
>> +#define SKL_DRAM_RANK_MASK			0x1
>> +#define SKL_DRAM_RANK_L_SHIFT			10
>> +#define SKL_DRAM_RANK_S_SHIFT			26
>> +#define SKL_DRAM_RANK_SINGLE			0x0
>> +#define SKL_DRAM_RANK_DUAL			0x1
>> +
>>   /* Please see hsw_read_dcomp() and hsw_write_dcomp() before using
>> this register,
>>    * since on HSW we can't write to it using I915_WRITE. */
>>   #define D_COMP_HSW			_MMIO(MCHBAR_MIRROR_BASE_S
>> NB + 0x5F0C)

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v7 8/8] drm/i915/gen9: WM memory bandwidth related workaround
  2016-12-15 17:07   ` Paulo Zanoni
@ 2017-02-15 15:00     ` Mahesh Kumar
  0 siblings, 0 replies; 18+ messages in thread
From: Mahesh Kumar @ 2017-02-15 15:00 UTC (permalink / raw)
  To: Paulo Zanoni, intel-gfx; +Cc: maarten.lankhorst

Hi,


On Thursday 15 December 2016 10:37 PM, Paulo Zanoni wrote:
> Em Qui, 2016-12-01 às 21:19 +0530, Mahesh Kumar escreveu:
>> This patch implemnets Workariunds related to display arbitrated
>> memory
>> bandwidth. These WA are applicabe for all gen-9 based platforms.
> 3 typos above.
>
> The WA is already implemented. What the patch does is that it opts-out
> of the WA in case we conclude it's safe. Please say this in the commit
> message.
updated the commit msg
>> Changes since v1:
>>   - Rebase on top of Paulo's patch series
>> Changes since v2:
>>   - Address review comments
>>   - Rebase/rework as per other patch changes in series
>> Changes since v3:
>>   - Rework based on Maarten's comments
>>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h      |  11 +++
>>   drivers/gpu/drm/i915/intel_display.c |  24 ++++++
>>   drivers/gpu/drm/i915/intel_pm.c      | 155
>> +++++++++++++++++++++++++++++++++--
>>   3 files changed, 181 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 69213a4..3126259 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1216,6 +1216,13 @@ enum intel_sbi_destination {
>>   	SBI_MPHY,
>>   };
>>   
>> +/* SKL+ Watermark arbitrated display bandwidth Workarounds */
>> +enum watermark_memory_wa {
>> +	WATERMARK_WA_NONE,
>> +	WATERMARK_WA_X_TILED,
>> +	WATERMARK_WA_Y_TILED,
>> +};
>> +
>>   #define QUIRK_PIPEA_FORCE (1<<0)
>>   #define QUIRK_LVDS_SSC_DISABLE (1<<1)
>>   #define QUIRK_INVERT_BRIGHTNESS (1<<2)
>> @@ -1788,6 +1795,10 @@ struct skl_ddb_allocation {
>>   
>>   struct skl_wm_values {
>>   	unsigned dirty_pipes;
>> +	/* any WaterMark memory workaround Required */
> CapitaliZation is weird Here. Besides, no need for the comment.
>
>
>> +	enum watermark_memory_wa mem_wa;
>> +	uint32_t pipe_bw_kbps[I915_MAX_PIPES];
>> +	bool pipe_ytiled[I915_MAX_PIPES];
>>   	struct skl_ddb_allocation ddb;
>>   };
>>   
>> diff --git a/drivers/gpu/drm/i915/intel_display.c
>> b/drivers/gpu/drm/i915/intel_display.c
>> index 5d11002..f8da488 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -14574,6 +14574,28 @@ static void intel_atomic_track_fbs(struct
>> drm_atomic_state *state)
>>   				  to_intel_plane(plane)-
>>> frontbuffer_bit);
>>   }
>>   
>> +static void
>> +intel_update_wm_bw_wa(struct drm_atomic_state *state)
>> +{
>> +	struct intel_atomic_state *intel_state =
>> to_intel_atomic_state(state);
>> +	struct drm_i915_private *dev_priv = to_i915(state->dev);
>> +	const struct drm_crtc *crtc;
>> +	const struct drm_crtc_state *cstate;
>> +	struct skl_wm_values *results = &intel_state->wm_results;
>> +	struct skl_wm_values *hw_vals = &dev_priv->wm.skl_hw;
>> +	int i;
>> +
>> +	if (!IS_GEN9(dev_priv))
>> +		return;
>> +
>> +	for_each_crtc_in_state(state, crtc, cstate, i) {
>> +		hw_vals->pipe_bw_kbps[i] = results->pipe_bw_kbps[i];
>> +		hw_vals->pipe_ytiled[i] = results->pipe_ytiled[i];
>> +	}
>> +
>> +	hw_vals->mem_wa = results->mem_wa;
>> +}
> I think we can just patch skl_copy_wm_for_pipe() to also copy the
> fields we need instead of adding this function. Whouldn't that be much
> simpler? Anyway, this one looks correct, so no need to change if you
> don't want.
>
>
>> +
>>   /**
>>    * intel_atomic_commit - commit validated state object
>>    * @dev: DRM device
>> @@ -14614,6 +14636,8 @@ static int intel_atomic_commit(struct
>> drm_device *dev,
>>   	intel_shared_dpll_commit(state);
>>   	intel_atomic_track_fbs(state);
>>   
>> +	intel_update_wm_bw_wa(state);
>> +
>>   	if (intel_state->modeset) {
>>   		memcpy(dev_priv->min_pixclk, intel_state-
>>> min_pixclk,
>>   		       sizeof(intel_state->min_pixclk));
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c
>> b/drivers/gpu/drm/i915/intel_pm.c
>> index cc8fc84..fda6e1e 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -2878,11 +2878,7 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
>>   
>>   #define SKL_SAGV_BLOCK_TIME	30 /* µs */
>>   
>> -/*
>> - * FIXME: We still don't have the proper code detect if we need to
>> apply the WA,
>> - * so assume we'll always need it in order to avoid underruns.
>> - */
>> -static bool skl_needs_memory_bw_wa(struct intel_atomic_state *state)
>> +static bool intel_needs_memory_bw_wa(struct intel_atomic_state
>> *state)
> Why are we renaming this function?
>
> Also, in my last review I suggested that we kill this function. I don't
> think it makes sense to have this function anymore.
>
> Besides, function intel_can_enable_sagv() needs to look at the value we
> assigned to enum watermark_memory_wa, not get a true/false based on
> platform version.
removed the function
>
>
>>   {
>>   	struct drm_i915_private *dev_priv = to_i915(state-
>>> base.dev);
>>   
>> @@ -3056,7 +3052,7 @@ bool intel_can_enable_sagv(struct
>> drm_atomic_state *state)
>>   
>>   		latency = dev_priv->wm.skl_latency[level];
>>   
>> -		if (skl_needs_memory_bw_wa(intel_state) &&
>> +		if (intel_needs_memory_bw_wa(intel_state) &&
>>   		    plane->base.state->fb->modifier ==
>>   		    I915_FORMAT_MOD_X_TILED)
>>   			latency += 15;
>> @@ -3584,7 +3580,7 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>   	uint32_t y_min_scanlines;
>>   	struct intel_atomic_state *state =
>>   		to_intel_atomic_state(cstate->base.state);
>> -	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
>> +	enum watermark_memory_wa mem_wa;
>>   	bool y_tiled, x_tiled;
>>   
>>   	if (latency == 0 || !cstate->base.active || !intel_pstate-
>>> base.visible) {
>> @@ -3600,7 +3596,8 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>   	if (IS_KABYLAKE(dev_priv) && dev_priv->ipc_enabled)
>>   		latency += 4;
>>   
>> -	if (apply_memory_bw_wa && x_tiled)
>> +	mem_wa = state->wm_results.mem_wa;
> This assignment could have been done during declaration.
done
>
>
>> +	if (mem_wa != WATERMARK_WA_NONE && x_tiled)
>>   		latency += 15;
>>   
>>   	width = drm_rect_width(&intel_pstate->base.src) >> 16;
>> @@ -3635,7 +3632,7 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>   		y_min_scanlines = 4;
>>   	}
>>   
>> -	if (apply_memory_bw_wa)
>> +	if (mem_wa == WATERMARK_WA_Y_TILED)
>>   		y_min_scanlines *= 2;
>>   
>>   	plane_bytes_per_line = width * cpp;
>> @@ -4063,6 +4060,15 @@ skl_compute_ddb(struct drm_atomic_state
>> *state)
>>   	}
>>   
>>   	/*
>> +	 * If Watermark workaround is changed we need to recalculate
>> +	 * watermark values for all active pipes
>> +	 */
>> +	if (intel_state->wm_results.mem_wa != dev_priv-
>>> wm.skl_hw.mem_wa) {
>> +		realloc_pipes = ~0;
>> +		intel_state->wm_results.dirty_pipes = ~0;
>> +	}
> But then skl_ddb_add_affected_planes() only actually adds to the commit
> the planes that have different DDB partitioning. It seems to me that it
> may be possible to have a case where the WA changes and the DDB stays
> the same, so we need to find a way to include every CRTC there. Maybe
> we could just go and call the function that adds every CRTC here.
modified this part of code to calculate WM for all pipes if WA is 
changing & add pipe only if WM values are changing.
We are holding a ww_mutex, so this operation should be race condition free.
>
>> +
>> +	/*
>>   	 * We're not recomputing for the pipes not included in the
>> commit, so
>>   	 * make sure we start with the current state.
>>   	 */
>> @@ -4087,6 +4093,133 @@ skl_compute_ddb(struct drm_atomic_state
>> *state)
>>   	return 0;
>>   }
>>   
>> +static int
>> +skl_compute_memory_bandwidth_wm_wa(struct drm_atomic_state *state)
>> +{
>> +	struct drm_device *dev = state->dev;
>> +	struct drm_i915_private *dev_priv = to_i915(dev);
>> +	struct intel_atomic_state *intel_state =
>> to_intel_atomic_state(state);
>> +	struct memdev_info *memdev_info = &dev_priv->memdev_info;
>> +	struct skl_wm_values *results = &intel_state->wm_results;
>> +	struct drm_crtc *crtc;
>> +	struct drm_crtc_state *cstate;
>> +	int active_pipes = 0;
>> +	uint32_t max_pipe_bw_kbps = 0, total_pipe_bw_kbps;
>> +	int display_bw_percentage;
>> +	bool y_tile_enabled = false;
>> +	int ret, i;
>> +
>> +	if (!intel_needs_memory_bw_wa(intel_state)) {
>> +		results->mem_wa = WATERMARK_WA_NONE;
>> +		return 0;
>> +	}
>> +
>> +	if (!memdev_info->valid)
>> +		goto exit;
> There's no reason to use a label if it's only called here. Just move
> all that code to the if statement.
done
>
>
>> +
>> +	ret = drm_modeset_lock(&dev->mode_config.connection_mutex,
>> +			state->acquire_ctx);
>> +	if (ret)
>> +		return ret;
> Why are we getting this here? Why this lock specifically? What's it
> protecting?
>
> Doc says:
> - ww mutex protecting connector state and routing
> - protects connector->encoder and encoder->crtc links
>
> We're not touching any of that here.
Created a new ww_mutex lock to prevent wm related calculation, when it 
may affect other CRTC's also.
using that now instead of connection_mutex.
>
>
>> +
>> +	memcpy(results->pipe_bw_kbps, dev_priv-
>>> wm.skl_hw.pipe_bw_kbps,
>> +			sizeof(results->pipe_bw_kbps));
>> +	memcpy(results->pipe_ytiled, dev_priv-
>>> wm.skl_hw.pipe_ytiled,
>> +			sizeof(results->pipe_ytiled));
>> +
>> +	for_each_crtc_in_state(state, crtc, cstate, i) {
>> +		struct drm_plane *plane;
>> +		const struct drm_plane_state *pstate;
>> +		int active_planes = 0;
>> +		uint32_t max_plane_bw_kbps = 0;
>> +
>> +		results->pipe_ytiled[i] = false;
>> +
>> +		if (!cstate->active) {
>> +			results->pipe_bw_kbps[i] = 0;
>> +			continue;
>> +		}
>> +
>> +		drm_atomic_crtc_state_for_each_plane_state(plane,
>> pstate,
>> +								csta
>> te) {
>> +			struct drm_framebuffer *fb;
>> +			uint32_t plane_bw_kbps;
>> +			enum plane_id id = to_intel_plane(plane)-
>>> id;
>> +
>> +			if (!pstate->visible)
>> +				continue;
>> +
>> +			if (WARN_ON(!pstate->fb))
>> +				continue;
>> +
>> +			if (id == PLANE_CURSOR)
>> +				continue;
>> +
>> +			fb = pstate->fb;
>> +			if ((fb->modifier == I915_FORMAT_MOD_Y_TILED
>> ||
>> +				fb->modifier ==
>> I915_FORMAT_MOD_Yf_TILED))
>> +				results->pipe_ytiled[i] = true;
>> +
>> +			plane_bw_kbps =
>> skl_adjusted_plane_pixel_rate(
>> +						to_intel_crtc_state(
>> cstate),
>> +						to_intel_plane_state
>> (pstate));
>> +			max_plane_bw_kbps = max(plane_bw_kbps,
>> +							max_plane_bw
>> _kbps);
>> +			active_planes++;
>> +		}
>> +		results->pipe_bw_kbps[i] = max_plane_bw_kbps *
>> active_planes;
>> +	}
>> +
>> +	for_each_pipe(dev_priv, i) {
>> +		if (results->pipe_bw_kbps[i]) {
>> +			max_pipe_bw_kbps = max(max_pipe_bw_kbps,
>> +					results->pipe_bw_kbps[i]);
>> +			active_pipes++;
>> +		}
>> +		if (results->pipe_ytiled[i])
>> +			y_tile_enabled = true;
>> +	}
>> +
>> +	total_pipe_bw_kbps = max_pipe_bw_kbps * active_pipes;
>> +	display_bw_percentage = DIV_ROUND_UP_ULL(total_pipe_bw_kbps
>> * 100,
>> +						memdev_info-
>>> bandwidth_kbps);
> Shouldn't this be just DIV_ROUND_UP()? I don't see 64 bit variables
> here, nor the possibility of overflows.
If we have 3 4K pannels & each have 3 planes enabled with scaling then 
this value theoretically may go beyond 32bits,
type-casted the variable now to 64bit.
536MHz * 3 planes * 3 pipes * pipe_scaler_ratio * plane_scaler_ratio * 
100 = 536000 * 3 * 3 * (1.9 *1.9  (hscale & vscale)) * (1.9 * 1.9 (pipe 
hscale & vscale)) * 100 = 6286685040 = 0x1 76B7 3370

>
>
>> +
>> +	/*
>> +	 * If there is any Ytile plane enabled and arbitrated
>> display
>> +	 * bandwidth > 20% of raw system memory bandwidth
>> +	 * Enale Y-tile related WA
>> +	 *
>> +	 * If memory is dual channel single rank, Xtile limit = 35%,
>> else Xtile
>> +	 * limit = 60%
>> +	 * If there is no Ytile plane enabled and
>> +	 * arbitrated display bandwidth > Xtile limit
>> +	 * Enable X-tile realated WA
>> +	 */
>> +	if (y_tile_enabled && (display_bw_percentage > 20))
>> +		results->mem_wa = WATERMARK_WA_Y_TILED;
>> +	else {
>> +		int x_tile_percentage = 60;
>> +		enum rank rank = DRAM_RANK_SINGLE;
>> +
>> +		if (memdev_info->rank != DRAM_RANK_INVALID)
>> +			rank = memdev_info->rank;
> I really think that the previous patch should prevent this. If
> memdev_info->rank is invalid then memdev_info->valid should also be
> false and we'd have returned at the beginning of the function. With
> that, this part of the code could be simplified a little bit since then
> we'd have no need to choose a default rank.
>
done

Regards,
-Mahesh
>
>> +
>> +		if ((rank == DRAM_RANK_SINGLE) &&
>> +					(memdev_info->num_channels
>> == 2))
>> +			x_tile_percentage = 35;
>> +
>> +		if (display_bw_percentage > x_tile_percentage)
>> +			results->mem_wa = WATERMARK_WA_X_TILED;
>> +	}
>> +
>> +	return 0;
>> +
>> +exit:
>> +	results->mem_wa = WATERMARK_WA_Y_TILED;
>> +	return 0;
>> +}
>> +
>> +
> Two blank lines here.
>
>
>>   static void
>>   skl_copy_wm_for_pipe(struct skl_wm_values *dst,
>>   		     struct skl_wm_values *src,
>> @@ -4162,6 +4295,10 @@ skl_compute_wm(struct drm_atomic_state *state)
>>   	/* Clear all dirty flags */
>>   	results->dirty_pipes = 0;
>>   
>> +	ret = skl_compute_memory_bandwidth_wm_wa(state);
>> +	if (ret)
>> +		return ret;
>> +
>>   	ret = skl_compute_ddb(state);
>>   	if (ret)
>>   		return ret;

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-02-15 14:58 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-12-01 15:49 [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
2016-12-01 15:49 ` [PATCH v7 1/8] drm/i915/skl: Add variables to check x_tile and y_tile Mahesh Kumar
2016-12-01 15:49 ` [PATCH v7 2/8] drm/i915/bxt: IPC WA for Broxton Mahesh Kumar
2016-12-01 15:49 ` [PATCH v7 3/8] drm/i915/kbl: IPC workaround for kabylake Mahesh Kumar
2016-12-01 15:49 ` [PATCH v7 4/8] drm/i915/bxt: Enable IPC support Mahesh Kumar
2016-12-01 15:49 ` [PATCH v7 5/8] drm/i915/skl+: change WM calc to fixed point 16.16 Mahesh Kumar
2016-12-01 15:49 ` [PATCH v7 6/8] drm/i915: Add intel_atomic_get_existing_crtc_state function Mahesh Kumar
2016-12-01 15:49 ` [PATCH v7 7/8] drm/i915: Decode system memory bandwidth Mahesh Kumar
2016-12-08 23:55   ` Paulo Zanoni
2017-02-15 15:00     ` Mahesh Kumar
2016-12-01 15:49 ` [PATCH v7 8/8] drm/i915/gen9: WM memory bandwidth related workaround Mahesh Kumar
2016-12-15 17:07   ` Paulo Zanoni
2017-02-15 15:00     ` Mahesh Kumar
2016-12-01 16:15 ` ✓ Fi.CI.BAT: success for GEN-9 Arbitrated Bandwidth WM WA's & IPC (rev3) Patchwork
2016-12-07 19:35 ` [PATCH v7 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Paulo Zanoni
2016-12-08 16:00   ` Daniel Vetter
2016-12-08 16:12     ` Zanoni, Paulo R
2016-12-08 16:18       ` Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox