public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [RFC 0/2] Add Pooled EU support
@ 2015-07-10 17:35 Arun Siluvery
  2015-07-10 17:35 ` [RFC 1/2] drm/i915: Offsets for golden context BB modification Arun Siluvery
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Arun Siluvery @ 2015-07-10 17:35 UTC (permalink / raw)
  To: intel-gfx

These patches enabled Pooled EU support for BXT, they are implemented
by Armin Reese. I am sending these patches in its current form for comments.

These patches modify Golden batch to have a set of modification values
where we can change the commands based on Gen. The commands to enable
Pooled EU are inserted after MI_BATCH_BUFFER_END. If the given Gen
supports this feature, modification values are used to replace
MI_BATCH_BUFFER_END so we send commands to enable Pooled EU. These
commands need to be part of this batch because they are to be
initialized only once. Userspace will have option to query the
availability of this feature, those changes are not included in
this series.

I would like to upstream this feature and really appreciate any
comments in this regard.

Armin Reese (2):
  drm/i915: Offsets for golden context BB modification
  drm/i915/bxt: Enable pooled EUs for BXT

 drivers/gpu/drm/i915/i915_gem_render_state.c  | 125 +++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_render_state.h  |   7 ++
 drivers/gpu/drm/i915/intel_renderstate.h      |   6 +-
 drivers/gpu/drm/i915/intel_renderstate_gen6.c |   4 +
 drivers/gpu/drm/i915/intel_renderstate_gen7.c |   4 +
 drivers/gpu/drm/i915/intel_renderstate_gen8.c |   4 +
 drivers/gpu/drm/i915/intel_renderstate_gen9.c |  18 ++--
 7 files changed, 157 insertions(+), 11 deletions(-)

-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC 1/2] drm/i915: Offsets for golden context BB modification
  2015-07-10 17:35 [RFC 0/2] Add Pooled EU support Arun Siluvery
@ 2015-07-10 17:35 ` Arun Siluvery
  2015-07-10 17:35 ` [RFC 2/2] drm/i915/bxt: Enable pooled EUs for BXT Arun Siluvery
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Arun Siluvery @ 2015-07-10 17:35 UTC (permalink / raw)
  To: intel-gfx

From: Armin Reese <armin.c.reese@intel.com>

Golden context batch buffers now contain a set of offsets
at which contents can optionally be modified.  This allows
the driver to customize the GC batch for various chipsets
in a GEN family.

v1 - Originally, the i915 driver was only allowed to
insert MI_BATCH_BUFFER_END commands at locations
specified by "alternate_bbend_offsets".

v2 - The previous "alternate_bbend_offsets" field has
been generalized in this version to allow the driver to
modify ranges of DWORDs in the golden context BB.  These
ranges are described in "mod_values" structs.

Signed-off-by:  Armin Reese <armin.c.reese@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_render_state.c  | 65 +++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_gem_render_state.h  |  7 +++
 drivers/gpu/drm/i915/intel_renderstate.h      |  6 ++-
 drivers/gpu/drm/i915/intel_renderstate_gen6.c |  4 ++
 drivers/gpu/drm/i915/intel_renderstate_gen7.c |  4 ++
 drivers/gpu/drm/i915/intel_renderstate_gen8.c |  4 ++
 drivers/gpu/drm/i915/intel_renderstate_gen9.c |  4 ++
 7 files changed, 89 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index a0201fc..818233d 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -45,6 +45,49 @@ render_state_get_rodata(struct drm_device *dev, const int gen)
 	return NULL;
 }
 
+/**
+ * Offsets for golden context "value modifications" defined in
+ * intel_renderstate_genx.c are locations in the batch buffer
+ * where the driver is allowed to modify one or more DWORDs
+ * to customize instructions for the GEN platform present.
+ */
+static int gc_modify_values(struct drm_device *dev,
+			 const struct intel_renderstate_rodata *rodata,
+			 u32 *d)
+{
+	/* Init index to "nothing to modify" value */
+	int mod_index = -1;
+	int mod_val[1];
+	int num_vals = 0;
+
+	/* Write required value(s) to the specified offset(s) */
+	if ((mod_index >= 0) &&
+	    (num_vals > 0)) {
+		if (mod_index < rodata->mod_value_items) {
+			int mod_offset, mod_max_cnt, i;
+
+			mod_offset =
+				rodata->mod_values[mod_index].offset;
+			mod_max_cnt =
+				rodata->mod_values[mod_index].max_cnt;
+
+			/* Check for DWORD aligned address, et al */
+			if ((num_vals > mod_max_cnt) ||
+			    (mod_offset <= 0) ||
+			    (mod_offset > PAGE_SIZE - (mod_max_cnt * sizeof(u32))) ||
+			    ((mod_offset & (sizeof(u32) - 1)) != 0))
+				return -EINVAL;
+
+			for (i = 0; i < num_vals; i++) {
+				d[(mod_offset/sizeof(u32)) + i] = mod_val[i];
+			}
+		} else
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
 static int render_state_init(struct render_state *so, struct drm_device *dev)
 {
 	int ret;
@@ -73,7 +116,7 @@ free_gem:
 	return ret;
 }
 
-static int render_state_setup(struct render_state *so)
+static int render_state_setup(struct render_state *so, struct drm_device *dev)
 {
 	const struct intel_renderstate_rodata *rodata = so->rodata;
 	unsigned int i = 0, reloc_index = 0;
@@ -89,10 +132,11 @@ static int render_state_setup(struct render_state *so)
 	d = kmap(page);
 
 	while (i < rodata->batch_items) {
-		u32 s = rodata->batch[i];
+		u32 s = rodata->batch[i];	/* DWORD from R/O batch */
 
 		if (i * 4  == rodata->reloc[reloc_index]) {
-			u64 r = s + so->ggtt_offset;
+			u64 r = s + /* Object offset stored in reloc DWORD */
+				so->ggtt_offset; /* Batch buffer gfx offset */
 			s = lower_32_bits(r);
 			if (so->gen >= 8) {
 				if (i + 1 >= rodata->batch_items ||
@@ -108,8 +152,21 @@ static int render_state_setup(struct render_state *so)
 
 		d[i++] = s;
 	}
+
+	/* Any golden context BB entries to modify? */
+	if ((rodata->mod_values[0].offset != -1) &&
+	    (rodata->mod_values[0].max_cnt > 0))
+		ret = gc_modify_values(dev, rodata, d);
+	else
+		ret = 0;
+
 	kunmap(page);
 
+	if (ret) {
+		DRM_ERROR("error modifying golden context batch buffer contents\n");
+		return ret;
+	}
+
 	ret = i915_gem_object_set_to_gtt_domain(so->obj, false);
 	if (ret)
 		return ret;
@@ -143,7 +200,7 @@ int i915_gem_render_state_prepare(struct intel_engine_cs *ring,
 	if (so->rodata == NULL)
 		return 0;
 
-	ret = render_state_setup(so);
+	ret = render_state_setup(so, ring->dev);
 	if (ret) {
 		i915_gem_render_state_fini(so);
 		return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.h b/drivers/gpu/drm/i915/i915_gem_render_state.h
index 7aa7372..0898559 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.h
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.h
@@ -26,8 +26,15 @@
 
 #include <linux/types.h>
 
+struct mod_value_info {
+	u32 offset;
+	u32 max_cnt;
+};
+
 struct intel_renderstate_rodata {
 	const u32 *reloc;
+	const struct mod_value_info *mod_values;
+	const u32 mod_value_items;
 	const u32 *batch;
 	const u32 batch_items;
 };
diff --git a/drivers/gpu/drm/i915/intel_renderstate.h b/drivers/gpu/drm/i915/intel_renderstate.h
index 5bd6985..1a80279 100644
--- a/drivers/gpu/drm/i915/intel_renderstate.h
+++ b/drivers/gpu/drm/i915/intel_renderstate.h
@@ -34,8 +34,12 @@ extern const struct intel_renderstate_rodata gen9_null_state;
 #define RO_RENDERSTATE(_g)						\
 	const struct intel_renderstate_rodata gen ## _g ## _null_state = { \
 		.reloc = gen ## _g ## _null_state_relocs,		\
+		.mod_values = gen ## _g ## _mod_values,			\
+		.mod_value_items = (sizeof(gen ## _g ## _mod_values) /	\
+				    sizeof(struct mod_value_info)),	\
 		.batch = gen ## _g ## _null_state_batch,		\
-		.batch_items = sizeof(gen ## _g ## _null_state_batch)/4, \
+		.batch_items = (sizeof(gen ## _g ## _null_state_batch) / \
+				sizeof(u32))				\
 	}
 
 #endif /* INTEL_RENDERSTATE_H */
diff --git a/drivers/gpu/drm/i915/intel_renderstate_gen6.c b/drivers/gpu/drm/i915/intel_renderstate_gen6.c
index 11c8e7b..ef29069 100644
--- a/drivers/gpu/drm/i915/intel_renderstate_gen6.c
+++ b/drivers/gpu/drm/i915/intel_renderstate_gen6.c
@@ -34,6 +34,10 @@ static const u32 gen6_null_state_relocs[] = {
 	-1,
 };
 
+static const struct mod_value_info gen6_mod_values[] = {
+	{-1, 0},
+};
+
 static const u32 gen6_null_state_batch[] = {
 	0x69040000,
 	0x790d0001,
diff --git a/drivers/gpu/drm/i915/intel_renderstate_gen7.c b/drivers/gpu/drm/i915/intel_renderstate_gen7.c
index 6551806..408e901 100644
--- a/drivers/gpu/drm/i915/intel_renderstate_gen7.c
+++ b/drivers/gpu/drm/i915/intel_renderstate_gen7.c
@@ -33,6 +33,10 @@ static const u32 gen7_null_state_relocs[] = {
 	-1,
 };
 
+static const struct mod_value_info gen7_mod_values[] = {
+	{-1, 0},
+};
+
 static const u32 gen7_null_state_batch[] = {
 	0x69040000,
 	0x61010008,
diff --git a/drivers/gpu/drm/i915/intel_renderstate_gen8.c b/drivers/gpu/drm/i915/intel_renderstate_gen8.c
index 95288a3..d5b1383 100644
--- a/drivers/gpu/drm/i915/intel_renderstate_gen8.c
+++ b/drivers/gpu/drm/i915/intel_renderstate_gen8.c
@@ -33,6 +33,10 @@ static const u32 gen8_null_state_relocs[] = {
 	-1,
 };
 
+static const struct mod_value_info gen8_mod_values[] = {
+	{-1, 0},
+};
+
 static const u32 gen8_null_state_batch[] = {
 	0x7a000004,
 	0x01000000,
diff --git a/drivers/gpu/drm/i915/intel_renderstate_gen9.c b/drivers/gpu/drm/i915/intel_renderstate_gen9.c
index 16a7ec2..7cb43a2 100644
--- a/drivers/gpu/drm/i915/intel_renderstate_gen9.c
+++ b/drivers/gpu/drm/i915/intel_renderstate_gen9.c
@@ -33,6 +33,10 @@ static const u32 gen9_null_state_relocs[] = {
 	-1,
 };
 
+static const struct mod_value_info gen9_mod_values[] = {
+	{-1, 0},
+};
+
 static const u32 gen9_null_state_batch[] = {
 	0x7a000004,
 	0x01000000,
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC 2/2] drm/i915/bxt: Enable pooled EUs for BXT
  2015-07-10 17:35 [RFC 0/2] Add Pooled EU support Arun Siluvery
  2015-07-10 17:35 ` [RFC 1/2] drm/i915: Offsets for golden context BB modification Arun Siluvery
@ 2015-07-10 17:35 ` Arun Siluvery
  2015-07-11 19:05 ` [RFC 0/2] Add Pooled EU support Chris Wilson
  2015-07-13 10:16 ` Mika Kuoppala
  3 siblings, 0 replies; 8+ messages in thread
From: Arun Siluvery @ 2015-07-10 17:35 UTC (permalink / raw)
  To: intel-gfx

From: Armin Reese <armin.c.reese@intel.com>

The pooled EU feature for BXT will be enabled by the GEN9
golden context BB.  Pooling EUs allows more execution units
to be available for rendering operations and should result
in improved performance.  The golden context batch buffer
is used to enable the feature so it will be available as
the system's default render state.

v1 - Original patch

v2 - Rebased

v3 - Allow the driver to modify multiple 'modification ranges'
within the golden context batch buffer.

Signed-off-by: Armin Reese <armin.c.reese@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_render_state.c  | 120 +++++++++++++++++++-------
 drivers/gpu/drm/i915/intel_renderstate_gen6.c |   2 +-
 drivers/gpu/drm/i915/intel_renderstate_gen7.c |   2 +-
 drivers/gpu/drm/i915/intel_renderstate_gen8.c |   2 +-
 drivers/gpu/drm/i915/intel_renderstate_gen9.c |  16 ++--
 5 files changed, 102 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 818233d..7e25610 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -45,44 +45,104 @@ render_state_get_rodata(struct drm_device *dev, const int gen)
 	return NULL;
 }
 
+/* This function writes the platform's golden context 'modification values'
+ * to the specified ranges in the golden context batch buffer
+ */
+static int gc_write_values( const struct intel_renderstate_rodata *rodata,
+				int mod_index,
+				int range_offset,
+				int num_vals,
+				u32 *mod_val,
+				u32 *d)
+{
+	/* mod_index 	- The index to the i'th 'mod_value_info' element
+	 * 		  in intel_renderstate_genx.c
+	 * range_offset - If the 'mod_value_info' element specifies a range of
+	 * 		  values (more than one value can be modified), the
+	 * 		  range_offset is the u32 position in this range at
+	 * 		  which writing new values occurs.  Normally, range
+	 *		 _offset is '0'.
+	 * num_vals	- Number of u32 values to be written in the range
+	 * mod_val	- Array of u32 values to write into this golden
+	 * 		  context BB range
+	 * d		- Pointer to the golden context batch buffer
+	 *
+	 */
+	if (mod_index < rodata->mod_value_items) {
+		int mod_offset, mod_max_cnt, i;
+
+		/* Start byte offset of modification range in golden context
+		 * batch buffer */
+		mod_offset =
+			rodata->mod_values[mod_index].offset;
+		/* Max # of u32 values which can be written for this range */
+		mod_max_cnt =
+			rodata->mod_values[mod_index].max_cnt;
+
+		/* Check for DWORD aligned address, et al */
+		if ((num_vals + range_offset > mod_max_cnt) ||
+		    (mod_offset <= 0) ||
+		    (mod_offset > PAGE_SIZE - (mod_max_cnt * sizeof(u32))) ||
+		    ((mod_offset & (sizeof(u32) - 1)) != 0))
+			return -EINVAL;
+
+		/* Copy the new values into golden context BB */
+		for (i = 0; i < num_vals; i++) {
+			d[(mod_offset/sizeof(u32)) + range_offset + i] = mod_val[i];
+		}
+	} else
+		return -EINVAL;
+
+	return 0;
+}
+
 /**
  * Offsets for golden context "value modifications" defined in
  * intel_renderstate_genx.c are locations in the batch buffer
  * where the driver is allowed to modify one or more DWORDs
  * to customize instructions for the GEN platform present.
  */
-static int gc_modify_values(struct drm_device *dev,
+static int gc_modify_platform_values(struct drm_device *dev,
 			 const struct intel_renderstate_rodata *rodata,
 			 u32 *d)
 {
-	/* Init index to "nothing to modify" value */
-	int mod_index = -1;
-	int mod_val[1];
-	int num_vals = 0;
-
-	/* Write required value(s) to the specified offset(s) */
-	if ((mod_index >= 0) &&
-	    (num_vals > 0)) {
-		if (mod_index < rodata->mod_value_items) {
-			int mod_offset, mod_max_cnt, i;
-
-			mod_offset =
-				rodata->mod_values[mod_index].offset;
-			mod_max_cnt =
-				rodata->mod_values[mod_index].max_cnt;
-
-			/* Check for DWORD aligned address, et al */
-			if ((num_vals > mod_max_cnt) ||
-			    (mod_offset <= 0) ||
-			    (mod_offset > PAGE_SIZE - (mod_max_cnt * sizeof(u32))) ||
-			    ((mod_offset & (sizeof(u32) - 1)) != 0))
-				return -EINVAL;
-
-			for (i = 0; i < num_vals; i++) {
-				d[(mod_offset/sizeof(u32)) + i] = mod_val[i];
-			}
-		} else
-			return -EINVAL;
+	/* Modifications to golden context BB for Broxton */
+	if (IS_BROXTON(dev)) {
+		int ret, mod_val[1];
+		/* Step 1 ...
+		 * Remove SKL's MI_BATCH_BUFFER_END command in the mod_index 0
+		 * range to allow the golden context batch buffer to proceed to
+		 * the BXT specific commands setting up pooled EUs
+		 */
+		mod_val[0] = 0;
+		ret = gc_write_values(rodata,
+					0, /* mod_index 0 */
+					0, /* range_offset */
+					1, /* num_vals */
+					mod_val, d);
+		if (ret < 0)
+			return ret;
+
+		/* TODO - Integrate Jeff McGee's patches to determine
+		 * whether the BXT is 2x6 or 3x6.  For now, assume 2x6
+		 */
+		#define IS_BROXTON_2x6 1  /* Eventually get rid of this */
+
+		if (IS_BROXTON_2x6) {
+			/* Step 2 ...
+			 * Replace existing PoolBitField-Slice0 value in the
+			 * mod_index 1 range with '0', the value for 2x6
+			 * pooled EUs
+			 */
+			mod_val[0] = 0;
+			ret = gc_write_values(rodata,
+						1, /* mod_index 1 */
+						0, /* range_offset, u32 */
+						1, /* num_vals */
+						mod_val, d);
+			if (ret < 0)
+				return ret;
+		}
 	}
 
 	return 0;
@@ -156,7 +216,7 @@ static int render_state_setup(struct render_state *so, struct drm_device *dev)
 	/* Any golden context BB entries to modify? */
 	if ((rodata->mod_values[0].offset != -1) &&
 	    (rodata->mod_values[0].max_cnt > 0))
-		ret = gc_modify_values(dev, rodata, d);
+		ret = gc_modify_platform_values(dev, rodata, d);
 	else
 		ret = 0;
 
diff --git a/drivers/gpu/drm/i915/intel_renderstate_gen6.c b/drivers/gpu/drm/i915/intel_renderstate_gen6.c
index ef29069..ef861d1 100644
--- a/drivers/gpu/drm/i915/intel_renderstate_gen6.c
+++ b/drivers/gpu/drm/i915/intel_renderstate_gen6.c
@@ -35,7 +35,7 @@ static const u32 gen6_null_state_relocs[] = {
 };
 
 static const struct mod_value_info gen6_mod_values[] = {
-	{-1, 0},
+	{-1, 0}
 };
 
 static const u32 gen6_null_state_batch[] = {
diff --git a/drivers/gpu/drm/i915/intel_renderstate_gen7.c b/drivers/gpu/drm/i915/intel_renderstate_gen7.c
index 408e901..53013a2 100644
--- a/drivers/gpu/drm/i915/intel_renderstate_gen7.c
+++ b/drivers/gpu/drm/i915/intel_renderstate_gen7.c
@@ -34,7 +34,7 @@ static const u32 gen7_null_state_relocs[] = {
 };
 
 static const struct mod_value_info gen7_mod_values[] = {
-	{-1, 0},
+	{-1, 0}
 };
 
 static const u32 gen7_null_state_batch[] = {
diff --git a/drivers/gpu/drm/i915/intel_renderstate_gen8.c b/drivers/gpu/drm/i915/intel_renderstate_gen8.c
index d5b1383..0115efd 100644
--- a/drivers/gpu/drm/i915/intel_renderstate_gen8.c
+++ b/drivers/gpu/drm/i915/intel_renderstate_gen8.c
@@ -34,7 +34,7 @@ static const u32 gen8_null_state_relocs[] = {
 };
 
 static const struct mod_value_info gen8_mod_values[] = {
-	{-1, 0},
+	{-1, 0}
 };
 
 static const u32 gen8_null_state_batch[] = {
diff --git a/drivers/gpu/drm/i915/intel_renderstate_gen9.c b/drivers/gpu/drm/i915/intel_renderstate_gen9.c
index 7cb43a2..0475f40 100644
--- a/drivers/gpu/drm/i915/intel_renderstate_gen9.c
+++ b/drivers/gpu/drm/i915/intel_renderstate_gen9.c
@@ -34,7 +34,9 @@ static const u32 gen9_null_state_relocs[] = {
 };
 
 static const struct mod_value_info gen9_mod_values[] = {
-	{-1, 0},
+	{0x00000dd4, 2},
+	{0x00000de8, 4},
+	{-1, 0}
 };
 
 static const u32 gen9_null_state_batch[] = {
@@ -923,16 +925,16 @@ static const u32 gen9_null_state_batch[] = {
 	0x00000001,
 	0x00000000,
 	0x00000000,
-	0x05000000,	 /* cmds end */
-	0x00000000,
-	0x00000000,
-	0x00000000,
-	0x00000000,
-	0x00000000,
+	0x05000000,	 /* 2 mod_values */
 	0x00000000,
+	0x69040302,
+	0x70050004,
+	0x80000000,
+	0x00777000,	 /* 4 mod_values */
 	0x00000000,
 	0x00000000,
 	0x00000000,
+	0x05000000,	 /* cmds end */
 	0x00000000,
 	0x00000000,	 /* state start */
 	0x00000000,
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC 0/2] Add Pooled EU support
  2015-07-10 17:35 [RFC 0/2] Add Pooled EU support Arun Siluvery
  2015-07-10 17:35 ` [RFC 1/2] drm/i915: Offsets for golden context BB modification Arun Siluvery
  2015-07-10 17:35 ` [RFC 2/2] drm/i915/bxt: Enable pooled EUs for BXT Arun Siluvery
@ 2015-07-11 19:05 ` Chris Wilson
  2015-07-11 19:09   ` Chris Wilson
  2015-07-13 10:16 ` Mika Kuoppala
  3 siblings, 1 reply; 8+ messages in thread
From: Chris Wilson @ 2015-07-11 19:05 UTC (permalink / raw)
  To: Arun Siluvery; +Cc: intel-gfx

On Fri, Jul 10, 2015 at 06:35:18PM +0100, Arun Siluvery wrote:
> These patches enabled Pooled EU support for BXT, they are implemented
> by Armin Reese. I am sending these patches in its current form for comments.
> 
> These patches modify Golden batch to have a set of modification values
> where we can change the commands based on Gen. The commands to enable
> Pooled EU are inserted after MI_BATCH_BUFFER_END. If the given Gen
> supports this feature, modification values are used to replace
> MI_BATCH_BUFFER_END so we send commands to enable Pooled EU. These
> commands need to be part of this batch because they are to be
> initialized only once. Userspace will have option to query the
> availability of this feature, those changes are not included in
> this series.

Would it not just be simpler to execute 2 batches? First holding the
basic and common state for the gen, the second using subgen. That we
have a chunk of binary data is nasty, but at least we can point to the
generator and be able to decipher it and recreate it as required. Doing
binary patching on top, on that path lies madness.

What is the minimum instruction sequence required to be able to setup the
default EU state? Is it small enough that carrying it as code in the
kernel is viable (and readable)?

(That actually is critical here as currently we have to juggle multiple
sources and look very carefully at what is being patched - I am not
confident that we will not introduce mistakes in a week's time, let
alone a year or two.)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC 0/2] Add Pooled EU support
  2015-07-11 19:05 ` [RFC 0/2] Add Pooled EU support Chris Wilson
@ 2015-07-11 19:09   ` Chris Wilson
  2015-07-13 15:00     ` Siluvery, Arun
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Wilson @ 2015-07-11 19:09 UTC (permalink / raw)
  To: Arun Siluvery, intel-gfx

On Sat, Jul 11, 2015 at 08:05:05PM +0100, Chris Wilson wrote:
> On Fri, Jul 10, 2015 at 06:35:18PM +0100, Arun Siluvery wrote:
> > These patches enabled Pooled EU support for BXT, they are implemented
> > by Armin Reese. I am sending these patches in its current form for comments.
> > 
> > These patches modify Golden batch to have a set of modification values
> > where we can change the commands based on Gen. The commands to enable
> > Pooled EU are inserted after MI_BATCH_BUFFER_END. If the given Gen
> > supports this feature, modification values are used to replace
> > MI_BATCH_BUFFER_END so we send commands to enable Pooled EU. These
> > commands need to be part of this batch because they are to be
> > initialized only once. Userspace will have option to query the
> > availability of this feature, those changes are not included in
> > this series.
> 
> Would it not just be simpler to execute 2 batches? First holding the
> basic and common state for the gen, the second using subgen. That we
> have a chunk of binary data is nasty, but at least we can point to the
> generator and be able to decipher it and recreate it as required. Doing
> binary patching on top, on that path lies madness.
> 
> What is the minimum instruction sequence required to be able to setup the
> default EU state? Is it small enough that carrying it as code in the
> kernel is viable (and readable)?
> 
> (That actually is critical here as currently we have to juggle multiple
> sources and look very carefully at what is being patched - I am not
> confident that we will not introduce mistakes in a week's time, let
> alone a year or two.)

The alternative is to just say that the patch table is also
autogenerated and for that to be simple and clear, and far more
documentated as it relies on a strict protocol.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC 0/2] Add Pooled EU support
  2015-07-10 17:35 [RFC 0/2] Add Pooled EU support Arun Siluvery
                   ` (2 preceding siblings ...)
  2015-07-11 19:05 ` [RFC 0/2] Add Pooled EU support Chris Wilson
@ 2015-07-13 10:16 ` Mika Kuoppala
  3 siblings, 0 replies; 8+ messages in thread
From: Mika Kuoppala @ 2015-07-13 10:16 UTC (permalink / raw)
  To: Arun Siluvery, intel-gfx

Arun Siluvery <arun.siluvery@linux.intel.com> writes:

> These patches enabled Pooled EU support for BXT, they are implemented
> by Armin Reese. I am sending these patches in its current form for comments.
>
> These patches modify Golden batch to have a set of modification values
> where we can change the commands based on Gen. The commands to enable
> Pooled EU are inserted after MI_BATCH_BUFFER_END. If the given Gen
> supports this feature, modification values are used to replace
> MI_BATCH_BUFFER_END so we send commands to enable Pooled EU. These
> commands need to be part of this batch because they are to be
> initialized only once. Userspace will have option to query the
> availability of this feature, those changes are not included in
> this series.
>
> I would like to upstream this feature and really appreciate any
> comments in this regard.
>

Latest command stream programming guide has this to say
in context initialization:

"Render CS Only: Render state need not be initialized"

If it is so that we get a proper render state from hw,
with 'Restore Inhibit', then we can get rid of golden
context for skl+.

-Mika

> Armin Reese (2):
>   drm/i915: Offsets for golden context BB modification
>   drm/i915/bxt: Enable pooled EUs for BXT
>
>  drivers/gpu/drm/i915/i915_gem_render_state.c  | 125 +++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/i915_gem_render_state.h  |   7 ++
>  drivers/gpu/drm/i915/intel_renderstate.h      |   6 +-
>  drivers/gpu/drm/i915/intel_renderstate_gen6.c |   4 +
>  drivers/gpu/drm/i915/intel_renderstate_gen7.c |   4 +
>  drivers/gpu/drm/i915/intel_renderstate_gen8.c |   4 +
>  drivers/gpu/drm/i915/intel_renderstate_gen9.c |  18 ++--
>  7 files changed, 157 insertions(+), 11 deletions(-)
>
> -- 
> 1.9.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC 0/2] Add Pooled EU support
  2015-07-11 19:09   ` Chris Wilson
@ 2015-07-13 15:00     ` Siluvery, Arun
  2015-07-13 19:58       ` Chris Wilson
  0 siblings, 1 reply; 8+ messages in thread
From: Siluvery, Arun @ 2015-07-13 15:00 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 11/07/2015 20:09, Chris Wilson wrote:
> On Sat, Jul 11, 2015 at 08:05:05PM +0100, Chris Wilson wrote:
>> On Fri, Jul 10, 2015 at 06:35:18PM +0100, Arun Siluvery wrote:
>>> These patches enabled Pooled EU support for BXT, they are implemented
>>> by Armin Reese. I am sending these patches in its current form for comments.
>>>
>>> These patches modify Golden batch to have a set of modification values
>>> where we can change the commands based on Gen. The commands to enable
>>> Pooled EU are inserted after MI_BATCH_BUFFER_END. If the given Gen
>>> supports this feature, modification values are used to replace
>>> MI_BATCH_BUFFER_END so we send commands to enable Pooled EU. These
>>> commands need to be part of this batch because they are to be
>>> initialized only once. Userspace will have option to query the
>>> availability of this feature, those changes are not included in
>>> this series.
>>
>> Would it not just be simpler to execute 2 batches? First holding the
>> basic and common state for the gen, the second using subgen. That we
>> have a chunk of binary data is nasty, but at least we can point to the
>> generator and be able to decipher it and recreate it as required. Doing
>> binary patching on top, on that path lies madness.

I like this idea of sending 2 batches if that is acceptable. In this 
case we don't have to touch the golden batch and hence the generator 
tool and also not worry about using the correct binary blob as header.

the setup in this case would be,

1. send golden batch
2. prepare and send batch to configure pooled EU as per subslice and EU 
count

Why we have a separate tool in the first place, is it not possible to 
carry all of them in code or are there any restrictions in doing so?

>>
>> What is the minimum instruction sequence required to be able to setup the
>> default EU state? Is it small enough that carrying it as code in the
>> kernel is viable (and readable)?
>>
setting up of pooled EU configuration is only few instructions, it can 
be added to the driver.
>> (That actually is critical here as currently we have to juggle multiple
>> sources and look very carefully at what is being patched - I am not
>> confident that we will not introduce mistakes in a week's time, let
>> alone a year or two.)
>
> The alternative is to just say that the patch table is also
> autogenerated and for that to be simple and clear, and far more
> documentated as it relies on a strict protocol.

The patch table is also auto generated using intel_null_state_gen tool 
but it is patched based on Gen.

regards
Arun

> -Chris
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC 0/2] Add Pooled EU support
  2015-07-13 15:00     ` Siluvery, Arun
@ 2015-07-13 19:58       ` Chris Wilson
  0 siblings, 0 replies; 8+ messages in thread
From: Chris Wilson @ 2015-07-13 19:58 UTC (permalink / raw)
  To: Siluvery, Arun; +Cc: intel-gfx

On Mon, Jul 13, 2015 at 04:00:08PM +0100, Siluvery, Arun wrote:
> On 11/07/2015 20:09, Chris Wilson wrote:
> >On Sat, Jul 11, 2015 at 08:05:05PM +0100, Chris Wilson wrote:
> >>On Fri, Jul 10, 2015 at 06:35:18PM +0100, Arun Siluvery wrote:
> >>>These patches enabled Pooled EU support for BXT, they are implemented
> >>>by Armin Reese. I am sending these patches in its current form for comments.
> >>>
> >>>These patches modify Golden batch to have a set of modification values
> >>>where we can change the commands based on Gen. The commands to enable
> >>>Pooled EU are inserted after MI_BATCH_BUFFER_END. If the given Gen
> >>>supports this feature, modification values are used to replace
> >>>MI_BATCH_BUFFER_END so we send commands to enable Pooled EU. These
> >>>commands need to be part of this batch because they are to be
> >>>initialized only once. Userspace will have option to query the
> >>>availability of this feature, those changes are not included in
> >>>this series.
> >>
> >>Would it not just be simpler to execute 2 batches? First holding the
> >>basic and common state for the gen, the second using subgen. That we
> >>have a chunk of binary data is nasty, but at least we can point to the
> >>generator and be able to decipher it and recreate it as required. Doing
> >>binary patching on top, on that path lies madness.
> 
> I like this idea of sending 2 batches if that is acceptable. In this
> case we don't have to touch the golden batch and hence the generator
> tool and also not worry about using the correct binary blob as
> header.
> 
> the setup in this case would be,
> 
> 1. send golden batch
> 2. prepare and send batch to configure pooled EU as per subslice and
> EU count
> 
> Why we have a separate tool in the first place, is it not possible
> to carry all of them in code or are there any restrictions in doing
> so?

Basically we really didn't want to have to carry all the 3dstate
defintions and generators in the kernel. We thought that carrying a small
binary blob (autogenerated from a clear source) would be easier to
maintain and keep the complexity (and code size) out of the kernel.

We may have got the balance wrong, it just takes a bit of trial and
error over the years to decide if the original decision was the correct
one.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-07-13 19:58 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-07-10 17:35 [RFC 0/2] Add Pooled EU support Arun Siluvery
2015-07-10 17:35 ` [RFC 1/2] drm/i915: Offsets for golden context BB modification Arun Siluvery
2015-07-10 17:35 ` [RFC 2/2] drm/i915/bxt: Enable pooled EUs for BXT Arun Siluvery
2015-07-11 19:05 ` [RFC 0/2] Add Pooled EU support Chris Wilson
2015-07-11 19:09   ` Chris Wilson
2015-07-13 15:00     ` Siluvery, Arun
2015-07-13 19:58       ` Chris Wilson
2015-07-13 10:16 ` Mika Kuoppala

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox