public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings.
@ 2013-07-04 18:02 Ben Widawsky
  2013-07-04 18:02 ` [PATCH 2/5] drm/i915: Define some of the eLLC magic Ben Widawsky
                   ` (6 more replies)
  0 siblings, 7 replies; 30+ messages in thread
From: Ben Widawsky @ 2013-07-04 18:02 UTC (permalink / raw)
  To: Intel GFX; +Cc: Daniel Vetter, Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

The cacheability controls have changed, and the bits have been
rearranged in general.

v2: Remove comments for snb/ivb cache leves, that's a separate change.

v3: Resolve conflicts due to patch series reordering.

v4: Rebased on top of Kenneth Graunke's ->pet_encode refactoring.

v5: Removed eLLC bits for separate patch.

In the internal repository this was:
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 66929ea..42262d0 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -33,6 +33,7 @@
 
 /* PPGTT stuff */
 #define GEN6_GTT_ADDR_ENCODE(addr)	((addr) | (((addr) >> 28) & 0xff0))
+#define HSW_GTT_ADDR_ENCODE(addr)	((addr) | (((addr) >> 28) & 0x7f0))
 
 #define GEN6_PDE_VALID			(1 << 0)
 /* gen6+ has bit 11-4 for physical addr bit 39-32 */
@@ -44,6 +45,14 @@
 #define GEN6_PTE_CACHE_LLC		(2 << 1)
 #define GEN6_PTE_CACHE_LLC_MLC		(3 << 1)
 #define GEN6_PTE_ADDR_ENCODE(addr)	GEN6_GTT_ADDR_ENCODE(addr)
+#define HSW_PTE_ADDR_ENCODE(addr)	HSW_GTT_ADDR_ENCODE(addr)
+
+/* Cacheability Control is a 4-bit value. The low three bits are stored in *
+ * bits 3:1 of the PTE, while the fourth bit is stored in bit 11 of the PTE.
+ */
+#define HSW_CACHEABILITY_CONTROL(bits)	((((bits) & 0x7) << 1) | \
+					 (((bits) & 0x8) << (11 - 3)))
+#define HSW_WB_LLC_AGE0			HSW_CACHEABILITY_CONTROL(0x3)
 
 static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
 				      enum i915_cache_level level)
@@ -92,10 +101,10 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
 				     enum i915_cache_level level)
 {
 	gen6_gtt_pte_t pte = GEN6_PTE_VALID;
-	pte |= GEN6_PTE_ADDR_ENCODE(addr);
+	pte |= HSW_PTE_ADDR_ENCODE(addr);
 
 	if (level != I915_CACHE_NONE)
-		pte |= GEN6_PTE_CACHE_LLC;
+		pte |= HSW_WB_LLC_AGE0;
 
 	return pte;
 }
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/5] drm/i915: Define some of the eLLC magic
  2013-07-04 18:02 [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings Ben Widawsky
@ 2013-07-04 18:02 ` Ben Widawsky
  2013-07-13  0:02   ` Rodrigo Vivi
  2013-07-04 18:02 ` [PATCH 3/5] drm/i915: store eLLC size Ben Widawsky
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Ben Widawsky @ 2013-07-04 18:02 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

The EDRAM present register isn't really defined in the docs. It just
says check to see if it's set to 1. So I haven't defined the 1 value not
knowing what it actually means.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 4 ++--
 drivers/gpu/drm/i915/i915_reg.h | 4 ++++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4200c32..edea2cb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4153,8 +4153,8 @@ i915_gem_init_hw(struct drm_device *dev)
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
 		return -EIO;
 
-	if (IS_HASWELL(dev) && (I915_READ(0x120010) == 1))
-		I915_WRITE(0x9008, I915_READ(0x9008) | 0xf0000);
+	if (IS_HASWELL(dev) && (I915_READ(HSW_EDRAM_PRESENT) == 1))
+		I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
 
 	if (HAS_PCH_NOP(dev)) {
 		u32 temp = I915_READ(GEN7_MSG_CTL);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 9b51be8..a2553ed 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -4469,6 +4469,10 @@
 #define  GT_FIFO_FREE_ENTRIES			0x120008
 #define    GT_FIFO_NUM_RESERVED_ENTRIES		20
 
+#define  HSW_IDICR				0x9008
+#define    IDIHASHMSK(x)			(((x) & 0x3f) << 16)
+#define  HSW_EDRAM_PRESENT			0x120010
+
 #define GEN6_UCGCTL1				0x9400
 # define GEN6_BLBUNIT_CLOCK_GATE_DISABLE		(1 << 5)
 # define GEN6_CSUNIT_CLOCK_GATE_DISABLE			(1 << 7)
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 3/5] drm/i915: store eLLC size
  2013-07-04 18:02 [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings Ben Widawsky
  2013-07-04 18:02 ` [PATCH 2/5] drm/i915: Define some of the eLLC magic Ben Widawsky
@ 2013-07-04 18:02 ` Ben Widawsky
  2013-07-04 18:42   ` [PATCH 3.5/5] drm/i915: Do eLLC detection earlier Ben Widawsky
  2013-07-16  6:02   ` [PATCH 3/5] drm/i915: store eLLC size Daniel Vetter
  2013-07-04 18:02 ` [PATCH 4/5] drm/i915: Use eLLC/LLC by default when available Ben Widawsky
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 30+ messages in thread
From: Ben Widawsky @ 2013-07-04 18:02 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

The eLLC cannot be determined by PCIID because as far as we know, even
machines supporting eLLC may not have it enabled, or fused off or
whatever. It's possible this isn't actually true, and at that point we
can switch to a DEV_INFO flag instead.

I've defined everything where the docs are clear, and left the rest as
magic.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h | 3 +++
 drivers/gpu/drm/i915/i915_gem.c | 9 ++++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index fd0f589..c6de881 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1187,6 +1187,9 @@ typedef struct drm_i915_private {
 	/* Old dri1 support infrastructure, beware the dragons ya fools entering
 	 * here! */
 	struct i915_dri1_state dri1;
+
+	/* Cannot be determined by PCIID. You must always read a register. */
+	size_t ellc_size;
 } drm_i915_private_t;
 
 /* Iterate over initialised rings */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index edea2cb..2df993d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4153,8 +4153,15 @@ i915_gem_init_hw(struct drm_device *dev)
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
 		return -EIO;
 
-	if (IS_HASWELL(dev) && (I915_READ(HSW_EDRAM_PRESENT) == 1))
+	if (IS_HASWELL(dev) && (I915_READ(HSW_EDRAM_PRESENT) == 1)) {
 		I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
+		/* The docs do not explain exactly how the calculation can be
+		 * made. It is somewhat guessable, but for now, it's always
+		 * 128MB.
+		 */
+		dev_priv->ellc_size = 128;
+		DRM_INFO("Found %zuMB of eLLC\n", dev_priv->ellc_size);
+	}
 
 	if (HAS_PCH_NOP(dev)) {
 		u32 temp = I915_READ(GEN7_MSG_CTL);
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 4/5] drm/i915: Use eLLC/LLC by default when available
  2013-07-04 18:02 [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings Ben Widawsky
  2013-07-04 18:02 ` [PATCH 2/5] drm/i915: Define some of the eLLC magic Ben Widawsky
  2013-07-04 18:02 ` [PATCH 3/5] drm/i915: store eLLC size Ben Widawsky
@ 2013-07-04 18:02 ` Ben Widawsky
  2013-07-04 18:17   ` Daniel Vetter
  2013-07-04 18:02 ` [PATCH 5/5] drm/i915: debugfs entries for [e]LLC Ben Widawsky
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Ben Widawsky @ 2013-07-04 18:02 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

DRI clients really should be using MOCS to get fine grained streaming
cache controls. With that note, I *hope* that this patch doesn't improve
performance overwhelmingly, because if it does - it means there is a
problem elsewhere.

In any case, the kernel, and old userspace should get some benefit from
this, so let's do it. eLLC is always a good default, and really not
using it is the special case for MOCS.

References: http://www.intel.com/newsroom/kits/restricted/ha$well!/pdfs/4th_Gen_Intel_Core_PressBriefing_5-29.pdf (page 57)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 42262d0..0ff8073 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -53,6 +53,7 @@
 #define HSW_CACHEABILITY_CONTROL(bits)	((((bits) & 0x7) << 1) | \
 					 (((bits) & 0x8) << (11 - 3)))
 #define HSW_WB_LLC_AGE0			HSW_CACHEABILITY_CONTROL(0x3)
+#define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
 
 static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
 				      enum i915_cache_level level)
@@ -109,6 +110,18 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
 	return pte;
 }
 
+static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr,
+				      enum i915_cache_level level)
+{
+	gen6_gtt_pte_t pte = GEN6_PTE_VALID;
+	pte |= HSW_PTE_ADDR_ENCODE(addr);
+
+	if (level != I915_CACHE_NONE)
+		pte |= HSW_WB_ELLC_LLC_AGE0;
+
+	return pte;
+}
+
 static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
@@ -860,7 +873,9 @@ int i915_gem_gtt_init(struct drm_device *dev)
 	} else {
 		gtt->gtt_probe = gen6_gmch_probe;
 		gtt->gtt_remove = gen6_gmch_remove;
-		if (IS_HASWELL(dev))
+		if (IS_HASWELL(dev) && dev_priv->ellc_size)
+			gtt->pte_encode = iris_pte_encode;
+		else if (IS_HASWELL(dev))
 			gtt->pte_encode = hsw_pte_encode;
 		else if (IS_VALLEYVIEW(dev))
 			gtt->pte_encode = byt_pte_encode;
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 5/5] drm/i915: debugfs entries for [e]LLC
  2013-07-04 18:02 [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings Ben Widawsky
                   ` (2 preceding siblings ...)
  2013-07-04 18:02 ` [PATCH 4/5] drm/i915: Use eLLC/LLC by default when available Ben Widawsky
@ 2013-07-04 18:02 ` Ben Widawsky
  2013-07-04 18:14   ` Daniel Vetter
                     ` (2 more replies)
  2013-07-13  0:00 ` [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings Rodrigo Vivi
                   ` (2 subsequent siblings)
  6 siblings, 3 replies; 30+ messages in thread
From: Ben Widawsky @ 2013-07-04 18:02 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

To make users life a little easier figuring out what they have on their
system.

Ideally, I'd really like to report LLC size, but it turned out to be a
bit of a pain. Maybe I'll revisit it in the future.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 3e36756..b75d0a6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1941,6 +1941,19 @@ static int i915_dpio_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+static int i915_llc(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	/* Size calculation for LLC is a bit of a pain. Ignore for now. */
+	seq_printf(m, "LLC: %s\n", yesno(HAS_LLC(dev)));
+	seq_printf(m, "eLLC: %zuMB\n", dev_priv->ellc_size);
+
+	return 0;
+}
+
 static int
 i915_wedged_get(void *data, u64 *val)
 {
@@ -2370,6 +2383,7 @@ static struct drm_info_list i915_debugfs_list[] = {
 	{"i915_swizzle_info", i915_swizzle_info, 0},
 	{"i915_ppgtt_info", i915_ppgtt_info, 0},
 	{"i915_dpio", i915_dpio_info, 0},
+	{"i915_llc", i915_llc, 0},
 };
 #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
 
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/5] drm/i915: debugfs entries for [e]LLC
  2013-07-04 18:02 ` [PATCH 5/5] drm/i915: debugfs entries for [e]LLC Ben Widawsky
@ 2013-07-04 18:14   ` Daniel Vetter
  2013-07-04 18:40     ` Ben Widawsky
  2013-07-04 18:47   ` [PATCH 6/6] drm/i915: Add a param for eLLC size Ben Widawsky
  2013-07-13  0:11   ` [PATCH 5/5] drm/i915: debugfs entries for [e]LLC Rodrigo Vivi
  2 siblings, 1 reply; 30+ messages in thread
From: Daniel Vetter @ 2013-07-04 18:14 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jul 04, 2013 at 11:02:07AM -0700, Ben Widawsky wrote:
> To make users life a little easier figuring out what they have on their
> system.
> 
> Ideally, I'd really like to report LLC size, but it turned out to be a
> bit of a pain. Maybe I'll revisit it in the future.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

I think a getparam for eLLC would be neat, so that usespace can use it to
tune working set sizes.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 3e36756..b75d0a6 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1941,6 +1941,19 @@ static int i915_dpio_info(struct seq_file *m, void *data)
>  	return 0;
>  }
>  
> +static int i915_llc(struct seq_file *m, void *data)
> +{
> +	struct drm_info_node *node = (struct drm_info_node *) m->private;
> +	struct drm_device *dev = node->minor->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +
> +	/* Size calculation for LLC is a bit of a pain. Ignore for now. */
> +	seq_printf(m, "LLC: %s\n", yesno(HAS_LLC(dev)));
> +	seq_printf(m, "eLLC: %zuMB\n", dev_priv->ellc_size);
> +
> +	return 0;
> +}
> +
>  static int
>  i915_wedged_get(void *data, u64 *val)
>  {
> @@ -2370,6 +2383,7 @@ static struct drm_info_list i915_debugfs_list[] = {
>  	{"i915_swizzle_info", i915_swizzle_info, 0},
>  	{"i915_ppgtt_info", i915_ppgtt_info, 0},
>  	{"i915_dpio", i915_dpio_info, 0},
> +	{"i915_llc", i915_llc, 0},
>  };
>  #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
>  
> -- 
> 1.8.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/5] drm/i915: Use eLLC/LLC by default when available
  2013-07-04 18:02 ` [PATCH 4/5] drm/i915: Use eLLC/LLC by default when available Ben Widawsky
@ 2013-07-04 18:17   ` Daniel Vetter
  2013-07-04 18:40     ` Ben Widawsky
  0 siblings, 1 reply; 30+ messages in thread
From: Daniel Vetter @ 2013-07-04 18:17 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jul 04, 2013 at 11:02:06AM -0700, Ben Widawsky wrote:
> DRI clients really should be using MOCS to get fine grained streaming
> cache controls. With that note, I *hope* that this patch doesn't improve
> performance overwhelmingly, because if it does - it means there is a
> problem elsewhere.
> 
> In any case, the kernel, and old userspace should get some benefit from
> this, so let's do it. eLLC is always a good default, and really not
> using it is the special case for MOCS.
> 
> References: http://www.intel.com/newsroom/kits/restricted/ha$well!/pdfs/4th_Gen_Intel_Core_PressBriefing_5-29.pdf (page 57)
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Iris is the marketing name and likely to stick around for a bit (like HD
Graphics), I'd vote to use the codename for this thing here, i.e. crw.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 42262d0..0ff8073 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -53,6 +53,7 @@
>  #define HSW_CACHEABILITY_CONTROL(bits)	((((bits) & 0x7) << 1) | \
>  					 (((bits) & 0x8) << (11 - 3)))
>  #define HSW_WB_LLC_AGE0			HSW_CACHEABILITY_CONTROL(0x3)
> +#define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
>  
>  static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
>  				      enum i915_cache_level level)
> @@ -109,6 +110,18 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
>  	return pte;
>  }
>  
> +static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr,
> +				      enum i915_cache_level level)
> +{
> +	gen6_gtt_pte_t pte = GEN6_PTE_VALID;
> +	pte |= HSW_PTE_ADDR_ENCODE(addr);
> +
> +	if (level != I915_CACHE_NONE)
> +		pte |= HSW_WB_ELLC_LLC_AGE0;
> +
> +	return pte;
> +}
> +
>  static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
>  {
>  	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> @@ -860,7 +873,9 @@ int i915_gem_gtt_init(struct drm_device *dev)
>  	} else {
>  		gtt->gtt_probe = gen6_gmch_probe;
>  		gtt->gtt_remove = gen6_gmch_remove;
> -		if (IS_HASWELL(dev))
> +		if (IS_HASWELL(dev) && dev_priv->ellc_size)
> +			gtt->pte_encode = iris_pte_encode;
> +		else if (IS_HASWELL(dev))
>  			gtt->pte_encode = hsw_pte_encode;
>  		else if (IS_VALLEYVIEW(dev))
>  			gtt->pte_encode = byt_pte_encode;
> -- 
> 1.8.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/5] drm/i915: Use eLLC/LLC by default when available
  2013-07-04 18:17   ` Daniel Vetter
@ 2013-07-04 18:40     ` Ben Widawsky
  2013-07-13  0:08       ` Rodrigo Vivi
  0 siblings, 1 reply; 30+ messages in thread
From: Ben Widawsky @ 2013-07-04 18:40 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Thu, Jul 04, 2013 at 08:17:09PM +0200, Daniel Vetter wrote:
> On Thu, Jul 04, 2013 at 11:02:06AM -0700, Ben Widawsky wrote:
> > DRI clients really should be using MOCS to get fine grained streaming
> > cache controls. With that note, I *hope* that this patch doesn't improve
> > performance overwhelmingly, because if it does - it means there is a
> > problem elsewhere.
> > 
> > In any case, the kernel, and old userspace should get some benefit from
> > this, so let's do it. eLLC is always a good default, and really not
> > using it is the special case for MOCS.
> > 
> > References: http://www.intel.com/newsroom/kits/restricted/ha$well!/pdfs/4th_Gen_Intel_Core_PressBriefing_5-29.pdf (page 57)
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Iris is the marketing name and likely to stick around for a bit (like HD
> Graphics), I'd vote to use the codename for this thing here, i.e. crw.
> -Daniel
>
I think we've agreed on IRC to leave this as is?
-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/5] drm/i915: debugfs entries for [e]LLC
  2013-07-04 18:14   ` Daniel Vetter
@ 2013-07-04 18:40     ` Ben Widawsky
  2013-07-04 18:43       ` Daniel Vetter
  0 siblings, 1 reply; 30+ messages in thread
From: Ben Widawsky @ 2013-07-04 18:40 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Thu, Jul 04, 2013 at 08:14:41PM +0200, Daniel Vetter wrote:
> On Thu, Jul 04, 2013 at 11:02:07AM -0700, Ben Widawsky wrote:
> > To make users life a little easier figuring out what they have on their
> > system.
> > 
> > Ideally, I'd really like to report LLC size, but it turned out to be a
> > bit of a pain. Maybe I'll revisit it in the future.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> I think a getparam for eLLC would be neat, so that usespace can use it to
> tune working set sizes.
> -Daniel
>
And I assume drop debugfs?
-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 3.5/5] drm/i915: Do eLLC detection earlier
  2013-07-04 18:02 ` [PATCH 3/5] drm/i915: store eLLC size Ben Widawsky
@ 2013-07-04 18:42   ` Ben Widawsky
  2013-07-13  0:04     ` Rodrigo Vivi
  2013-07-13  9:39     ` Daniel Vetter
  2013-07-16  6:02   ` [PATCH 3/5] drm/i915: store eLLC size Daniel Vetter
  1 sibling, 2 replies; 30+ messages in thread
From: Ben Widawsky @ 2013-07-04 18:42 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

We need it before we set the pte_encode function pointers, which happens
really early, in gtt_init.

The problem with just doing the normal sequence earlier is we don't have
the ability to use forcewake until after the pte functions have been set
up.

Since all solutions are somewhat ugly (barring rewriting all the init
ordering), I've opted to do the detection really early, and the enabling
later - since the register to detect doesn't require forcewake.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_dma.c | 10 ++++++++++
 drivers/gpu/drm/i915/i915_gem.c |  9 +--------
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 0e22142..7eda8ab 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1524,6 +1524,16 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 
 	intel_early_sanitize_regs(dev);
 
+	if (IS_HASWELL(dev) && (I915_READ(HSW_EDRAM_PRESENT) == 1)) {
+		/* The docs do not explain exactly how the calculation can be
+		 * made. It is somewhat guessable, but for now, it's always
+		 * 128MB.
+		 * NB: We can't write IDICR yet because we do not have gt funcs
+		 * set up */
+		dev_priv->ellc_size = 128;
+		DRM_INFO("Found %zuMB of eLLC\n", dev_priv->ellc_size);
+	}
+
 	ret = i915_gem_gtt_init(dev);
 	if (ret)
 		goto put_bridge;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2df993d..f9834f2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4153,15 +4153,8 @@ i915_gem_init_hw(struct drm_device *dev)
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
 		return -EIO;
 
-	if (IS_HASWELL(dev) && (I915_READ(HSW_EDRAM_PRESENT) == 1)) {
+	if (dev_priv->ellc_size)
 		I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
-		/* The docs do not explain exactly how the calculation can be
-		 * made. It is somewhat guessable, but for now, it's always
-		 * 128MB.
-		 */
-		dev_priv->ellc_size = 128;
-		DRM_INFO("Found %zuMB of eLLC\n", dev_priv->ellc_size);
-	}
 
 	if (HAS_PCH_NOP(dev)) {
 		u32 temp = I915_READ(GEN7_MSG_CTL);
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/5] drm/i915: debugfs entries for [e]LLC
  2013-07-04 18:40     ` Ben Widawsky
@ 2013-07-04 18:43       ` Daniel Vetter
  2013-07-04 18:46         ` Ben Widawsky
  0 siblings, 1 reply; 30+ messages in thread
From: Daniel Vetter @ 2013-07-04 18:43 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jul 4, 2013 at 8:40 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Thu, Jul 04, 2013 at 08:14:41PM +0200, Daniel Vetter wrote:
>> On Thu, Jul 04, 2013 at 11:02:07AM -0700, Ben Widawsky wrote:
>> > To make users life a little easier figuring out what they have on their
>> > system.
>> >
>> > Ideally, I'd really like to report LLC size, but it turned out to be a
>> > bit of a pain. Maybe I'll revisit it in the future.
>> >
>> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>>
>> I think a getparam for eLLC would be neat, so that usespace can use it to
>> tune working set sizes.
>> -Daniel
>>
> And I assume drop debugfs?

Yeah, I guess the DRM_INFO message in dmesg should be good enough
then. For userspace's convenience we could even look into exposing the
LLC size with a getparam.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/5] drm/i915: debugfs entries for [e]LLC
  2013-07-04 18:43       ` Daniel Vetter
@ 2013-07-04 18:46         ` Ben Widawsky
  2013-07-09 18:35           ` Chad Versace
  0 siblings, 1 reply; 30+ messages in thread
From: Ben Widawsky @ 2013-07-04 18:46 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Thu, Jul 04, 2013 at 08:43:58PM +0200, Daniel Vetter wrote:
> On Thu, Jul 4, 2013 at 8:40 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > On Thu, Jul 04, 2013 at 08:14:41PM +0200, Daniel Vetter wrote:
> >> On Thu, Jul 04, 2013 at 11:02:07AM -0700, Ben Widawsky wrote:
> >> > To make users life a little easier figuring out what they have on their
> >> > system.
> >> >
> >> > Ideally, I'd really like to report LLC size, but it turned out to be a
> >> > bit of a pain. Maybe I'll revisit it in the future.
> >> >
> >> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> >>
> >> I think a getparam for eLLC would be neat, so that usespace can use it to
> >> tune working set sizes.
> >> -Daniel
> >>
> > And I assume drop debugfs?
> 
> Yeah, I guess the DRM_INFO message in dmesg should be good enough
> then. For userspace's convenience we could even look into exposing the
> LLC size with a getparam.
> -Daniel
>

I would like to do this since we have easy access to cpuid. I know Chad
really wants it. If you'll accept the patch, I'll write it.

> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 6/6] drm/i915: Add a param for eLLC size
  2013-07-04 18:02 ` [PATCH 5/5] drm/i915: debugfs entries for [e]LLC Ben Widawsky
  2013-07-04 18:14   ` Daniel Vetter
@ 2013-07-04 18:47   ` Ben Widawsky
  2013-07-16  6:10     ` Daniel Vetter
  2013-07-13  0:11   ` [PATCH 5/5] drm/i915: debugfs entries for [e]LLC Rodrigo Vivi
  2 siblings, 1 reply; 30+ messages in thread
From: Ben Widawsky @ 2013-07-04 18:47 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_dma.c | 3 +++
 include/uapi/drm/i915_drm.h     | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 7eda8ab..435fc4a 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1000,6 +1000,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_EXEC_HANDLE_LUT:
 		value = 1;
 		break;
+	case I915_PARAM_ELLC_SIZE:
+		value = dev_priv->ellc_size;
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 923ed7f..56bb327 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -310,6 +310,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_PINNED_BATCHES	 24
 #define I915_PARAM_HAS_EXEC_NO_RELOC	 25
 #define I915_PARAM_HAS_EXEC_HANDLE_LUT   26
+#define I915_PARAM_ELLC_SIZE		 27
 
 typedef struct drm_i915_getparam {
 	int param;
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/5] drm/i915: debugfs entries for [e]LLC
  2013-07-04 18:46         ` Ben Widawsky
@ 2013-07-09 18:35           ` Chad Versace
  2013-07-09 20:16             ` Ben Widawsky
  0 siblings, 1 reply; 30+ messages in thread
From: Chad Versace @ 2013-07-09 18:35 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On 07/04/2013 11:46 AM, Ben Widawsky wrote:
> On Thu, Jul 04, 2013 at 08:43:58PM +0200, Daniel Vetter wrote:
>> On Thu, Jul 4, 2013 at 8:40 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
>>> On Thu, Jul 04, 2013 at 08:14:41PM +0200, Daniel Vetter wrote:
>>>> On Thu, Jul 04, 2013 at 11:02:07AM -0700, Ben Widawsky wrote:
>>>>> To make users life a little easier figuring out what they have on their
>>>>> system.
>>>>>
>>>>> Ideally, I'd really like to report LLC size, but it turned out to be a
>>>>> bit of a pain. Maybe I'll revisit it in the future.
>>>>>
>>>>> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>>>>
>>>> I think a getparam for eLLC would be neat, so that usespace can use it to
>>>> tune working set sizes.
>>>> -Daniel
>>>>
>>> And I assume drop debugfs?
>>
>> Yeah, I guess the DRM_INFO message in dmesg should be good enough
>> then. For userspace's convenience we could even look into exposing the
>> LLC size with a getparam.
>> -Daniel
>>
>
> I would like to do this since we have easy access to cpuid. I know Chad
> really wants it. If you'll accept the patch, I'll write it.

I really want to know the cache sizes.

Actually, I didn't expect the kernel to do this for me. So, I've prototyped
a patch for Mesa to probe the cache sizes with CPUID. If the
kernel does that for Mesa, then I can likely drop my Mesa patch.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/5] drm/i915: debugfs entries for [e]LLC
  2013-07-09 18:35           ` Chad Versace
@ 2013-07-09 20:16             ` Ben Widawsky
  0 siblings, 0 replies; 30+ messages in thread
From: Ben Widawsky @ 2013-07-09 20:16 UTC (permalink / raw)
  To: Chad Versace; +Cc: Intel GFX

On Tue, Jul 09, 2013 at 11:35:38AM -0700, Chad Versace wrote:
> On 07/04/2013 11:46 AM, Ben Widawsky wrote:
> >On Thu, Jul 04, 2013 at 08:43:58PM +0200, Daniel Vetter wrote:
> >>On Thu, Jul 4, 2013 at 8:40 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> >>>On Thu, Jul 04, 2013 at 08:14:41PM +0200, Daniel Vetter wrote:
> >>>>On Thu, Jul 04, 2013 at 11:02:07AM -0700, Ben Widawsky wrote:
> >>>>>To make users life a little easier figuring out what they have on their
> >>>>>system.
> >>>>>
> >>>>>Ideally, I'd really like to report LLC size, but it turned out to be a
> >>>>>bit of a pain. Maybe I'll revisit it in the future.
> >>>>>
> >>>>>Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> >>>>
> >>>>I think a getparam for eLLC would be neat, so that usespace can use it to
> >>>>tune working set sizes.
> >>>>-Daniel
> >>>>
> >>>And I assume drop debugfs?
> >>
> >>Yeah, I guess the DRM_INFO message in dmesg should be good enough
> >>then. For userspace's convenience we could even look into exposing the
> >>LLC size with a getparam.
> >>-Daniel
> >>
> >
> >I would like to do this since we have easy access to cpuid. I know Chad
> >really wants it. If you'll accept the patch, I'll write it.
> 
> I really want to know the cache sizes.
> 
> Actually, I didn't expect the kernel to do this for me. So, I've prototyped
> a patch for Mesa to probe the cache sizes with CPUID. If the
> kernel does that for Mesa, then I can likely drop my Mesa patch.
> 

I think if we need to do the work, it makes sense to do it in the kernel
since the other components can easily take advantage of it. I can even
shoehorn it in to the existing LLC param.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings.
  2013-07-04 18:02 [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings Ben Widawsky
                   ` (3 preceding siblings ...)
  2013-07-04 18:02 ` [PATCH 5/5] drm/i915: debugfs entries for [e]LLC Ben Widawsky
@ 2013-07-13  0:00 ` Rodrigo Vivi
  2013-07-14 20:34   ` Ben Widawsky
  2013-07-15 14:16 ` Damien Lespiau
  2013-07-15 14:23 ` Damien Lespiau
  6 siblings, 1 reply; 30+ messages in thread
From: Rodrigo Vivi @ 2013-07-13  0:00 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Daniel Vetter, Intel GFX, Ben Widawsky

Hi Ben,

sorry for taking so long to look at your patches.
Well, since I changed my TI password I'm not able to see bspec
anymore, so I couldn't verify many things that I'm going to ask many
questions.
If the answer is on spec or any other doc please send me in pvt!


On Thu, Jul 4, 2013 at 3:02 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
>
> The cacheability controls have changed, and the bits have been
> rearranged in general.
>
> v2: Remove comments for snb/ivb cache leves, that's a separate change.
>
> v3: Resolve conflicts due to patch series reordering.
>
> v4: Rebased on top of Kenneth Graunke's ->pet_encode refactoring.

pet like in leper?! :P
just kidding never mind.

>
> v5: Removed eLLC bits for separate patch.
>
> In the internal repository this was:
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 66929ea..42262d0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -33,6 +33,7 @@
>
>  /* PPGTT stuff */
>  #define GEN6_GTT_ADDR_ENCODE(addr)     ((addr) | (((addr) >> 28) & 0xff0))
> +#define HSW_GTT_ADDR_ENCODE(addr)      ((addr) | (((addr) >> 28) & 0x7f0))

why not masking bit 10 anymore?

>
>  #define GEN6_PDE_VALID                 (1 << 0)
>  /* gen6+ has bit 11-4 for physical addr bit 39-32 */
> @@ -44,6 +45,14 @@
>  #define GEN6_PTE_CACHE_LLC             (2 << 1)
>  #define GEN6_PTE_CACHE_LLC_MLC         (3 << 1)
>  #define GEN6_PTE_ADDR_ENCODE(addr)     GEN6_GTT_ADDR_ENCODE(addr)
> +#define HSW_PTE_ADDR_ENCODE(addr)      HSW_GTT_ADDR_ENCODE(addr)
> +
> +/* Cacheability Control is a 4-bit value. The low three bits are stored in *
> + * bits 3:1 of the PTE, while the fourth bit is stored in bit 11 of the PTE.
> + */
> +#define HSW_CACHEABILITY_CONTROL(bits) ((((bits) & 0x7) << 1) | \
> +                                        (((bits) & 0x8) << (11 - 3)))
> +#define HSW_WB_LLC_AGE0                        HSW_CACHEABILITY_CONTROL(0x3)

where did you get that? and are you really using that in any other patch?

>
>  static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
>                                       enum i915_cache_level level)
> @@ -92,10 +101,10 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
>                                      enum i915_cache_level level)
>  {
>         gen6_gtt_pte_t pte = GEN6_PTE_VALID;
> -       pte |= GEN6_PTE_ADDR_ENCODE(addr);
> +       pte |= HSW_PTE_ADDR_ENCODE(addr);
>
>         if (level != I915_CACHE_NONE)
> -               pte |= GEN6_PTE_CACHE_LLC;
> +               pte |= HSW_WB_LLC_AGE0;
>
>         return pte;
>  }
> --
> 1.8.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



--
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/5] drm/i915: Define some of the eLLC magic
  2013-07-04 18:02 ` [PATCH 2/5] drm/i915: Define some of the eLLC magic Ben Widawsky
@ 2013-07-13  0:02   ` Rodrigo Vivi
  2013-07-14 20:36     ` Ben Widawsky
  0 siblings, 1 reply; 30+ messages in thread
From: Rodrigo Vivi @ 2013-07-13  0:02 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jul 4, 2013 at 3:02 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> The EDRAM present register isn't really defined in the docs. It just
> says check to see if it's set to 1. So I haven't defined the 1 value not
> knowing what it actually means.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 4 ++--
>  drivers/gpu/drm/i915/i915_reg.h | 4 ++++
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 4200c32..edea2cb 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4153,8 +4153,8 @@ i915_gem_init_hw(struct drm_device *dev)
>         if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
>                 return -EIO;
>
> -       if (IS_HASWELL(dev) && (I915_READ(0x120010) == 1))
> -               I915_WRITE(0x9008, I915_READ(0x9008) | 0xf0000);
> +       if (IS_HASWELL(dev) && (I915_READ(HSW_EDRAM_PRESENT) == 1))
> +               I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));

even though you said doc doesn't define this register please fwd me
the doc... or please explain what is that
sorry about that, but as I said I'm without access :(

>
>         if (HAS_PCH_NOP(dev)) {
>                 u32 temp = I915_READ(GEN7_MSG_CTL);
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 9b51be8..a2553ed 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -4469,6 +4469,10 @@
>  #define  GT_FIFO_FREE_ENTRIES                  0x120008
>  #define    GT_FIFO_NUM_RESERVED_ENTRIES                20
>
> +#define  HSW_IDICR                             0x9008
> +#define    IDIHASHMSK(x)                       (((x) & 0x3f) << 16)
> +#define  HSW_EDRAM_PRESENT                     0x120010
> +
>  #define GEN6_UCGCTL1                           0x9400
>  # define GEN6_BLBUNIT_CLOCK_GATE_DISABLE               (1 << 5)
>  # define GEN6_CSUNIT_CLOCK_GATE_DISABLE                        (1 << 7)
> --
> 1.8.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



--
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3.5/5] drm/i915: Do eLLC detection earlier
  2013-07-04 18:42   ` [PATCH 3.5/5] drm/i915: Do eLLC detection earlier Ben Widawsky
@ 2013-07-13  0:04     ` Rodrigo Vivi
  2013-07-13  9:39     ` Daniel Vetter
  1 sibling, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-07-13  0:04 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

to be honest I didn't understood the other idea with forcewake, but
this way is fine for me, so:

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>

On Thu, Jul 4, 2013 at 3:42 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> We need it before we set the pte_encode function pointers, which happens
> really early, in gtt_init.
>
> The problem with just doing the normal sequence earlier is we don't have
> the ability to use forcewake until after the pte functions have been set
> up.
>
> Since all solutions are somewhat ugly (barring rewriting all the init
> ordering), I've opted to do the detection really early, and the enabling
> later - since the register to detect doesn't require forcewake.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_dma.c | 10 ++++++++++
>  drivers/gpu/drm/i915/i915_gem.c |  9 +--------
>  2 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 0e22142..7eda8ab 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1524,6 +1524,16 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>
>         intel_early_sanitize_regs(dev);
>
> +       if (IS_HASWELL(dev) && (I915_READ(HSW_EDRAM_PRESENT) == 1)) {
> +               /* The docs do not explain exactly how the calculation can be
> +                * made. It is somewhat guessable, but for now, it's always
> +                * 128MB.
> +                * NB: We can't write IDICR yet because we do not have gt funcs
> +                * set up */
> +               dev_priv->ellc_size = 128;
> +               DRM_INFO("Found %zuMB of eLLC\n", dev_priv->ellc_size);
> +       }
> +
>         ret = i915_gem_gtt_init(dev);
>         if (ret)
>                 goto put_bridge;
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 2df993d..f9834f2 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4153,15 +4153,8 @@ i915_gem_init_hw(struct drm_device *dev)
>         if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
>                 return -EIO;
>
> -       if (IS_HASWELL(dev) && (I915_READ(HSW_EDRAM_PRESENT) == 1)) {
> +       if (dev_priv->ellc_size)
>                 I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
> -               /* The docs do not explain exactly how the calculation can be
> -                * made. It is somewhat guessable, but for now, it's always
> -                * 128MB.
> -                */
> -               dev_priv->ellc_size = 128;
> -               DRM_INFO("Found %zuMB of eLLC\n", dev_priv->ellc_size);
> -       }
>
>         if (HAS_PCH_NOP(dev)) {
>                 u32 temp = I915_READ(GEN7_MSG_CTL);
> --
> 1.8.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/5] drm/i915: Use eLLC/LLC by default when available
  2013-07-04 18:40     ` Ben Widawsky
@ 2013-07-13  0:08       ` Rodrigo Vivi
  0 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-07-13  0:08 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

I don't have a strong opinion on iris x crw and you know that I like
to split different things in different functions, however in this case
I think either iris or crw are still hsw
so if it is possible to get  dev_priv->ellc_size inside hsw_pte_encode
somehow I in favor of not creating another function for iris...
otherwise, fell free to use:
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>

On Thu, Jul 4, 2013 at 3:40 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Thu, Jul 04, 2013 at 08:17:09PM +0200, Daniel Vetter wrote:
>> On Thu, Jul 04, 2013 at 11:02:06AM -0700, Ben Widawsky wrote:
>> > DRI clients really should be using MOCS to get fine grained streaming
>> > cache controls. With that note, I *hope* that this patch doesn't improve
>> > performance overwhelmingly, because if it does - it means there is a
>> > problem elsewhere.
>> >
>> > In any case, the kernel, and old userspace should get some benefit from
>> > this, so let's do it. eLLC is always a good default, and really not
>> > using it is the special case for MOCS.
>> >
>> > References: http://www.intel.com/newsroom/kits/restricted/ha$well!/pdfs/4th_Gen_Intel_Core_PressBriefing_5-29.pdf (page 57)
>> >
>> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>>
>> Iris is the marketing name and likely to stick around for a bit (like HD
>> Graphics), I'd vote to use the codename for this thing here, i.e. crw.
>> -Daniel
>>
> I think we've agreed on IRC to leave this as is?
> --
> Ben Widawsky, Intel Open Source Technology Center
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/5] drm/i915: debugfs entries for [e]LLC
  2013-07-04 18:02 ` [PATCH 5/5] drm/i915: debugfs entries for [e]LLC Ben Widawsky
  2013-07-04 18:14   ` Daniel Vetter
  2013-07-04 18:47   ` [PATCH 6/6] drm/i915: Add a param for eLLC size Ben Widawsky
@ 2013-07-13  0:11   ` Rodrigo Vivi
  2 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-07-13  0:11 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>

I actually tested it too here on my ult... 0Mb ellc as expected...
unfortunately I don't have any cristallwell to test it for real ;)

On Thu, Jul 4, 2013 at 3:02 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> To make users life a little easier figuring out what they have on their
> system.
>
> Ideally, I'd really like to report LLC size, but it turned out to be a
> bit of a pain. Maybe I'll revisit it in the future.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 3e36756..b75d0a6 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1941,6 +1941,19 @@ static int i915_dpio_info(struct seq_file *m, void *data)
>         return 0;
>  }
>
> +static int i915_llc(struct seq_file *m, void *data)
> +{
> +       struct drm_info_node *node = (struct drm_info_node *) m->private;
> +       struct drm_device *dev = node->minor->dev;
> +       struct drm_i915_private *dev_priv = dev->dev_private;
> +
> +       /* Size calculation for LLC is a bit of a pain. Ignore for now. */
> +       seq_printf(m, "LLC: %s\n", yesno(HAS_LLC(dev)));
> +       seq_printf(m, "eLLC: %zuMB\n", dev_priv->ellc_size);
> +
> +       return 0;
> +}
> +
>  static int
>  i915_wedged_get(void *data, u64 *val)
>  {
> @@ -2370,6 +2383,7 @@ static struct drm_info_list i915_debugfs_list[] = {
>         {"i915_swizzle_info", i915_swizzle_info, 0},
>         {"i915_ppgtt_info", i915_ppgtt_info, 0},
>         {"i915_dpio", i915_dpio_info, 0},
> +       {"i915_llc", i915_llc, 0},
>  };
>  #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
>
> --
> 1.8.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3.5/5] drm/i915: Do eLLC detection earlier
  2013-07-04 18:42   ` [PATCH 3.5/5] drm/i915: Do eLLC detection earlier Ben Widawsky
  2013-07-13  0:04     ` Rodrigo Vivi
@ 2013-07-13  9:39     ` Daniel Vetter
  2013-07-14 20:37       ` Ben Widawsky
  1 sibling, 1 reply; 30+ messages in thread
From: Daniel Vetter @ 2013-07-13  9:39 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jul 04, 2013 at 11:42:46AM -0700, Ben Widawsky wrote:
> We need it before we set the pte_encode function pointers, which happens
> really early, in gtt_init.
> 
> The problem with just doing the normal sequence earlier is we don't have
> the ability to use forcewake until after the pte functions have been set
> up.
> 
> Since all solutions are somewhat ugly (barring rewriting all the init
> ordering), I've opted to do the detection really early, and the enabling
> later - since the register to detect doesn't require forcewake.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Looking at patch 3.5 here and patch 3 I think those should be squashed
together. Ok if I do that when applying?
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_dma.c | 10 ++++++++++
>  drivers/gpu/drm/i915/i915_gem.c |  9 +--------
>  2 files changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 0e22142..7eda8ab 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1524,6 +1524,16 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  
>  	intel_early_sanitize_regs(dev);
>  
> +	if (IS_HASWELL(dev) && (I915_READ(HSW_EDRAM_PRESENT) == 1)) {
> +		/* The docs do not explain exactly how the calculation can be
> +		 * made. It is somewhat guessable, but for now, it's always
> +		 * 128MB.
> +		 * NB: We can't write IDICR yet because we do not have gt funcs
> +		 * set up */
> +		dev_priv->ellc_size = 128;
> +		DRM_INFO("Found %zuMB of eLLC\n", dev_priv->ellc_size);
> +	}
> +
>  	ret = i915_gem_gtt_init(dev);
>  	if (ret)
>  		goto put_bridge;
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 2df993d..f9834f2 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4153,15 +4153,8 @@ i915_gem_init_hw(struct drm_device *dev)
>  	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
>  		return -EIO;
>  
> -	if (IS_HASWELL(dev) && (I915_READ(HSW_EDRAM_PRESENT) == 1)) {
> +	if (dev_priv->ellc_size)
>  		I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
> -		/* The docs do not explain exactly how the calculation can be
> -		 * made. It is somewhat guessable, but for now, it's always
> -		 * 128MB.
> -		 */
> -		dev_priv->ellc_size = 128;
> -		DRM_INFO("Found %zuMB of eLLC\n", dev_priv->ellc_size);
> -	}
>  
>  	if (HAS_PCH_NOP(dev)) {
>  		u32 temp = I915_READ(GEN7_MSG_CTL);
> -- 
> 1.8.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings.
  2013-07-13  0:00 ` [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings Rodrigo Vivi
@ 2013-07-14 20:34   ` Ben Widawsky
  0 siblings, 0 replies; 30+ messages in thread
From: Ben Widawsky @ 2013-07-14 20:34 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: Daniel Vetter, Ben Widawsky, Intel GFX

On Fri, Jul 12, 2013 at 09:00:31PM -0300, Rodrigo Vivi wrote:
> Hi Ben,
> 
> sorry for taking so long to look at your patches.
> Well, since I changed my TI password I'm not able to see bspec
> anymore, so I couldn't verify many things that I'm going to ask many
> questions.
> If the answer is on spec or any other doc please send me in pvt!

I think since we haven't yet published HSW docs, it's pretty fair of you
to ask, and for me to address it publicly.

> 
> 
> On Thu, Jul 4, 2013 at 3:02 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > From: Ben Widawsky <benjamin.widawsky@intel.com>
> >
> > The cacheability controls have changed, and the bits have been
> > rearranged in general.
> >
> > v2: Remove comments for snb/ivb cache leves, that's a separate change.
> >
> > v3: Resolve conflicts due to patch series reordering.
> >
> > v4: Rebased on top of Kenneth Graunke's ->pet_encode refactoring.
> 
> pet like in leper?! :P
> just kidding never mind.

heh, fixedi locally, thanks.
> 
> >
> > v5: Removed eLLC bits for separate patch.
> >
> > In the internal repository this was:
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 13 +++++++++++--
> >  1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 66929ea..42262d0 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -33,6 +33,7 @@
> >
> >  /* PPGTT stuff */
> >  #define GEN6_GTT_ADDR_ENCODE(addr)     ((addr) | (((addr) >> 28) & 0xff0))
> > +#define HSW_GTT_ADDR_ENCODE(addr)      ((addr) | (((addr) >> 28) & 0x7f0))
> 
> why not masking bit 10 anymore?

You're asking the wrong guy... The HW just supports one less address
bit. The bit gets used for some of the cacheability flags later in the
series.

> 
> >
> >  #define GEN6_PDE_VALID                 (1 << 0)
> >  /* gen6+ has bit 11-4 for physical addr bit 39-32 */
> > @@ -44,6 +45,14 @@
> >  #define GEN6_PTE_CACHE_LLC             (2 << 1)
> >  #define GEN6_PTE_CACHE_LLC_MLC         (3 << 1)
> >  #define GEN6_PTE_ADDR_ENCODE(addr)     GEN6_GTT_ADDR_ENCODE(addr)
> > +#define HSW_PTE_ADDR_ENCODE(addr)      HSW_GTT_ADDR_ENCODE(addr)
> > +
> > +/* Cacheability Control is a 4-bit value. The low three bits are stored in *
> > + * bits 3:1 of the PTE, while the fourth bit is stored in bit 11 of the PTE.
> > + */
> > +#define HSW_CACHEABILITY_CONTROL(bits) ((((bits) & 0x7) << 1) | \
> > +                                        (((bits) & 0x8) << (11 - 3)))
> > +#define HSW_WB_LLC_AGE0                        HSW_CACHEABILITY_CONTROL(0x3)
> 
> where did you get that? and are you really using that in any other patch?

It's in a table in the spec. Unfortunately there is no formula defined
in the spec, but I came up with this one (actually my original formula
had a bug, fixed by Ken).

Not sure what you mean by, "using that in any other patch." I'm not sure
why is that relevant, it's used below.

> 
> >
> >  static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
> >                                       enum i915_cache_level level)
> > @@ -92,10 +101,10 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
> >                                      enum i915_cache_level level)
> >  {
> >         gen6_gtt_pte_t pte = GEN6_PTE_VALID;
> > -       pte |= GEN6_PTE_ADDR_ENCODE(addr);
> > +       pte |= HSW_PTE_ADDR_ENCODE(addr);
> >
> >         if (level != I915_CACHE_NONE)
> > -               pte |= GEN6_PTE_CACHE_LLC;
> > +               pte |= HSW_WB_LLC_AGE0;
> >
> >         return pte;
> >  }
> > --
> > 1.8.3
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> 
> 
> --
> Rodrigo Vivi
> Blog: http://blog.vivi.eng.br

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/5] drm/i915: Define some of the eLLC magic
  2013-07-13  0:02   ` Rodrigo Vivi
@ 2013-07-14 20:36     ` Ben Widawsky
  0 siblings, 0 replies; 30+ messages in thread
From: Ben Widawsky @ 2013-07-14 20:36 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: Intel GFX

On Fri, Jul 12, 2013 at 09:02:37PM -0300, Rodrigo Vivi wrote:
> On Thu, Jul 4, 2013 at 3:02 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > The EDRAM present register isn't really defined in the docs. It just
> > says check to see if it's set to 1. So I haven't defined the 1 value not
> > knowing what it actually means.
> >
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 4 ++--
> >  drivers/gpu/drm/i915/i915_reg.h | 4 ++++
> >  2 files changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 4200c32..edea2cb 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4153,8 +4153,8 @@ i915_gem_init_hw(struct drm_device *dev)
> >         if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
> >                 return -EIO;
> >
> > -       if (IS_HASWELL(dev) && (I915_READ(0x120010) == 1))
> > -               I915_WRITE(0x9008, I915_READ(0x9008) | 0xf0000);
> > +       if (IS_HASWELL(dev) && (I915_READ(HSW_EDRAM_PRESENT) == 1))
> > +               I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
> 
> even though you said doc doesn't define this register please fwd me
> the doc... or please explain what is that
> sorry about that, but as I said I'm without access :(

It's just in the normal bspec. When I said the doc doesn't define it,
what I meant is, the values 0x120010, and 1 have no symbolic names in
the docs. It simply says we should read the thing. If you need me to
clarify more, maybe we can discuss it on IRC later.

> 
> >
> >         if (HAS_PCH_NOP(dev)) {
> >                 u32 temp = I915_READ(GEN7_MSG_CTL);
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 9b51be8..a2553ed 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -4469,6 +4469,10 @@
> >  #define  GT_FIFO_FREE_ENTRIES                  0x120008
> >  #define    GT_FIFO_NUM_RESERVED_ENTRIES                20
> >
> > +#define  HSW_IDICR                             0x9008
> > +#define    IDIHASHMSK(x)                       (((x) & 0x3f) << 16)
> > +#define  HSW_EDRAM_PRESENT                     0x120010
> > +
> >  #define GEN6_UCGCTL1                           0x9400
> >  # define GEN6_BLBUNIT_CLOCK_GATE_DISABLE               (1 << 5)
> >  # define GEN6_CSUNIT_CLOCK_GATE_DISABLE                        (1 << 7)
> > --
> > 1.8.3
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> 
> 
> --
> Rodrigo Vivi
> Blog: http://blog.vivi.eng.br

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3.5/5] drm/i915: Do eLLC detection earlier
  2013-07-13  9:39     ` Daniel Vetter
@ 2013-07-14 20:37       ` Ben Widawsky
  0 siblings, 0 replies; 30+ messages in thread
From: Ben Widawsky @ 2013-07-14 20:37 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Sat, Jul 13, 2013 at 11:39:44AM +0200, Daniel Vetter wrote:
> On Thu, Jul 04, 2013 at 11:42:46AM -0700, Ben Widawsky wrote:
> > We need it before we set the pte_encode function pointers, which happens
> > really early, in gtt_init.
> > 
> > The problem with just doing the normal sequence earlier is we don't have
> > the ability to use forcewake until after the pte functions have been set
> > up.
> > 
> > Since all solutions are somewhat ugly (barring rewriting all the init
> > ordering), I've opted to do the detection really early, and the enabling
> > later - since the register to detect doesn't require forcewake.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Looking at patch 3.5 here and patch 3 I think those should be squashed
> together. Ok if I do that when applying?
> -Daniel

Of course.

[snip]
-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings.
  2013-07-04 18:02 [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings Ben Widawsky
                   ` (4 preceding siblings ...)
  2013-07-13  0:00 ` [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings Rodrigo Vivi
@ 2013-07-15 14:16 ` Damien Lespiau
  2013-07-15 14:23 ` Damien Lespiau
  6 siblings, 0 replies; 30+ messages in thread
From: Damien Lespiau @ 2013-07-15 14:16 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Daniel Vetter, Intel GFX, Ben Widawsky

On Thu, Jul 04, 2013 at 11:02:03AM -0700, Ben Widawsky wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
> 
> The cacheability controls have changed, and the bits have been
> rearranged in general.
> 
> v2: Remove comments for snb/ivb cache leves, that's a separate change.
> 
> v3: Resolve conflicts due to patch series reordering.
> 
> v4: Rebased on top of Kenneth Graunke's ->pet_encode refactoring.
> 
> v5: Removed eLLC bits for separate patch.
> 
> In the internal repository this was:
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>

-- 
Damien

>  drivers/gpu/drm/i915/i915_gem_gtt.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 66929ea..42262d0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -33,6 +33,7 @@
>  
>  /* PPGTT stuff */
>  #define GEN6_GTT_ADDR_ENCODE(addr)	((addr) | (((addr) >> 28) & 0xff0))
> +#define HSW_GTT_ADDR_ENCODE(addr)	((addr) | (((addr) >> 28) & 0x7f0))
>  
>  #define GEN6_PDE_VALID			(1 << 0)
>  /* gen6+ has bit 11-4 for physical addr bit 39-32 */
> @@ -44,6 +45,14 @@
>  #define GEN6_PTE_CACHE_LLC		(2 << 1)
>  #define GEN6_PTE_CACHE_LLC_MLC		(3 << 1)
>  #define GEN6_PTE_ADDR_ENCODE(addr)	GEN6_GTT_ADDR_ENCODE(addr)
> +#define HSW_PTE_ADDR_ENCODE(addr)	HSW_GTT_ADDR_ENCODE(addr)
> +
> +/* Cacheability Control is a 4-bit value. The low three bits are stored in *
> + * bits 3:1 of the PTE, while the fourth bit is stored in bit 11 of the PTE.
> + */
> +#define HSW_CACHEABILITY_CONTROL(bits)	((((bits) & 0x7) << 1) | \
> +					 (((bits) & 0x8) << (11 - 3)))
> +#define HSW_WB_LLC_AGE0			HSW_CACHEABILITY_CONTROL(0x3)
>  
>  static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
>  				      enum i915_cache_level level)
> @@ -92,10 +101,10 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
>  				     enum i915_cache_level level)
>  {
>  	gen6_gtt_pte_t pte = GEN6_PTE_VALID;
> -	pte |= GEN6_PTE_ADDR_ENCODE(addr);
> +	pte |= HSW_PTE_ADDR_ENCODE(addr);
>  
>  	if (level != I915_CACHE_NONE)
> -		pte |= GEN6_PTE_CACHE_LLC;
> +		pte |= HSW_WB_LLC_AGE0;
>  
>  	return pte;
>  }
> -- 
> 1.8.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings.
  2013-07-04 18:02 [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings Ben Widawsky
                   ` (5 preceding siblings ...)
  2013-07-15 14:16 ` Damien Lespiau
@ 2013-07-15 14:23 ` Damien Lespiau
  2013-07-15 16:54   ` Ben Widawsky
  6 siblings, 1 reply; 30+ messages in thread
From: Damien Lespiau @ 2013-07-15 14:23 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Daniel Vetter, Intel GFX, Ben Widawsky

On Thu, Jul 04, 2013 at 11:02:03AM -0700, Ben Widawsky wrote:
> +/* Cacheability Control is a 4-bit value. The low three bits are stored in *
> + * bits 3:1 of the PTE, while the fourth bit is stored in bit 11 of the PTE.
> + */
> +#define HSW_CACHEABILITY_CONTROL(bits)	((((bits) & 0x7) << 1) | \
> +					 (((bits) & 0x8) << (11 - 3)))
> +#define HSW_WB_LLC_AGE0			HSW_CACHEABILITY_CONTROL(0x3)

One small note, an age of '0' means "old" as in it's likely to be
evicted before buffers aged 3, 2 or 1. We don't use any other age yet,
so it doesn't matter for now, but might in the future.

-- 
Damien

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings.
  2013-07-15 14:23 ` Damien Lespiau
@ 2013-07-15 16:54   ` Ben Widawsky
  2013-07-16  6:00     ` Daniel Vetter
  0 siblings, 1 reply; 30+ messages in thread
From: Ben Widawsky @ 2013-07-15 16:54 UTC (permalink / raw)
  To: Damien Lespiau; +Cc: Daniel Vetter, Ben Widawsky, Intel GFX

On Mon, Jul 15, 2013 at 03:23:00PM +0100, Damien Lespiau wrote:
> On Thu, Jul 04, 2013 at 11:02:03AM -0700, Ben Widawsky wrote:
> > +/* Cacheability Control is a 4-bit value. The low three bits are stored in *
> > + * bits 3:1 of the PTE, while the fourth bit is stored in bit 11 of the PTE.
> > + */
> > +#define HSW_CACHEABILITY_CONTROL(bits)	((((bits) & 0x7) << 1) | \
> > +					 (((bits) & 0x8) << (11 - 3)))
> > +#define HSW_WB_LLC_AGE0			HSW_CACHEABILITY_CONTROL(0x3)
> 
> One small note, an age of '0' means "old" as in it's likely to be
> evicted before buffers aged 3, 2 or 1. We don't use any other age yet,
> so it doesn't matter for now, but might in the future.
> 
> -- 
> Damien
>
FWIW, I have no intention of using any ages in the kernel. We can pick 3
equally well. Maybe in a way off future if or when we decide to have the
kernel try to track which cacheability to use for objects, we'll care.

Daniel, would you mind adding a comment on merge? Damien is correct 3 is
youngest, 0 is oldest.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings.
  2013-07-15 16:54   ` Ben Widawsky
@ 2013-07-16  6:00     ` Daniel Vetter
  0 siblings, 0 replies; 30+ messages in thread
From: Daniel Vetter @ 2013-07-16  6:00 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Daniel Vetter, Intel GFX, Ben Widawsky

On Mon, Jul 15, 2013 at 09:54:35AM -0700, Ben Widawsky wrote:
> On Mon, Jul 15, 2013 at 03:23:00PM +0100, Damien Lespiau wrote:
> > On Thu, Jul 04, 2013 at 11:02:03AM -0700, Ben Widawsky wrote:
> > > +/* Cacheability Control is a 4-bit value. The low three bits are stored in *
> > > + * bits 3:1 of the PTE, while the fourth bit is stored in bit 11 of the PTE.
> > > + */
> > > +#define HSW_CACHEABILITY_CONTROL(bits)	((((bits) & 0x7) << 1) | \
> > > +					 (((bits) & 0x8) << (11 - 3)))
> > > +#define HSW_WB_LLC_AGE0			HSW_CACHEABILITY_CONTROL(0x3)
> > 
> > One small note, an age of '0' means "old" as in it's likely to be
> > evicted before buffers aged 3, 2 or 1. We don't use any other age yet,
> > so it doesn't matter for now, but might in the future.
> > 
> > -- 
> > Damien
> >
> FWIW, I have no intention of using any ages in the kernel. We can pick 3
> equally well. Maybe in a way off future if or when we decide to have the
> kernel try to track which cacheability to use for objects, we'll care.
> 
> Daniel, would you mind adding a comment on merge? Damien is correct 3 is
> youngest, 0 is oldest.

Done and merged, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/5] drm/i915: store eLLC size
  2013-07-04 18:02 ` [PATCH 3/5] drm/i915: store eLLC size Ben Widawsky
  2013-07-04 18:42   ` [PATCH 3.5/5] drm/i915: Do eLLC detection earlier Ben Widawsky
@ 2013-07-16  6:02   ` Daniel Vetter
  1 sibling, 0 replies; 30+ messages in thread
From: Daniel Vetter @ 2013-07-16  6:02 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jul 04, 2013 at 11:02:05AM -0700, Ben Widawsky wrote:
> The eLLC cannot be determined by PCIID because as far as we know, even
> machines supporting eLLC may not have it enabled, or fused off or
> whatever. It's possible this isn't actually true, and at that point we
> can switch to a DEV_INFO flag instead.
> 
> I've defined everything where the docs are clear, and left the rest as
> magic.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 3 +++
>  drivers/gpu/drm/i915/i915_gem.c | 9 ++++++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index fd0f589..c6de881 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1187,6 +1187,9 @@ typedef struct drm_i915_private {
>  	/* Old dri1 support infrastructure, beware the dragons ya fools entering
>  	 * here! */
>  	struct i915_dri1_state dri1;
> +
> +	/* Cannot be determined by PCIID. You must always read a register. */
> +	size_t ellc_size;

Thou shalt not put useful stuff next to the dri/ums dungeons! Really,
we've tried a bit to clean up the giant mess that is our driver private
structure, I expect people to look for more than 5 secs for a suitable
place for this ;-)

Fixed while applying.
-Daniel

>  } drm_i915_private_t;
>  
>  /* Iterate over initialised rings */
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index edea2cb..2df993d 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4153,8 +4153,15 @@ i915_gem_init_hw(struct drm_device *dev)
>  	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
>  		return -EIO;
>  
> -	if (IS_HASWELL(dev) && (I915_READ(HSW_EDRAM_PRESENT) == 1))
> +	if (IS_HASWELL(dev) && (I915_READ(HSW_EDRAM_PRESENT) == 1)) {
>  		I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
> +		/* The docs do not explain exactly how the calculation can be
> +		 * made. It is somewhat guessable, but for now, it's always
> +		 * 128MB.
> +		 */
> +		dev_priv->ellc_size = 128;
> +		DRM_INFO("Found %zuMB of eLLC\n", dev_priv->ellc_size);
> +	}
>  
>  	if (HAS_PCH_NOP(dev)) {
>  		u32 temp = I915_READ(GEN7_MSG_CTL);
> -- 
> 1.8.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 6/6] drm/i915: Add a param for eLLC size
  2013-07-04 18:47   ` [PATCH 6/6] drm/i915: Add a param for eLLC size Ben Widawsky
@ 2013-07-16  6:10     ` Daniel Vetter
  0 siblings, 0 replies; 30+ messages in thread
From: Daniel Vetter @ 2013-07-16  6:10 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jul 04, 2013 at 11:47:44AM -0700, Ben Widawsky wrote:
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Ok, I've slurped the entire series into dinq, with the exception of this
patch - I'm waiting for the jury to figure out whether we can do this in
userspace with cpuid or not. If not I think the commit message should be
amended with that information and then I'll merge it.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_dma.c | 3 +++
>  include/uapi/drm/i915_drm.h     | 1 +
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 7eda8ab..435fc4a 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1000,6 +1000,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
>  	case I915_PARAM_HAS_EXEC_HANDLE_LUT:
>  		value = 1;
>  		break;
> +	case I915_PARAM_ELLC_SIZE:
> +		value = dev_priv->ellc_size;
> +		break;
>  	default:
>  		DRM_DEBUG("Unknown parameter %d\n", param->param);
>  		return -EINVAL;
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 923ed7f..56bb327 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -310,6 +310,7 @@ typedef struct drm_i915_irq_wait {
>  #define I915_PARAM_HAS_PINNED_BATCHES	 24
>  #define I915_PARAM_HAS_EXEC_NO_RELOC	 25
>  #define I915_PARAM_HAS_EXEC_HANDLE_LUT   26
> +#define I915_PARAM_ELLC_SIZE		 27
>  
>  typedef struct drm_i915_getparam {
>  	int param;
> -- 
> 1.8.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2013-07-16  6:10 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-07-04 18:02 [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings Ben Widawsky
2013-07-04 18:02 ` [PATCH 2/5] drm/i915: Define some of the eLLC magic Ben Widawsky
2013-07-13  0:02   ` Rodrigo Vivi
2013-07-14 20:36     ` Ben Widawsky
2013-07-04 18:02 ` [PATCH 3/5] drm/i915: store eLLC size Ben Widawsky
2013-07-04 18:42   ` [PATCH 3.5/5] drm/i915: Do eLLC detection earlier Ben Widawsky
2013-07-13  0:04     ` Rodrigo Vivi
2013-07-13  9:39     ` Daniel Vetter
2013-07-14 20:37       ` Ben Widawsky
2013-07-16  6:02   ` [PATCH 3/5] drm/i915: store eLLC size Daniel Vetter
2013-07-04 18:02 ` [PATCH 4/5] drm/i915: Use eLLC/LLC by default when available Ben Widawsky
2013-07-04 18:17   ` Daniel Vetter
2013-07-04 18:40     ` Ben Widawsky
2013-07-13  0:08       ` Rodrigo Vivi
2013-07-04 18:02 ` [PATCH 5/5] drm/i915: debugfs entries for [e]LLC Ben Widawsky
2013-07-04 18:14   ` Daniel Vetter
2013-07-04 18:40     ` Ben Widawsky
2013-07-04 18:43       ` Daniel Vetter
2013-07-04 18:46         ` Ben Widawsky
2013-07-09 18:35           ` Chad Versace
2013-07-09 20:16             ` Ben Widawsky
2013-07-04 18:47   ` [PATCH 6/6] drm/i915: Add a param for eLLC size Ben Widawsky
2013-07-16  6:10     ` Daniel Vetter
2013-07-13  0:11   ` [PATCH 5/5] drm/i915: debugfs entries for [e]LLC Rodrigo Vivi
2013-07-13  0:00 ` [PATCH 1/5] drm/i915/hsw: Set correct Haswell PTE encodings Rodrigo Vivi
2013-07-14 20:34   ` Ben Widawsky
2013-07-15 14:16 ` Damien Lespiau
2013-07-15 14:23 ` Damien Lespiau
2013-07-15 16:54   ` Ben Widawsky
2013-07-16  6:00     ` Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox