Intel-GFX Archive on lore.kernel.org
From: Daniel Vetter <daniel@ffwll.ch>
To: Daniel Vetter <daniel.vetter@ffwll.ch>,
	Keith Packard <keithp@keithp.com>,
	intel-gfx <intel-gfx@lists.freedesktop.org>
Subject: Re: [PATCH 20/43] drm/i915: swizzling support for snb/ivb
Date: Tue, 31 Jan 2012 09:42:46 +0100
Message-ID: <20120131084246.GA3911@phenom.ffwll.local>
In-Reply-To: <20120131074435.GA20494@seagal.amr.corp.intel.com>

On Mon, Jan 30, 2012 at 11:44:35PM -0800, Ben Widawsky wrote:
> On Wed, Dec 14, 2011 at 01:57:17PM +0100, Daniel Vetter wrote:
> > We have to do this manually. Somebody had a Great Idea.
> > 
> > Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> Okay, my configdb access isn't working, so please forgive me if my
> inline comments are not correct... I'll try to review it again when my
> access works again.
> 
> > ---
> >  drivers/gpu/drm/i915/i915_dma.c        |    2 +-
> >  drivers/gpu/drm/i915/i915_drv.c        |    4 ++-
> >  drivers/gpu/drm/i915/i915_drv.h        |    3 +-
> >  drivers/gpu/drm/i915/i915_gem.c        |   23 ++++++++++++++++++++-
> >  drivers/gpu/drm/i915/i915_gem_tiling.c |   16 +++++++++++++-
> >  drivers/gpu/drm/i915/i915_reg.h        |   33 ++++++++++++++++++++++++++++++++
> >  6 files changed, 74 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 448d5b1..4c21c67 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1202,7 +1202,7 @@ static int i915_load_gem_init(struct drm_device *dev)
> >  	i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
> >  
> >  	mutex_lock(&dev->struct_mutex);
> > -	ret = i915_gem_init_ringbuffer(dev);
> > +	ret = i915_gem_init_hw(dev);
> >  	mutex_unlock(&dev->struct_mutex);
> >  	if (ret)
> >  		return ret;
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 6dd219b..f12b43e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -494,7 +494,7 @@ static int i915_drm_thaw(struct drm_device *dev)
> >  		mutex_lock(&dev->struct_mutex);
> >  		dev_priv->mm.suspended = 0;
> >  
> > -		error = i915_gem_init_ringbuffer(dev);
> > +		error = i915_gem_init_hw(dev);
> >  		mutex_unlock(&dev->struct_mutex);
> >  
> >  		if (HAS_PCH_SPLIT(dev))
> > @@ -690,6 +690,8 @@ int i915_reset(struct drm_device *dev, u8 flags)
> >  			!dev_priv->mm.suspended) {
> >  		dev_priv->mm.suspended = 0;
> >  
> > +		i915_gem_init_swizzling(dev);
> > +
> >  		dev_priv->ring[RCS].init(&dev_priv->ring[RCS]);
> >  		if (HAS_BSD(dev))
> >  		    dev_priv->ring[VCS].init(&dev_priv->ring[VCS]);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index b46fac5..311a4e1 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1206,7 +1206,8 @@ int __must_check i915_gem_object_set_domain(struct drm_i915_gem_object *obj,
> >  					    uint32_t read_domains,
> >  					    uint32_t write_domain);
> >  int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
> > -int __must_check i915_gem_init_ringbuffer(struct drm_device *dev);
> > +int __must_check i915_gem_init_hw(struct drm_device *dev);
> > +void i915_gem_init_swizzling(struct drm_device *dev);
> >  void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
> >  void i915_gem_do_init(struct drm_device *dev,
> >  		      unsigned long start,
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index e995248..39459d2 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3744,12 +3744,31 @@ i915_gem_idle(struct drm_device *dev)
> >  	return 0;
> >  }
> >  
> > +void i915_gem_init_swizzling(struct drm_device *dev)
> > +{
> > +	drm_i915_private_t *dev_priv = dev->dev_private;
> > +
> > +	if (INTEL_INFO(dev)->gen < 6 ||
> > +	    dev_priv->mm.bit_6_swizzle_x == I915_BIT_6_SWIZZLE_NONE)
> > +		return;
> > +
> > +	I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_SWZCTL);
> > +	if (IS_GEN6(dev))
> > +		I915_WRITE(ARB_MODE, ARB_MODE_ENABLE(ARB_MODE_SWIZZLE_SNB));
> > +	else
> > +		I915_WRITE(ARB_MODE, ARB_MODE_ENABLE(ARB_MODE_SWIZZLE_IVB));
> > +	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
> > +				 DISP_TILE_SURFACE_SWIZZLING);
> > +
> > +}
> >  int
> > -i915_gem_init_ringbuffer(struct drm_device *dev)
> > +i915_gem_init_hw(struct drm_device *dev)
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> >  	int ret;
> >  
> > +	i915_gem_init_swizzling(dev);
> > +
> >  	ret = intel_init_render_ring_buffer(dev);
> >  	if (ret)
> >  		return ret;
> > @@ -3805,7 +3824,7 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
> >  	mutex_lock(&dev->struct_mutex);
> >  	dev_priv->mm.suspended = 0;
> >  
> > -	ret = i915_gem_init_ringbuffer(dev);
> > +	ret = i915_gem_init_hw(dev);
> >  	if (ret != 0) {
> >  		mutex_unlock(&dev->struct_mutex);
> >  		return ret;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > index 861223b..af0a2fc 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > @@ -93,8 +93,20 @@ i915_gem_detect_bit_6_swizzle(struct drm_device *dev)
> >  	uint32_t swizzle_y = I915_BIT_6_SWIZZLE_UNKNOWN;
> >  
> >  	if (INTEL_INFO(dev)->gen >= 6) {
> > -		swizzle_x = I915_BIT_6_SWIZZLE_NONE;
> > -		swizzle_y = I915_BIT_6_SWIZZLE_NONE;
> > +		uint32_t dimm_c0, dimm_c1;
> > +		dimm_c0 = I915_READ(MAD_DIMM_C0);
> > +		dimm_c1 = I915_READ(MAD_DIMM_C1);
> > +		dimm_c0 &= MAD_DIMM_A_SIZE_MASK | MAD_DIMM_A_SIZE_MASK;
> > +		dimm_c1 &= MAD_DIMM_A_SIZE_MASK | MAD_DIMM_A_SIZE_MASK;
> 
> That doesn't look right to me. Presumably you meant
> MAD_DIMM_B_SIZE_MASK. Is it safe to ignore the other bits too, like
> dual rank, and width? Also without configdb, I'm somewhat confused why
> you don't read c2.

Yep, the second mask should be DIMM_B instead of A. As for the other bits,
I'm under the impression that they're just used to select the best
swizzling/interleaving within a dimm, so they shouldn't matter for our
purpose of selecting swizzling between channels. For that I think we just
want equally-sized channels.

I don't check channel 2 because it only exists on high-end 3-channel
cpus that don't ship with a gpu. All the swizzling patterns presume that
we have only 2 channels anyway.
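
For illustration, the channel comparison with both size fields masked in
could be sketched roughly as below. This is a standalone approximation, not
the actual patch: channels_equally_sized() is a made-up helper name, the
register values are passed in as plain integers instead of being read with
I915_READ(), and the shift for DIMM A is assumed to be 0 because its mask
starts at bit 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size fields from the patch: one byte per DIMM, in multiples of 256MB. */
#define MAD_DIMM_A_SIZE_SHIFT	0	/* assumption: mask starts at bit 0 */
#define MAD_DIMM_A_SIZE_MASK	(0xffu << MAD_DIMM_A_SIZE_SHIFT)
#define MAD_DIMM_B_SIZE_SHIFT	8
#define MAD_DIMM_B_SIZE_MASK	(0xffu << MAD_DIMM_B_SIZE_SHIFT)

/*
 * Hypothetical helper: true when both channels report identical DIMM A and
 * DIMM B sizes. Rank, width, ECC and interleave bits are deliberately
 * ignored, since only the per-channel sizes matter for this decision.
 */
static bool channels_equally_sized(uint32_t dimm_c0, uint32_t dimm_c1)
{
	const uint32_t size_mask = MAD_DIMM_A_SIZE_MASK | MAD_DIMM_B_SIZE_MASK;

	return (dimm_c0 & size_mask) == (dimm_c1 & size_mask);
}
```

With the A|A mask from the posted patch, two channels that differ only in
their DIMM B size would wrongly compare as equal. Note that this compares
slot by slot, so equal totals spread over different slots still compare as
unequal.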

> 
> > +		/* Enable swizzling when the channels are populated with
> > +		 * identically sized dimms. */
> > +		if (dimm_c0 == dimm_c1) {
> > +			swizzle_x = I915_BIT_6_SWIZZLE_9_10;
> > +			swizzle_y = I915_BIT_6_SWIZZLE_9;
> > +		} else {
> > +			swizzle_x = I915_BIT_6_SWIZZLE_NONE;
> > +			swizzle_y = I915_BIT_6_SWIZZLE_NONE;
> > +		}
> >  	} else if (IS_GEN5(dev)) {
> >  		/* On Ironlake whatever DRAM config, GPU always do
> >  		 * same swizzling setup.
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 8a9f113..e810723 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -295,6 +295,12 @@
> >  #define FENCE_REG_SANDYBRIDGE_0		0x100000
> >  #define   SANDYBRIDGE_FENCE_PITCH_SHIFT	32
> >  
> > +/* control register for cpu gtt access */
> > +#define TILECTL				0x101000
> > +#define   TILECTL_SWZCTL			(1 << 0)
> > +#define   TILECTL_TLB_PREFETCH_DIS	(1 << 2)
> > +#define   TILECTL_BACKSNOOP_DIS		(1 << 3)
> > +
> >  /*
> >   * Instruction and interrupt control regs
> >   */
> > @@ -318,6 +324,11 @@
> >  #define RING_MAX_IDLE(base)	((base)+0x54)
> >  #define RING_HWS_PGA(base)	((base)+0x80)
> >  #define RING_HWS_PGA_GEN6(base)	((base)+0x2080)
> > +#define ARB_MODE		0x04030
> > +#define   ARB_MODE_SWIZZLE_SNB	(1<<4)
> > +#define   ARB_MODE_SWIZZLE_IVB	(1<<5)
> > +#define   ARB_MODE_ENABLE(x)	GFX_MODE_ENABLE(x)
> > +#define   ARB_MODE_DISABLE(x)	GFX_MODE_DISABLE(x)
> 
> Hurray for the designers reusing bits again!
> 
> >  #define RENDER_HWS_PGA_GEN7	(0x04080)
> >  #define BSD_HWS_PGA_GEN7	(0x04180)
> >  #define BLT_HWS_PGA_GEN7	(0x04280)
> > @@ -1034,6 +1045,28 @@
> >  #define C0DRB3			0x10206
> >  #define C1DRB3			0x10606
> >  
> > +/** snb MCH registers for reading the DRAM channel configuration */
> > +#define MAD_DIMM_C0			(MCHBAR_MIRROR_BASE_SNB + 0x5004)
> > +#define   MAD_DIMM_C1			(MCHBAR_MIRROR_BASE_SNB + 0x5008)
> > +#define   MAD_DIMM_C2			(MCHBAR_MIRROR_BASE_SNB + 0x500C)
> > +#define   MAD_DIMM_ECC_MASK		(0x3 << 24)
> > +#define   MAD_DIMM_ECC_OFF		(0x0 << 24)
> > +#define   MAD_DIMM_ECC_IO_ON_LOGIC_OFF	(0x1 << 24)
> > +#define   MAD_DIMM_ECC_IO_OFF_LOGIC_ON	(0x2 << 24)
> > +#define   MAD_DIMM_ECC_ON		(0x3 << 24)
> > +#define   MAD_DIMM_ENH_INTERLEAVE	(0x1 << 22)
> > +#define   MAD_DIMM_RANK_INTERLEAVE	(0x1 << 21)
> > +#define   MAD_DIMM_B_WIDTH_X16		(0x1 << 20) /* X8 chips if unset */
> > +#define   MAD_DIMM_A_WIDTH_X16		(0x1 << 19) /* X8 chips if unset */
> > +#define   MAD_DIMM_B_DUAL_RANK		(0x1 << 18)
> > +#define   MAD_DIMM_A_DUAL_RANK		(0x1 << 17)
> > +#define   MAD_DIMM_A_SELECT		(0x1 << 16)
> > +#define   MAD_DIMM_B_SIZE_MASK		(0xff << 8) /* in multiples of 256mb */
> > +#define   MAD_DIMM_B_SIZE_SHIFT		8
> > +#define   MAD_DIMM_A_SIZE_MASK		(0xff << 0) /* in multiples of 256mb */
> > +#define   MAD_DIMM_A_SIZE_SHIFT		8
> > +
> > +
> 
> This also looks less than correct. Seems like MAD_DIMM_A_SIZE_SHIFT
> should be 0, and you should use those defines in the MASK definitions.

Agreed, I'll fix this up.
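
As a rough sketch of what that cleanup might look like (not the final
patch): the masks are built from the shift defines, and DIMM A's shift is 0.
The two mad_dimm_*_size_mb() helpers are invented here purely to show the
shifts in use; they assume the 256MB granularity stated in the comments of
the original defines.

```c
#include <assert.h>
#include <stdint.h>

#define MAD_DIMM_B_SIZE_SHIFT	8
#define MAD_DIMM_B_SIZE_MASK	(0xffu << MAD_DIMM_B_SIZE_SHIFT)
#define MAD_DIMM_A_SIZE_SHIFT	0
#define MAD_DIMM_A_SIZE_MASK	(0xffu << MAD_DIMM_A_SIZE_SHIFT)

/* Hypothetical helper: DIMM A size in MB (field is in multiples of 256MB). */
static uint32_t mad_dimm_a_size_mb(uint32_t reg)
{
	return ((reg & MAD_DIMM_A_SIZE_MASK) >> MAD_DIMM_A_SIZE_SHIFT) * 256;
}

/* Hypothetical helper: DIMM B size in MB (field is in multiples of 256MB). */
static uint32_t mad_dimm_b_size_mb(uint32_t reg)
{
	return ((reg & MAD_DIMM_B_SIZE_MASK) >> MAD_DIMM_B_SIZE_SHIFT) * 256;
}
```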
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48
