public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org,
	Jon Bloomfield <jon.bloomfield@intel.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	Carsten Emde <C.Emde@osadl.org>,
	stable@vger.kernel.org
Subject: Re: [PATCH 1/2] drm/i915: Fix incoherence with fence updates on Sandybridge+
Date: Tue, 9 Jul 2013 22:43:11 +0200	[thread overview]
Message-ID: <20130709204311.GD18285@phenom.ffwll.local> (raw)
In-Reply-To: <1373388880-29646-1-git-send-email-chris@chris-wilson.co.uk>

On Tue, Jul 09, 2013 at 05:54:39PM +0100, Chris Wilson wrote:
> This hopefully fixes the root cause behind the workaround from
> 
> commit 25ff1195f8a0b3724541ae7bbe331b4296de9c06
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Thu Apr 4 21:31:03 2013 +0100
> 
>     drm/i915: Workaround incoherence between fences and LLC across multiple CPUs
> 
> Thanks to further investigation by Jon Bloomfield, he realised that
> the 64-bit register might be broken up by the hardware into two 32-bit
> writes (a problem we have encountered elsewhere). This non-atomicity
> would then cause an issue where a second thread would see a random
> content of its thread register, and so read/write into a fairly random

I think this needs a bit clarification (or I misunderstand how stuff blows
up):

"This non-atomicity can result in fences where area affected by the
fence can be any combinition of the old (start, end) and the new (start,
end) pair, depending upon how the hw executes the two 32bit writes (which
we don't know). For example for a short amount of time we can have a fence
spanning (new start, old end), with the new tiling parameters (since those
are in the first dword). If that gtt area is access right in that small window
by a 2nd cpu core can corrupt whatever's stored there (this depends upon
how the hw handles overlapping fences). Note that to actually get a bogus
fenced range in the gtt we need to steal fences, since otherwise the old
and new (start, end) pair will be the same if we only update the tiling
mode.

"We avoid this race by first disabling the fence (resetting bit 31 in the
first dword) and then (serialized by a posting read) writing the 2nd
dword. Then (again after ensuring serialization with a postin read) we can
update the first dword with the correct new value. This way we never have
a fence enabled spanning a bogus intervall."

I think the above is a theory that fits with all the observed evidence:
- fence thrashing is required
- running on multipler cpus is required

> tiled location. Breaking the operation into 3 explicit 32-bit updates
> (first disable the fence, poke the upper bits, then poke the lower bits
> and enable) ensures that, given proper serialisation between the
> 32-bit register write and the memory transfer, that the fence value is
> always consistent.
> 
> Note to self: this implies that we have multiple threads hitting the
> fence whilst it is being updated, but the sequence of steps we take to
> quiesce the memory accesses from the old and new fenced objects across
> the update should prevent anyone using the fence register during the
> update.
> 
> However, Daniel points out that the intermediate fence value may be seen
> by some other random thread as the intermediate value conflicts with its
> own fence register.
> 
> Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Carsten Emde <C.Emde@osadl.org>
> Cc: stable@vger.kernel.org
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3406c76..ce46e777 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2777,7 +2777,6 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg,
>  	drm_i915_private_t *dev_priv = dev->dev_private;
>  	int fence_reg;
>  	int fence_pitch_shift;
> -	uint64_t val;
>  
>  	if (INTEL_INFO(dev)->gen >= 6) {
>  		fence_reg = FENCE_REG_SANDYBRIDGE_0;
> @@ -2787,8 +2786,11 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg,
>  		fence_pitch_shift = I965_FENCE_PITCH_SHIFT;
>  	}
>  
> +	fence_reg += reg * 8;
> +
>  	if (obj) {
>  		u32 size = i915_gem_obj_ggtt_size(obj);
> +		uint64_t val;
>  
>  		val = (uint64_t)((i915_gem_obj_ggtt_offset(obj) + size - 4096) &
>  				 0xfffff000) << 32;
> @@ -2796,12 +2798,14 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg,
>  		val |= (uint64_t)((obj->stride / 128) - 1) << fence_pitch_shift;
>  		if (obj->tiling_mode == I915_TILING_Y)
>  			val |= 1 << I965_FENCE_TILING_Y_SHIFT;
> -		val |= I965_FENCE_REG_VALID;
> +
> +		/* W/a incoherency with non-atomic 64-bit register updates */
> +		I915_WRITE(fence_reg + 0, (uint32_t)val); /* first we disable */

I vote for a posting read here ...

> +		I915_WRITE(fence_reg + 4, (uint32_t)(val >> 32));

... and here.

Cheers, Daniel

> +		I915_WRITE(fence_reg + 0, (uint32_t)(val | I965_FENCE_REG_VALID));
>  	} else
> -		val = 0;
> +		I915_WRITE64(fence_reg, 0);
>  
> -	fence_reg += reg * 8;
> -	I915_WRITE64(fence_reg, val);
>  	POSTING_READ(fence_reg);
>  }
>  
> -- 
> 1.8.3.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

  parent reply	other threads:[~2013-07-09 20:43 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-09 16:54 [PATCH 1/2] drm/i915: Fix incoherence with fence updates on Sandybridge+ Chris Wilson
2013-07-09 16:54 ` [PATCH 2/2] Revert "drm/i915: Workaround incoherence between fences and LLC across multiple CPUs" Chris Wilson
2013-07-09 20:44   ` Daniel Vetter
2013-07-09 20:43 ` Daniel Vetter [this message]
2013-07-10  7:30   ` [PATCH 1/2] drm/i915: Fix incoherence with fence updates on Sandybridge+ Daniel Vetter
  -- strict thread matches above, loose matches on Subject: below --
2013-07-10 12:36 Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130709204311.GD18285@phenom.ffwll.local \
    --to=daniel@ffwll.ch \
    --cc=C.Emde@osadl.org \
    --cc=chris@chris-wilson.co.uk \
    --cc=daniel.vetter@ffwll.ch \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jon.bloomfield@intel.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox