From mboxrd@z Thu Jan 1 00:00:00 1970 From: Chris Wilson Subject: Re: [PATCH] drm/i915: Force synchronous TLB invalidations on the BLT ring for SNB+ Date: Wed, 09 Nov 2011 19:15:06 +0000 Message-ID: References: <1320859946-23391-1-git-send-email-chris@chris-wilson.co.uk> <87obwl86pv.fsf@eliezer.anholt.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mga03.intel.com (mga03.intel.com [143.182.124.21]) by gabe.freedesktop.org (Postfix) with ESMTP id B9DFE9E89D for ; Wed, 9 Nov 2011 11:15:14 -0800 (PST) In-Reply-To: <87obwl86pv.fsf@eliezer.anholt.net> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org To: Eric Anholt , intel-gfx@lists.freedesktop.org List-Id: intel-gfx@lists.freedesktop.org On Wed, 09 Nov 2011 10:57:00 -0800, Eric Anholt wrote: > On Wed, 9 Nov 2011 17:32:26 +0000, Chris Wilson wrote: > > We are advised that in order to workaround the TLB invalidation being > > performed asynchronously with the command streamer that we need to use > > a post-sync write along with the invalidations in the MI_FLUSH_DW so > > that all operations are complete before executing the next command in > > the ringbuffer. > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=37990 > > Signed-off-by: Chris Wilson > > Cc: Jesse Barnes > > I find the "References:" to a bug that this is not confirmed to fix to > be very misleading. Because it references a bug I think is due to TLB invalidations as the glitches look very similar to earlier TLB invalidation. Note the refer to implied by references, which I have never used to mean fixed. > > --- > > drivers/gpu/drm/i915/i915_reg.h | 11 +++++++++-- > > drivers/gpu/drm/i915/intel_ringbuffer.c | 13 +++++++++++-- > > drivers/gpu/drm/i915/intel_ringbuffer.h | 1 + > > 3 files changed, 21 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > > index cbf5f9f..bfda4b2 100644 > > --- a/drivers/gpu/drm/i915/i915_reg.h > > +++ b/drivers/gpu/drm/i915/i915_reg.h > > @@ -183,8 +183,15 @@ > > */ > > #define MI_LOAD_REGISTER_IMM(x) MI_INSTR(0x22, 2*x-1) > > #define MI_FLUSH_DW MI_INSTR(0x26, 1) /* for GEN6 */ > > -#define MI_INVALIDATE_TLB (1<<18) > > -#define MI_INVALIDATE_BSD (1<<7) > > +#define FLUSH_DW_USE_INDEX (1<<21) > > +#define MI_INVALIDATE_TLB (1<<18) > > +#define FLUSH_DW_SYNC_GFDT (1<<17) > > +#define FLUSH_DW_NO_WRITE (0<<14) > > +#define FLUSH_DW_WRITE_QWORD (1<<14) > > +#define FLUSH_DW_WRITE_TIMESTAMP (3<<14) > > +#define MI_INVALIDATE_BSD (1<<7) > > +#define FLUSH_DW_PPGTT (0<<2) > > +#define FLUSH_DW_GTT (1<<2) > > #define MI_BATCH_BUFFER MI_INSTR(0x30, 1) > > #define MI_BATCH_NON_SECURE (1) > > #define MI_BATCH_NON_SECURE_I965 (1<<8) > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > > index 3c30dba..a77858c 100644 > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > @@ -1431,10 +1431,19 @@ static int blt_ring_flush(struct intel_ring_buffer *ring, > > return ret; > > > > cmd = MI_FLUSH_DW; > > - if (invalidate & I915_GEM_DOMAIN_RENDER) > > + if (invalidate & I915_GEM_DOMAIN_RENDER) { > > cmd |= MI_INVALIDATE_TLB; > > + /* In order to actually force the invalidations to take place > > + * before proceeding, we need to employ a workaround of > > + * enabling a post-sync write. > > + */ > > + cmd |= FLUSH_DW_USE_INDEX; > > + cmd |= FLUSH_DW_WRITE_TIMESTAMP; > > + } > > The commit message says this is about TLB flushes, while the comment > here talks about invalidations in general. Which is true? No the commit log doesn't say that. > > intel_ring_emit(ring, cmd); > > - intel_ring_emit(ring, 0); > > + intel_ring_emit(ring, > > + I915_GEM_TIMESTAMP_INDEX << MI_STORE_DWORD_INDEX_SHIFT | > > + FLUSH_DW_GTT); > > intel_ring_emit(ring, 0); > > My docs have "This field specifies Bits 31:3 of the Address where the > DWord or QWord will be stored. Note that the address can only be QWord > aligned, irrespective of data size." for MI_FLUSH_DW, but > MI_STORE_DWORD_INDEX_SHIFT is 2 (which is correct for > MI_STORE_DATA_IMM). The shift is to convert the index into the HWS from being a dword index into an offset, which itself is qword aligned. -Chris -- Chris Wilson, Intel Open Source Technology Centre