From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ben Widawsky Subject: Re: [PATCH] drm/i915/hsw: Disable L3 caching of atomic memory operations. Date: Wed, 2 Oct 2013 15:16:47 -0700 Message-ID: <20131002221647.GA5556@bwidawsk.net> References: <1380751423-6255-1-git-send-email-currojerez@riseup.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail.bwidawsk.net (bwidawsk.net [166.78.191.112]) by gabe.freedesktop.org (Postfix) with ESMTP id 14D63E7BF3 for ; Wed, 2 Oct 2013 15:16:51 -0700 (PDT) Content-Disposition: inline In-Reply-To: <1380751423-6255-1-git-send-email-currojerez@riseup.net> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org To: Francisco Jerez Cc: intel-gfx@lists.freedesktop.org List-Id: intel-gfx@lists.freedesktop.org On Wed, Oct 02, 2013 at 03:03:43PM -0700, Francisco Jerez wrote: > Otherwise using any atomic memory operation will lock up the GPU due > to a Haswell hardware bug. This patch also defines a new DRM param so > userspace knows that atomics can be used safely. > > Signed-off-by: Francisco Jerez > --- > drivers/gpu/drm/i915/i915_dma.c | 3 +++ > drivers/gpu/drm/i915/i915_reg.h | 7 +++++++ > drivers/gpu/drm/i915/intel_pm.c | 6 ++++++ > include/uapi/drm/i915_drm.h | 1 + > 4 files changed, 17 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index c27a210..e4fcb3d 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -1003,6 +1003,9 @@ static int i915_getparam(struct drm_device *dev, void *data, > case I915_PARAM_HAS_EXEC_HANDLE_LUT: > value = 1; > break; > + case I915_PARAM_HAS_ATOMICS: > + value = 1; > + break; > default: > DRM_DEBUG("Unknown parameter %d\n", param->param); > return -EINVAL; > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > index c159e1a..611a863 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -3881,6 +3881,9 @@ > #define GEN7_SQ_CHICKEN_MBCUNIT_CONFIG 0x9030 > #define GEN7_SQ_CHICKEN_MBCUNIT_SQINTMOB (1<<11) > > +#define HSW_SCRATCH1 0xb038 > +#define HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE (1<<27) > + > #define HSW_FUSE_STRAP 0x42014 > #define HSW_CDCLK_LIMIT (1 << 24) > > @@ -4728,6 +4731,10 @@ > #define GEN7_ROW_CHICKEN2_GT2 0xf4f4 > #define DOP_CLOCK_GATING_DISABLE (1<<0) > > +#define HSW_ROW_CHICKEN3 0xe49c > +#define HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_MASK (1 << 22) > +#define HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE (1 << 6) > + You do not need the mask bit. We have macros for this: _MASKED_BIT_ENABLE > #define G4X_AUD_VID_DID (dev_priv->info->display_mmio_offset + 0x62020) > #define INTEL_AUDIO_DEVCL 0x808629FB > #define INTEL_AUDIO_DEVBLC 0x80862801 > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > index dd176b7..47f2b2f 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -4955,6 +4955,12 @@ static void haswell_init_clock_gating(struct drm_device *dev) > I915_WRITE(GEN7_L3_CHICKEN_MODE_REGISTER, > GEN7_WA_L3_CHICKEN_MODE); > > + /* L3 caching of data atomics doesn't work -- disable it. */ > + I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE); > + I915_WRITE(HSW_ROW_CHICKEN3, > + HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_MASK | > + HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE); > + > /* This is required by WaCatErrorRejectionIssue:hsw */ > I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG, > I915_READ(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG) | > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h > index 55bb572..fe0f52e 100644 > --- a/include/uapi/drm/i915_drm.h > +++ b/include/uapi/drm/i915_drm.h > @@ -335,6 +335,7 @@ typedef struct drm_i915_irq_wait { > #define I915_PARAM_HAS_EXEC_NO_RELOC 25 > #define I915_PARAM_HAS_EXEC_HANDLE_LUT 26 > #define I915_PARAM_HAS_WT 27 > +#define I915_PARAM_HAS_ATOMICS 28 > > typedef struct drm_i915_getparam { > int param; Also, AFAIK, this workaround still isn't in the bspec, so I think it's a bit hard for people to review. I'd prefer if we had the bspec updated, but since they're moving so slowly, and I have the context: Reviewed-by: Ben Widawsky -- Ben Widawsky, Intel Open Source Technology Center