From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ville =?iso-8859-1?Q?Syrj=E4l=E4?= Subject: Re: [PATCH 2/4] drm/i915: Fix mmio vs. CS flip race on ILK+ Date: Wed, 12 Mar 2014 11:11:12 +0200 Message-ID: <20140312091112.GD20292@intel.com> References: <1394559456-7348-1-git-send-email-ville.syrjala@linux.intel.com> <1394559456-7348-3-git-send-email-ville.syrjala@linux.intel.com> <20140312083008.GE23307@nuc-i3427.alporthouse.com> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Return-path: Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTP id A8BA0FAE8B for ; Wed, 12 Mar 2014 02:11:17 -0700 (PDT) Content-Disposition: inline In-Reply-To: <20140312083008.GE23307@nuc-i3427.alporthouse.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: intel-gfx-bounces@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org To: Chris Wilson , intel-gfx@lists.freedesktop.org List-Id: intel-gfx@lists.freedesktop.org On Wed, Mar 12, 2014 at 08:30:08AM +0000, Chris Wilson wrote: > On Tue, Mar 11, 2014 at 07:37:34PM +0200, ville.syrjala@linux.intel.com w= rote: > > From: Ville Syrj=E4l=E4 > > = > > Starting from ILK, mmio flips also cause a flip done interrupt to be > > signalled. This means if we first do a set_base and follow it > > immediately with the CS flip, we might mistake the flip done interrupt > > caused by the set_base as the flip done interrupt caused by the CS > > flip. > > = > > The hardware has a flip counter which increments every time a mmio or > > CS flip is issued. It basically counts the number of DSPSURF register > > writes. This means we can sample the counter before we put the CS > > flip into the ring, and then when we get a flip done interrupt we can > > check whether the CS flip has actually performed the surface address > > update, or if the interrupt was caused by a previous but yet > > unfinished mmio flip. > > = > > Even with the flip counter we still have a race condition of the CS flip > > base address update happens after the mmio flip done interrupt was > > raised but not yet processed by the driver. When the interrupt is > > eventually processed, the flip counter will already indicate that the > > CS flip has been executed, but it would not actually complete until the > > next start of vblank. We can use the DSPSURFLIVE register to check > > whether the hardware is actually scanning out of the buffer we expect, > > or if we managed hit this race window. > > = > > This covers all the cases where the CS flip actually changes the base > > address. If the base address remains unchanged, we might still complete > > the CS flip before it has actually completed. But since the address > > didn't change anyway, the premature flip completion can't result in > > userspace overwriting data that's still being scanned out. > > = > > CTG already has the flip counter and DSPSURFLIVE registers, and > > although the flip done interrupt is still limited to CS flips alone, > > the code now also checks the flip counter on CTG as well. > > = > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=3D73027 > > Signed-off-by: Ville Syrj=E4l=E4 > > --- > > drivers/gpu/drm/i915/i915_reg.h | 1 + > > drivers/gpu/drm/i915/intel_display.c | 72 ++++++++++++++++++++++++++++= ++++---- > > drivers/gpu/drm/i915/intel_drv.h | 2 + > > 3 files changed, 68 insertions(+), 7 deletions(-) > > = > > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i91= 5_reg.h > > index 146609a..0d82924 100644 > > --- a/drivers/gpu/drm/i915/i915_reg.h > > +++ b/drivers/gpu/drm/i915/i915_reg.h > > @@ -3533,6 +3533,7 @@ enum punit_power_well { > > #define _PIPEA_FRMCOUNT_GM45 (dev_priv->info.display_mmio_offset + 0x7= 0040) > > #define _PIPEA_FLIPCOUNT_GM45 (dev_priv->info.display_mmio_offset + 0x= 70044) > > #define PIPE_FRMCOUNT_GM45(pipe) _PIPE(pipe, _PIPEA_FRMCOUNT_GM45, _PI= PEB_FRMCOUNT_GM45) > > +#define PIPE_FLIPCOUNT_GM45(pipe) _PIPE(pipe, _PIPEA_FLIPCOUNT_GM45, _= PIPEB_FLIPCOUNT_GM45) > > = > > /* Cursor A & B regs */ > > #define _CURACNTR (dev_priv->info.display_mmio_offset + 0x70080) > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i91= 5/intel_display.c > > index 37bc041..aabc90c 100644 > > --- a/drivers/gpu/drm/i915/intel_display.c > > +++ b/drivers/gpu/drm/i915/intel_display.c > > @@ -8610,6 +8610,47 @@ void intel_finish_page_flip_plane(struct drm_dev= ice *dev, int plane) > > do_intel_finish_page_flip(dev, crtc); > > } > > = > > +/* Is 'a' after or equal to 'b'? */ > > +static bool flip_count_after_eq(u32 a, u32 b) > > +{ > > + return !((a - b) & 0x80000000); > > +} > = > Couldn't we just use the idiom of casting the result to an int, please? That only works for 32bit counters, which these are, but if you look at my latest watermark series I have a similar function for the vblank counter comparison where I also need to deal w/ 24bit counters. So in order to be consistent I used the same approach here. > = > > +static bool page_flip_finished(struct intel_crtc *crtc) > > +{ > > + struct drm_device *dev =3D crtc->base.dev; > > + struct drm_i915_private *dev_priv =3D dev->dev_private; > > + > > + /* > > + * The relevant registers doen't exist on pre-ctg. > > + * As the flip done interrupt doesn't trigger for mmio > > + * flips on gmch platforms, a flip count check isn't > > + * really needed there. But since ctg has the registers, > > + * include it in the check anyway. > > + */ > > + if (INTEL_INFO(dev)->gen < 5 && !IS_G4X(dev)) > > + return true; > > + > > + /* > > + * A DSPSURFLIVE check isn't enough in case the mmio and CS flips > > + * used the same base address. In that case the mmio flip might > > + * have completed, but the CS hasn't even executed the flip yet. > > + * > > + * A flip count check isn't enough as the CS might have updated > > + * the base address just after start of vblank, but before we > > + * managed to process the interrupt. This means we'd complete the > > + * CS flip too soon. > > + * > > + * Combining both checks should get us a good enough result. It may > > + * still happen that the CS flip has been executed, but has not > > + * yet actually completed. But in case the base address is the same > > + * anyway, we don't really care. > > + */ > > + return (I915_READ(DSPSURFLIVE(crtc->plane)) & ~0xfff) =3D=3D crtc->un= pin_work->dspsurf && > > + flip_count_after_eq(I915_READ(PIPE_FLIPCOUNT_GM45(crtc->pipe)), > > + crtc->unpin_work->flip_count); > = > Instead of dspsurf, I'd prefer if this was just a more generic gtt_offset > so that it isn't peculiar to gen4+, and can be used for the gen3 stall > checks. You'll note that I did fill it in for gen2/3 too. I did cringe at the name a bit there, but didn't bother changing it to anything else. I can rename it. > = > Other than those two nitpicks, > Reviewed-by: Chris Wilson > = > A follow-on request would be to update the pageflip stall logic, but I > can rebase my patches on to this. > -Chris > = > -- = > Chris Wilson, Intel Open Source Technology Centre -- = Ville Syrj=E4l=E4 Intel OTC