From: Mario Kleiner
Subject: Re: [PATCH 3/3] drm/i915: Improve the accuracy of get_scanout_pos on CTG+
Date: Thu, 26 Sep 2013 19:04:05 +0200
Message-ID: <52446905.1080109@gmail.com>
References: <1379930527-19714-1-git-send-email-ville.syrjala@linux.intel.com>
 <1379930527-19714-4-git-send-email-ville.syrjala@linux.intel.com>
 <524254B6.4080709@tuebingen.mpg.de>
 <20130925081130.GJ4531@intel.com>
In-Reply-To: <20130925081130.GJ4531@intel.com>
To: Ville Syrjälä
Cc: intel-gfx@lists.freedesktop.org, Mario Kleiner
List-Id: intel-gfx@lists.freedesktop.org

On 25.09.13 10:11, Ville Syrjälä wrote:
> On Wed, Sep 25, 2013 at 05:12:54AM +0200, Mario Kleiner wrote:
>>
>>
>> On 23.09.13 12:02, ville.syrjala@linux.intel.com wrote:
>>> From: Ville Syrjälä
>>>
>>> The DSL register increments at the start of horizontal sync, so it
>>> manages to miss the entire active portion of the current line.
>>>
>>> Improve the get_scanoutpos accuracy a bit when the scanout position is
>>> close to the start or end of vblank. We can do that by double checking
>>> the DSL value against the vblank status bit from ISR.
>>>
>>> Cc: Mario Kleiner
>>> Signed-off-by: Ville Syrjälä
>>> ---
>>>  drivers/gpu/drm/i915/i915_irq.c | 53 +++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 53 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>>> index 4f74f0c..14b42d9 100644
>>> --- a/drivers/gpu/drm/i915/i915_irq.c
>>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>>> @@ -567,6 +567,47 @@ static u32 gm45_get_vblank_counter(struct drm_device *dev, int pipe)
>>>  	return I915_READ(reg);
>>>  }
>>>
>>> +static bool g4x_pipe_in_vblank(struct drm_device *dev, enum pipe pipe)
>>> +{
>>> +	struct drm_i915_private *dev_priv = dev->dev_private;
>>> +	uint32_t status;
>>> +
>>> +	if (IS_VALLEYVIEW(dev)) {
>>> +		status = pipe == PIPE_A ?
>>> +			I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
>>> +			I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
>>> +
>>> +		return I915_READ(VLV_ISR) & status;
>>> +	} else if (IS_G4X(dev)) {
>>> +		status = pipe == PIPE_A ?
>>> +			I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
>>> +			I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
>>> +
>>> +		return I915_READ(ISR) & status;
>>> +	} else if (INTEL_INFO(dev)->gen < 7) {
>>> +		status = pipe == PIPE_A ?
>>> +			DE_PIPEA_VBLANK :
>>> +			DE_PIPEB_VBLANK;
>>> +
>>> +		return I915_READ(DEISR) & status;
>>> +	} else {
>>> +		switch (pipe) {
>>> +		default:
>>> +		case PIPE_A:
>>> +			status = DE_PIPEA_VBLANK_IVB;
>>> +			break;
>>> +		case PIPE_B:
>>> +			status = DE_PIPEB_VBLANK_IVB;
>>> +			break;
>>> +		case PIPE_C:
>>> +			status = DE_PIPEC_VBLANK_IVB;
>>> +			break;
>>> +		}
>>> +
>>> +		return I915_READ(DEISR) & status;
>>> +	}
>>> +}
>>> +
>>>  static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
>>>  			     int *vpos, int *hpos)
>>>  {
>>> @@ -616,6 +657,18 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
>>>  		 * scanout position from Display scan line register.
>>>  		 */
>>>  		position = I915_READ(PIPEDSL(pipe)) & 0x1fff;
>>> +
>>> +		/*
>>> +		 * The scanline counter increments at the leading edge
>>> +		 * of hsync, ie. it completely misses the active portion
>>> +		 * of the line. Fix up the counter at both edges of vblank
>>> +		 * to get a more accurate picture whether we're in vblank
>>> +		 * or not.
>>> +		 */
>>> +		in_vbl = g4x_pipe_in_vblank(dev, pipe);
>>> +		if ((in_vbl && position == vbl_start - 1) ||
>>> +		    (!in_vbl && position == vbl_end - 1))
>>> +			position = (position + 1) % vtotal;
>>>  	} else {
>>>  		/* Have access to pixelcount since start of frame.
>>>  		 * We can split this into vertical and horizontal
>>>
>>
>> This one i don't know. I think i can't follow the logic, but i don't
>> know enough about the way the intel hw counts.
>>
>> Do you mean the counter increments when the scanline is over, instead of
>> when it begins?
>
> Let me draw a picture of the scanline (not to scale):
>
> |XXXXXXXXXXXXX|-----|___________|---|
>  horiz. active       horiz. sync
>  ^                  ^
>  |                  |
>  first pixel        this is where the
>  of the line        scanline counter increments
>
>> With this correction by +1 at the edges of vblank, the scanlines at
>> vbl_start and vbl_end would be reported twice, for two successive
>> scanline durations, that seems a bit weird and asymmetric to the rest of
>> the scanline positions. Wouldn't it make more sense to simply always add
>> 1 for a smaller overall error, given that hblank is shorter than the
>> active scanout part of a scanline?
>
> Since the counter increments too late, drm_handle_vblank()
> may get the wrong idea ie. something like this may happen:
>
> 1. vblank irq triggered
> 2. drm_handle_vblank() gets called
> 3. i915_get_crtc_scanoutpos() returns vbl_start-1 as the scanline
> 4. delta_ns calculation gets confused and tries to correct for it
>
> Now, the correction you do for delta_ns should handle this, but
> I don't like having such kludges in common code, and we can handle
> it in the driver as I've demonstrated. But yeah, I suppose it can
> make the error slightly less stable.
>

The kludges are also needed for other drivers, especially some radeon
GPUs which can fire their vblank interrupts multiple scanlines before
the vblank, and iirc nouveau with the prototype patches we had. I like
that catch-all for robustness, especially on GPUs whose behaviour is
not as well documented as in the case of intel, but of course it's
better if a driver does the right thing from the start.

> For some other uses (atomic page flip stuff) of the scanline position,
> I definitely want this correction since I need accurate information
> whether the position has passed vblank start.
>

Ok, i can live with that.

>> Also it adds back one lock protected, therefore potentially slow,
>> register read into the time critical code.
>
> I don't think a single register read should be _that_ slow even
> with all the extra junk we do. And of course we can fix that problem.
>

Ja. If you make g4x_pipe_in_vblank() or maybe a helper to just return
the register and status mask, we can do the actual register read also as
a __raw read together with the raw read of scanout position regs, inside
that ktime_get() enclosed section. Then it's cheap and rt compatible and
done with that one uncore.lock lock/unlock and everybody's happy.

-mario
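
For reference, the edge fix-up from the quoted patch can be sketched as a
standalone function, detached from the driver. This is only an
illustration: the function name fixup_scanline() is hypothetical, the
vblank status is passed in as a plain bool instead of being read from the
ISR, and the meaning assumed for vbl_start, vbl_end and vtotal (first
vblank line, first line after vblank, lines per frame) is an
interpretation of the patch, not something it states.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical standalone version of the fix-up in the quoted patch.
 *
 * position:  raw scanline counter value, which only increments at the
 *            leading edge of hsync
 * in_vbl:    vblank status bit as sampled by the caller
 * vbl_start: assumed to be the first scanline of vblank
 * vbl_end:   assumed to be the first scanline after vblank
 * vtotal:    assumed to be the total number of scanlines per frame
 */
int fixup_scanline(int position, bool in_vbl,
		   int vbl_start, int vbl_end, int vtotal)
{
	assert(vtotal > 0);

	/*
	 * If the status bit and the counter disagree right at a vblank
	 * edge, trust the status bit and advance the counter by one
	 * line, wrapping at the end of the frame.
	 */
	if ((in_vbl && position == vbl_start - 1) ||
	    (!in_vbl && position == vbl_end - 1))
		position = (position + 1) % vtotal;

	return position;
}
```

With made-up timings of vbl_start = 1080 and vbl_end = vtotal = 1125, a
raw reading of 1079 with the vblank bit set is reported as 1080, and a
raw reading of 1124 with the bit clear wraps around to 0. Every other
reading is returned unchanged, which is why each edge line can be
reported for two successive scanline durations, as discussed above.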