From mboxrd@z Thu Jan  1 00:00:00 1970
From: Mario Kleiner <mario.kleiner.de@gmail.com>
Subject: Re: [PATCH 3/3] drm/i915: Improve the accuracy of
 get_scanout_pos on CTG+
Date: Thu, 26 Sep 2013 19:04:05 +0200
Message-ID: <52446905.1080109@gmail.com>
References: <1379930527-19714-1-git-send-email-ville.syrjala@linux.intel.com>
	<1379930527-19714-4-git-send-email-ville.syrjala@linux.intel.com>
	<524254B6.4080709@tuebingen.mpg.de>
	<20130925081130.GJ4531@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
Content-Transfer-Encoding: quoted-printable
In-Reply-To: <20130925081130.GJ4531@intel.com>
To: =?ISO-8859-1?Q?Ville_Syrj=E4l=E4?= <ville.syrjala@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org, Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
List-Id: intel-gfx@lists.freedesktop.org

On 25.09.13 10:11, Ville Syrjälä wrote:
> On Wed, Sep 25, 2013 at 05:12:54AM +0200, Mario Kleiner wrote:
>>
>>
>> On 23.09.13 12:02, ville.syrjala@linux.intel.com wrote:
>>> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>>
>>> The DSL register increments at the start of horizontal sync, so it
>>> manages to miss the entire active portion of the current line.
>>>
>>> Improve the get_scanoutpos accuracy a bit when the scanout position is
>>> close to the start or end of vblank. We can do that by double checking
>>> the DSL value against the vblank status bit from ISR.
>>>
>>> Cc: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
>>> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_irq.c | 53 +++++++++++++++++++++++++++++++++++++++++
>>>    1 file changed, 53 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>>> index 4f74f0c..14b42d9 100644
>>> --- a/drivers/gpu/drm/i915/i915_irq.c
>>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>>> @@ -567,6 +567,47 @@ static u32 gm45_get_vblank_counter(struct drm_device *dev, int pipe)
>>>    	return I915_READ(reg);
>>>    }
>>>
>>> +static bool g4x_pipe_in_vblank(struct drm_device *dev, enum pipe pipe)
>>> +{
>>> +	struct drm_i915_private *dev_priv = dev->dev_private;
>>> +	uint32_t status;
>>> +
>>> +	if (IS_VALLEYVIEW(dev)) {
>>> +		status = pipe == PIPE_A ?
>>> +			I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
>>> +			I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
>>> +
>>> +		return I915_READ(VLV_ISR) & status;
>>> +	} else if (IS_G4X(dev)) {
>>> +		status = pipe == PIPE_A ?
>>> +			I915_DISPLAY_PIPE_A_VBLANK_INTERRUPT :
>>> +			I915_DISPLAY_PIPE_B_VBLANK_INTERRUPT;
>>> +
>>> +		return I915_READ(ISR) & status;
>>> +	} else if (INTEL_INFO(dev)->gen < 7) {
>>> +		status = pipe == PIPE_A ?
>>> +			DE_PIPEA_VBLANK :
>>> +			DE_PIPEB_VBLANK;
>>> +
>>> +		return I915_READ(DEISR) & status;
>>> +	} else {
>>> +		switch (pipe) {
>>> +		default:
>>> +		case PIPE_A:
>>> +			status = DE_PIPEA_VBLANK_IVB;
>>> +			break;
>>> +		case PIPE_B:
>>> +			status = DE_PIPEB_VBLANK_IVB;
>>> +			break;
>>> +		case PIPE_C:
>>> +			status = DE_PIPEC_VBLANK_IVB;
>>> +			break;
>>> +		}
>>> +
>>> +		return I915_READ(DEISR) & status;
>>> +	}
>>> +}
>>> +
>>>    static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
>>>    			     int *vpos, int *hpos)
>>>    {
>>> @@ -616,6 +657,18 @@ static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
>>>    		 * scanout position from Display scan line register.
>>>    		 */
>>>    		position = I915_READ(PIPEDSL(pipe)) & 0x1fff;
>>> +
>>> +		/*
>>> +		 * The scanline counter increments at the leading edge
>>> +		 * of hsync, ie. it completely misses the active portion
>>> +		 * of the line. Fix up the counter at both edges of vblank
>>> +		 * to get a more accurate picture whether we're in vblank
>>> +		 * or not.
>>> +		 */
>>> +		in_vbl = g4x_pipe_in_vblank(dev, pipe);
>>> +		if ((in_vbl && position == vbl_start - 1) ||
>>> +		    (!in_vbl && position == vbl_end - 1))
>>> +			position = (position + 1) % vtotal;
>>>    	} else {
>>>    		/* Have access to pixelcount since start of frame.
>>>    		 * We can split this into vertical and horizontal
>>>
>>
>> This one i don't know. I think i can't follow the logic, but i don't
>> know enough about the way the intel hw counts.
>>
>> Do you mean the counter increments when the scanline is over, instead of
>> when it begins?
>
> Let me draw a picture of the scanline (not to scale):
>
>   |XXXXXXXXXXXXX|-----|___________|---|
>    horiz. active       horiz. sync
>   ^                   ^
>   |                   |
>   first pixel         this is where the
>   of the line         scanline counter increments
>
>> With this correction by +1 at the edges of vblank, the scanlines at
>> vbl_start and vbl_end would be reported twice, for two successive
>> scanline durations, that seems a bit weird and asymmetric to the rest of
>> the scanline positions. Wouldn't it make more sense to simply always add
>> 1 for a smaller overall error, given that hblank is shorter than the
>> active scanout part of a scanline?
>
> Since the counter increments too late, drm_handle_vblank()
> may get the wrong idea ie. something like this may happen:
>
> 1. vblank irq triggered
> 2. drm_handle_vblank() gets called
> 3. i915_get_crtc_scanoutpos() returns vbl_start-1 as the scanline
> 4. delta_ns calculation gets confused and tries to correct for it
>
> Now, the correction you do for delta_ns should handle this, but
> I don't like having such kludges in common code, and we can handle
> it in the driver as I've demonstrated. But yeah, I suppose it can
> make the error slightly less stable.
>

The kludges are also needed for other drivers, especially some radeon
GPUs which can fire their vblank interrupts multiple scanlines before
the vblank, and iirc nouveau with the prototype patches we had. I like
that catch-all for robustness, especially on GPUs whose behaviour is
not as well documented as Intel's, but of course it's better if a
driver does the right thing from the start.
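
As a rough illustration of that catch-all: the common code turns the
scanout position into a time offset, and if the vblank irq handler sees
a position still a few lines short of vblank, it assumes the irq fired
early rather than almost a whole frame late. The sketch below is not
the actual DRM core code; the function name, its parameters and the
1/100-of-a-frame threshold are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical catch-all for vblank irqs that fire slightly early.
 *
 * delta_ns is the time elapsed since start of scanout of the current
 * frame, as derived from the reported scanout position. If we were
 * called from the vblank irq, are not yet in vblank, but are only a
 * few lines away from it, subtract one frame duration so the offset
 * refers to the upcoming frame instead of the one that is just ending.
 */
static int64_t correct_early_vblank_irq(int64_t delta_ns, bool from_vblank_irq,
					bool in_vblank, int vpos, int vdisplay,
					int vtotal, int64_t framedur_ns)
{
	assert(vtotal > 0 && framedur_ns > 0);

	/* Assumed threshold: less than 1/100 of a frame before vblank. */
	if (from_vblank_irq && !in_vblank && vpos < vdisplay &&
	    (vdisplay - vpos) < vtotal / 100)
		delta_ns -= framedur_ns;

	return delta_ns;
}
```

With a driver-side fix-up like the one in this patch, the irq handler
should no longer see vbl_start - 1 on intel, so such a correction would
simply never trigger there.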

> For some other uses (atomic page flip stuff) of the scanline position,
> I definitely want this correction since I need accurate information
> whether the position has passed vblank start.
>

Ok, i can live with that.
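
For reference, the edge fix-up under discussion boils down to something
like the following. This is only a sketch: fixup_scanline() is a
made-up name, and the in_vbl argument stands in for the ISR read done
by g4x_pipe_in_vblank() in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical standalone version of the fix-up from the quoted patch.
 *
 * The raw scanline counter only increments at the leading edge of
 * hsync, so during the active part of a line it still shows the
 * previous line. The counter is used as-is everywhere except at the
 * two vblank edges, where the vblank status bit decides whether the
 * edge has already been crossed.
 */
static int fixup_scanline(int position, bool in_vbl,
			  int vbl_start, int vbl_end, int vtotal)
{
	assert(vtotal > 0);

	if ((in_vbl && position == vbl_start - 1) ||
	    (!in_vbl && position == vbl_end - 1))
		position = (position + 1) % vtotal;

	return position;
}
```

So a raw reading of vbl_start - 1 with the status bit already set is
reported as vbl_start, and the modulo handles the case where vbl_end
coincides with vtotal and the position wraps back to line 0.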

>> Also it adds back one lock protected, therefore potentially slow,
>> register read into the time critical code.
>
> I don't think a single register read should be _that_ slow even
> with all the extra junk we do. And of course we can fix that problem.
>

Yes. If you make g4x_pipe_in_vblank(), or maybe a helper, just return
the register and status mask, we can also do the actual register read
as a __raw read, together with the raw reads of the scanout position
regs, inside that ktime_get() enclosed section. Then it's cheap and rt
compatible, it's done within that one uncore.lock lock/unlock, and
everybody's happy.
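
Something along these lines is what I have in mind. It is a userspace
sketch, not driver code: struct vblank_status_src, raw_read32() and
sample_scanout() are hypothetical names, and the "registers" are just
an array standing in for MMIO space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: where a pipe's vblank status bit lives. */
struct vblank_status_src {
	uint32_t reg;	/* byte offset of the status register */
	uint32_t mask;	/* bit that is set while the pipe is in vblank */
};

/* Hypothetical result of one combined sample. */
struct scanout_sample {
	uint32_t scanline;
	bool in_vblank;
};

/* Stand-in for an unlocked raw MMIO read; indexes a fake register file. */
static uint32_t raw_read32(const uint32_t *mmio, uint32_t reg)
{
	assert((reg % 4) == 0);
	return mmio[reg / 4];
}

/*
 * Read the scanline counter and the vblank status bit back to back.
 * In a driver both reads would sit inside the section bracketed by the
 * two timestamps, under a single lock acquisition taken by the caller.
 */
static struct scanout_sample sample_scanout(const uint32_t *mmio,
					    uint32_t dsl_reg,
					    struct vblank_status_src src)
{
	struct scanout_sample s;

	s.scanline = raw_read32(mmio, dsl_reg) & 0x1fff;
	s.in_vblank = (raw_read32(mmio, src.reg) & src.mask) != 0;

	return s;
}
```

The per-platform if/else chain would then only pick the register and
mask, outside the time critical part, and the sampling itself stays two
plain reads.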

-mario