From: Mario Kleiner <mario.kleiner.de@gmail.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: daniel.vetter@ffwll.ch, michel@daenzer.net,
linux@bernd-steinhauser.de, stable@vger.kernel.org,
dri-devel@lists.freedesktop.org, alexander.deucher@amd.com,
christian.koenig@amd.com, vbabka@suse.cz
Subject: Re: [PATCH 2/6] drm: Prevent vblank counter bumps > 1 with active vblank clients.
Date: Tue, 9 Feb 2016 17:18:54 +0100 [thread overview]
Message-ID: <56BA116E.4010406@gmail.com> (raw)
In-Reply-To: <20160209142957.GX11240@phenom.ffwll.local>
On 02/09/2016 03:29 PM, Daniel Vetter wrote:
> On Tue, Feb 09, 2016 at 02:39:44PM +0100, Mario Kleiner wrote:
>> On 02/09/2016 11:23 AM, Daniel Vetter wrote:
>>> On Tue, Feb 09, 2016 at 12:07:27PM +0200, Ville Syrjälä wrote:
>>>> On Tue, Feb 09, 2016 at 10:56:38AM +0100, Daniel Vetter wrote:
>>>>> On Mon, Feb 08, 2016 at 02:13:25AM +0100, Mario Kleiner wrote:
>>>>>> This fixes a regression introduced by the new drm_update_vblank_count()
>>>>>> implementation in Linux 4.4:
>>>>>>
>>>>>> Restrict the bump of the software vblank counter in drm_update_vblank_count()
>>>>>> to a safe maximum value of +1 whenever there is the possibility that
>>>>>> concurrent readers of vblank timestamps could be active at the moment,
>>>>>> as the current implementation of the timestamp caching and updating is
>>>>>> not safe against concurrent readers for calls to store_vblank() with a
>>>>>> bump of anything but +1. A bump != 1 would very likely return corrupted
>>>>>> timestamps to userspace, because the same slot in the cache could
>>>>>> be concurrently written by store_vblank() and read by one of those
>>>>>> readers in a non-atomic fashion and without the read-retry logic
>>>>>> detecting this collision.
>>>>>>
>>>>>> Concurrent readers can exist while drm_update_vblank_count() is called
>>>>>> from the drm_vblank_off() or drm_vblank_on() functions or other non-vblank-
>>>>>> irq callers. However, all those calls are happening with the vbl_lock
>>>>>> locked thereby preventing a drm_vblank_get(), so the vblank refcount
>>>>>> can't increase while drm_update_vblank_count() is executing. Therefore
>>>>>> a zero vblank refcount during execution of that function signals that
>>>>>> it is safe for arbitrary counter bumps if called from outside vblank irq,
>>>>>> whereas a non-zero count is not safe.
>>>>>>
>>>>>> Whenever the function is called from vblank irq, we have to assume concurrent
>>>>>> readers could show up any time during its execution, even if the refcount
>>>>>> is currently zero, as vblank irqs are usually only enabled due to the
>>>>>> presence of readers, and because when it is called from vblank irq it
>>>>>> can't hold the vbl_lock to protect it from sudden bumps in vblank refcount.
>>>>>> Therefore also restrict bumps to +1 when the function is called from vblank
>>>>>> irq.
>>>>>>
>>>>>> Such bumps of more than +1 can happen at other times than reenabling
>>>>>> vblank irqs, e.g., when regular vblank interrupts get delayed by more
>>>>>> than 1 frame due to long held locks, long irq off periods, realtime
>>>>>> preemption on RT kernels, or system management interrupts.
>>>>>>
>>>>>> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
>>>>>> Cc: <stable@vger.kernel.org> # 4.4+
>>>>>> Cc: michel@daenzer.net
>>>>>> Cc: vbabka@suse.cz
>>>>>> Cc: ville.syrjala@linux.intel.com
>>>>>> Cc: daniel.vetter@ffwll.ch
>>>>>> Cc: dri-devel@lists.freedesktop.org
>>>>>> Cc: alexander.deucher@amd.com
>>>>>> Cc: christian.koenig@amd.com
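
[Editor's note: the clamping rule described in the commit message can be modeled in isolation as a pure function. Names here are illustrative, not the kernel's; the real check lives inline in drm_update_vblank_count() and operates on vblank->refcount and the DRM_CALLED_FROM_VBLIRQ flag.]

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the clamp added to drm_update_vblank_count(): a bump larger
 * than +1 is only allowed when no concurrent timestamp readers can
 * exist, i.e. the vblank refcount is at most 1 and we are not running
 * from the vblank irq handler.
 */
static uint32_t clamp_vblank_diff(uint32_t diff, int refcount,
                                  bool from_vbl_irq)
{
    if (diff > 1 && (refcount > 1 || from_vbl_irq))
        return 1; /* readers may be active: only a +1 bump is safe */
    return diff;  /* no readers possible: arbitrary bump is fine */
}
```

A refcount of exactly 1 must pass through unclamped because drm_vblank_get() -> drm_vblank_enable() itself holds that single reference while catching up the counter.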
>>>>>
>>>>> Imo this is duct-tape. If we want to fix this up properly I think we
>>>>> should just use a full-blown seqlock instead of our hand-rolled one. And
>>>>> that could handle any increment at all.
>>>>
>>>> And I even fixed this [1] almost a half a year ago when I sent the
>>>> original series, but that part got held hostage to the same seqlock
>>>> argument. Perfect is the enemy of good.
>>>>
>>>> [1] https://lists.freedesktop.org/archives/intel-gfx/2015-September/075879.html
>>>
>>> Hm yeah, that does suffer from reinventing seqlocks. But I'd prefer your
>>> patch over Mario's hack here tbh. Your patch with seqlock would be even
>>> more awesome.
>>> -Daniel
>>>
>>
>> I agree that my hack is only duct-tape. That's why the long code comment to
>> let people know under which condition they could remove it.
>>
>> Using seqlocks would be the robust long term solution. But as this is
>> supposed to be a fix for both 4.4 and 4.5, I thought that such a rewrite
>> would be too intrusive as a change, compared to this one-liner?
>>
>> The original "roll our own" seqlock look alike implementation was meant to
>> avoid/minimize taking locks, esp. with _irqsave that are taken by both
>> userspace and timing sensitive vblank irq handling code. There were various
>> locking changes since then and that advantage might have been lost already
>> quite a long time ago, so maybe switching to full seqlocks wouldn't pose
>> new performance problems there, but I haven't looked into this.
>
> Last time I've checked we've already reinvented seqlocks completely,
> except buggy since ours can't take an increment > 1. I don't expect you'll
> be able to measure anything if we switch.
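
[Editor's note: the seqlock pattern Daniel refers to replaces the hand-rolled retry with a sequence counter: the writer makes the count odd while updating, and readers retry if the count was odd or changed across the read. A minimal single-threaded userspace sketch follows; the kernel's real API is seqlock_t in <linux/seqlock.h>, and a correct SMP version additionally needs memory barriers, noted in comments.]

```c
#include <stdint.h>

struct vbl_sample {
    uint32_t seq;       /* even: data stable; odd: write in progress */
    uint64_t count;     /* software vblank counter */
    uint64_t timestamp; /* associated vblank timestamp (usecs) */
};

/* Writer side: works for any counter increment, not just +1. */
static void store_sample(struct vbl_sample *s, uint64_t count, uint64_t ts)
{
    s->seq++;           /* now odd: concurrent readers will retry */
    /* a real implementation needs a write barrier here */
    s->count = count;
    s->timestamp = ts;
    /* ...and another write barrier here */
    s->seq++;           /* even again: data stable */
}

/* Reader side: retry until a consistent snapshot is observed. */
static uint64_t read_sample(const struct vbl_sample *s, uint64_t *count)
{
    uint32_t start;
    uint64_t ts;

    do {
        start = s->seq;
        /* a real implementation needs a read barrier here */
        ts = s->timestamp;
        *count = s->count;
        /* ...and another read barrier before re-reading seq */
    } while ((start & 1) || start != s->seq);

    return ts;
}
```

Because consistency is detected by the sequence counter rather than by comparing cached counter values, a bump of +5 is exactly as safe as a bump of +1, which is why this scheme would make the clamp unnecessary.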
>
> Agree that it might be better to delay this for 4.6. So if you add a big
> "FIXME: Need to replace this hack with proper seqlocks." at the top of your
> big comment (or just as a replacement for it), then
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> But currently it looks like this is a proper long-term solution, which it
> imo isn't.
> -Daniel
>
Ok, will do.
-mario
>
>>
>> -mario
>>
>>>>
>>>>> -Daniel
>>>>>
>>>>>> ---
>>>>>> drivers/gpu/drm/drm_irq.c | 41 +++++++++++++++++++++++++++++++++++++++++
>>>>>> 1 file changed, 41 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
>>>>>> index bcb8528..aa2c74b 100644
>>>>>> --- a/drivers/gpu/drm/drm_irq.c
>>>>>> +++ b/drivers/gpu/drm/drm_irq.c
>>>>>> @@ -221,6 +221,47 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
>>>>>> diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0;
>>>>>> }
>>>>>>
>>>>>> + /*
>>>>>> + * Restrict the bump of the software vblank counter to a safe maximum
>>>>>> + * value of +1 whenever there is the possibility that concurrent readers
>>>>>> + * of vblank timestamps could be active at the moment, as the current
>>>>>> + * implementation of the timestamp caching and updating is not safe
>>>>>> + * against concurrent readers for calls to store_vblank() with a bump
>>>>>> + * of anything but +1. A bump != 1 would very likely return corrupted
>>>>>> + * timestamps to userspace, because the same slot in the cache could
>>>>>> + * be concurrently written by store_vblank() and read by one of those
>>>>>> + * readers without the read-retry logic detecting the collision.
>>>>>> + *
>>>>>> + * Concurrent readers can exist when we are called from the
>>>>>> + * drm_vblank_off() or drm_vblank_on() functions and other non-vblank-
>>>>>> + * irq callers. However, all those calls to us are happening with the
>>>>>> + * vbl_lock locked to prevent drm_vblank_get(), so the vblank refcount
>>>>>> + * can't increase while we are executing. Therefore a zero refcount at
>>>>>> + * this point is safe for arbitrary counter bumps if we are called
>>>>>> + * outside vblank irq, whereas a non-zero count is not 100% safe. Unfortunately
>>>>>> + * we must also accept a refcount of 1, as whenever we are called from
>>>>>> + * drm_vblank_get() -> drm_vblank_enable() the refcount will be 1 and
>>>>>> + * we must let that one pass through in order to not lose vblank counts
>>>>>> + * during vblank irq off - which would completely defeat the whole
>>>>>> + * point of this routine.
>>>>>> + *
>>>>>> + * Whenever we are called from vblank irq, we have to assume concurrent
>>>>>> + * readers exist or can show up any time during our execution, even if
>>>>>> + * the refcount is currently zero, as vblank irqs are usually only
>>>>>> + * enabled due to the presence of readers, and because when we are called
>>>>>> + * from vblank irq we can't hold the vbl_lock to protect us from sudden
>>>>>> + * bumps in vblank refcount. Therefore also restrict bumps to +1 when
>>>>>> + * called from vblank irq.
>>>>>> + */
>>>>>> + if ((diff > 1) && (atomic_read(&vblank->refcount) > 1 ||
>>>>>> + (flags & DRM_CALLED_FROM_VBLIRQ))) {
>>>>>> + DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diff=%u "
>>>>>> + "refcount %u, vblirq %u\n", pipe, diff,
>>>>>> + atomic_read(&vblank->refcount),
>>>>>> + (flags & DRM_CALLED_FROM_VBLIRQ) != 0);
>>>>>> + diff = 1;
>>>>>> + }
>>>>>> +
>>>>>> DRM_DEBUG_VBL("updating vblank count on crtc %u:"
>>>>>> " current=%u, diff=%u, hw=%u hw_last=%u\n",
>>>>>> pipe, vblank->count, diff, cur_vblank, vblank->last);
>>>>>> --
>>>>>> 1.9.1
>>>>>>
>>>>>
>>>>> --
>>>>> Daniel Vetter
>>>>> Software Engineer, Intel Corporation
>>>>> http://blog.ffwll.ch
>>>>
>>>> --
>>>> Ville Syrjälä
>>>> Intel OTC
>>>
>
Thread overview: 31+ messages
2016-02-08 1:13 drm vblank regression fixes for Linux 4.4+ Mario Kleiner
2016-02-08 1:13 ` [PATCH 1/6] drm: No-Op redundant calls to drm_vblank_off() Mario Kleiner
2016-02-09 9:54 ` Daniel Vetter
2016-02-09 13:27 ` Mario Kleiner
2016-02-08 1:13 ` [PATCH 2/6] drm: Prevent vblank counter bumps > 1 with active vblank clients Mario Kleiner
2016-02-09 9:56 ` Daniel Vetter
2016-02-09 10:07 ` Ville Syrjälä
2016-02-09 10:23 ` Daniel Vetter
2016-02-09 13:39 ` Mario Kleiner
2016-02-09 14:29 ` Daniel Vetter
2016-02-09 16:18 ` Mario Kleiner [this message]
2016-02-08 1:13 ` [PATCH 3/6] drm: Fix drm_vblank_pre/post_modeset regression from Linux 4.4 Mario Kleiner
2016-02-09 10:00 ` Daniel Vetter
2016-02-11 13:03 ` Vlastimil Babka
2016-02-08 1:13 ` [PATCH 4/6] drm: Fix treatment of drm_vblank_offdelay in drm_vblank_on() Mario Kleiner
2016-02-09 10:06 ` Daniel Vetter
2016-02-09 11:10 ` Ville Syrjälä
2016-02-09 13:29 ` Mario Kleiner
2016-02-09 13:41 ` Ville Syrjälä
2016-02-09 14:31 ` Daniel Vetter
2016-02-08 1:13 ` [PATCH 5/6] drm: Prevent vblank counter jumps with timestamp based update method Mario Kleiner
2016-02-09 10:09 ` Daniel Vetter
2016-02-09 13:53 ` Mario Kleiner
2016-02-09 14:11 ` Ville Syrjälä
2016-02-09 15:03 ` Daniel Vetter
2016-02-10 16:28 ` Mario Kleiner
2016-02-10 17:17 ` Daniel Vetter
2016-02-10 18:36 ` Mario Kleiner
2016-02-10 19:34 ` Daniel Vetter
2016-02-08 1:13 ` [PATCH 6/6] drm/radeon/pm: Handle failure of drm_vblank_get Mario Kleiner
2016-02-09 10:10 ` Daniel Vetter