From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: daniel.vetter@ffwll.ch, michel@daenzer.net,
	linux@bernd-steinhauser.de, stable@vger.kernel.org,
	dri-devel@lists.freedesktop.org, alexander.deucher@amd.com,
	christian.koenig@amd.com, vbabka@suse.cz
Subject: Re: [PATCH 2/6] drm: Prevent vblank counter bumps > 1 with active vblank clients.
Date: Tue, 9 Feb 2016 12:07:27 +0200
Message-ID: <20160209100727.GG23290@intel.com>
In-Reply-To: <20160209095638.GM11240@phenom.ffwll.local>

On Tue, Feb 09, 2016 at 10:56:38AM +0100, Daniel Vetter wrote:
> On Mon, Feb 08, 2016 at 02:13:25AM +0100, Mario Kleiner wrote:
> > This fixes a regression introduced by the new drm_update_vblank_count()
> > implementation in Linux 4.4:
> > 
> > Restrict the bump of the software vblank counter in drm_update_vblank_count()
> > to a safe maximum value of +1 whenever there is the possibility that
> > concurrent readers of vblank timestamps could be active at the moment,
> > as the current implementation of the timestamp caching and updating is
> > not safe against concurrent readers for calls to store_vblank() with a
> > bump of anything but +1. A bump != 1 would very likely return corrupted
> > timestamps to userspace, because the same slot in the cache could
> > be concurrently written by store_vblank() and read by one of those
> > readers in a non-atomic fashion and without the read-retry logic
> > detecting this collision.
> > 
> > Concurrent readers can exist while drm_update_vblank_count() is called
> > from the drm_vblank_off() or drm_vblank_on() functions or other non-vblank-
> > irq callers. However, all those calls are happening with the vbl_lock
> > locked thereby preventing a drm_vblank_get(), so the vblank refcount
> > can't increase while drm_update_vblank_count() is executing. Therefore
> > a zero vblank refcount during execution of that function signals that
> > it is safe for arbitrary counter bumps if called from outside vblank irq,
> > whereas a non-zero count is not safe.
> > 
> > Whenever the function is called from vblank irq, we have to assume concurrent
> > readers could show up any time during its execution, even if the refcount
> > is currently zero, as vblank irqs are usually only enabled due to the
> > presence of readers, and because when it is called from vblank irq it
> > can't hold the vbl_lock to protect it from sudden bumps in vblank refcount.
> > Therefore also restrict bumps to +1 when the function is called from vblank
> > irq.
> > 
> > Such bumps of more than +1 can happen at other times than reenabling
> > vblank irqs, e.g., when regular vblank interrupts get delayed by more
> > than 1 frame due to long held locks, long irq off periods, realtime
> > preemption on RT kernels, or system management interrupts.
> > 
> > Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> > Cc: <stable@vger.kernel.org> # 4.4+
> > Cc: michel@daenzer.net
> > Cc: vbabka@suse.cz
> > Cc: ville.syrjala@linux.intel.com
> > Cc: daniel.vetter@ffwll.ch
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: alexander.deucher@amd.com
> > Cc: christian.koenig@amd.com
> 
> Imo this is duct-tape. If we want to fix this up properly I think we
> should just use a full-blown seqlock instead of our hand-rolled one. And
> that could handle any increment at all.

And I even fixed this [1] almost half a year ago when I sent the
original series, but that part got held hostage to the same seqlock
argument. Perfect is the enemy of good.

[1] https://lists.freedesktop.org/archives/intel-gfx/2015-September/075879.html
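
For reference, the seqlock scheme under discussion can be sketched in
userspace C with C11 atomics. This is only an illustration under stated
assumptions, not the DRM code and not the kernel's seqlock API: struct
vbl_state, vbl_store() and vbl_read() are hypothetical names, and a single
writer is assumed (in the kernel the writer side would already be
serialized by a lock). The point it shows is that a sequence counter
covering both the count and the timestamp lets readers detect a collision
for any increment, not just +1.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-crtc vblank state. */
struct vbl_state {
	atomic_uint seq;         /* even: stable, odd: write in progress */
	_Atomic uint32_t count;  /* software vblank counter */
	_Atomic int64_t ts_ns;   /* timestamp belonging to count */
};

/* Writer side. Assumes exactly one writer at a time. */
static void vbl_store(struct vbl_state *s, uint32_t inc, int64_t ts_ns)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
	uint32_t c = atomic_load_explicit(&s->count, memory_order_relaxed);

	assert((seq & 1u) == 0); /* no other write may be in progress */

	/* Mark the write as in progress before touching the payload. */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&s->count, c + inc, memory_order_relaxed);
	atomic_store_explicit(&s->ts_ns, ts_ns, memory_order_relaxed);

	/* Publish: the sequence becomes even again after the payload. */
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side. Retries until it sees a stable, unchanged sequence. */
static uint32_t vbl_read(struct vbl_state *s, int64_t *ts_ns)
{
	unsigned int s0, s1;
	uint32_t count;
	int64_t ts;

	do {
		s0 = atomic_load_explicit(&s->seq, memory_order_acquire);
		count = atomic_load_explicit(&s->count, memory_order_relaxed);
		ts = atomic_load_explicit(&s->ts_ns, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s1 = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while ((s0 & 1u) || s0 != s1);

	*ts_ns = ts;
	return count;
}
```

A caller would do vbl_store(&s, diff, now) from the update path with any
diff, and vbl_read(&s, &ts) from the query path. The reader either gets a
matching count/timestamp pair or loops again.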

> -Daniel
> 
> > ---
> >  drivers/gpu/drm/drm_irq.c | 41 +++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 41 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> > index bcb8528..aa2c74b 100644
> > --- a/drivers/gpu/drm/drm_irq.c
> > +++ b/drivers/gpu/drm/drm_irq.c
> > @@ -221,6 +221,47 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
> >  		diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0;
> >  	}
> >  
> > +	/*
> > +	 * Restrict the bump of the software vblank counter to a safe maximum
> > +	 * value of +1 whenever there is the possibility that concurrent readers
> > +	 * of vblank timestamps could be active at the moment, as the current
> > +	 * implementation of the timestamp caching and updating is not safe
> > +	 * against concurrent readers for calls to store_vblank() with a bump
> > +	 * of anything but +1. A bump != 1 would very likely return corrupted
> > +	 * timestamps to userspace, because the same slot in the cache could
> > +	 * be concurrently written by store_vblank() and read by one of those
> > +	 * readers without the read-retry logic detecting the collision.
> > +	 *
> > +	 * Concurrent readers can exist when we are called from the
> > +	 * drm_vblank_off() or drm_vblank_on() functions and other non-vblank-
> > +	 * irq callers. However, all those calls to us are happening with the
> > +	 * vbl_lock locked to prevent drm_vblank_get(), so the vblank refcount
> > +	 * can't increase while we are executing. Therefore a zero refcount at
> > +	 * this point is safe for arbitrary counter bumps if we are called
> > +	 * outside vblank irq, a non-zero count is not 100% safe. Unfortunately
> > +	 * we must also accept a refcount of 1, as whenever we are called from
> > +	 * drm_vblank_get() -> drm_vblank_enable() the refcount will be 1 and
> > +	 * we must let that one pass through in order to not lose vblank counts
> > +	 * during vblank irq off - which would completely defeat the whole
> > +	 * point of this routine.
> > +	 *
> > +	 * Whenever we are called from vblank irq, we have to assume concurrent
> > +	 * readers exist or can show up any time during our execution, even if
> > +	 * the refcount is currently zero, as vblank irqs are usually only
> > +	 * enabled due to the presence of readers, and because when we are called
> > +	 * from vblank irq we can't hold the vbl_lock to protect us from sudden
> > +	 * bumps in vblank refcount. Therefore also restrict bumps to +1 when
> > +	 * called from vblank irq.
> > +	 */
> > +	if ((diff > 1) && (atomic_read(&vblank->refcount) > 1 ||
> > +	    (flags & DRM_CALLED_FROM_VBLIRQ))) {
> > +		DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diff=%u "
> > +			      "refcount %u, vblirq %u\n", pipe, diff,
> > +			      atomic_read(&vblank->refcount),
> > +			      (flags & DRM_CALLED_FROM_VBLIRQ) != 0);
> > +		diff = 1;
> > +	}
> > +
> >  	DRM_DEBUG_VBL("updating vblank count on crtc %u:"
> >  		      " current=%u, diff=%u, hw=%u hw_last=%u\n",
> >  		      pipe, vblank->count, diff, cur_vblank, vblank->last);
> > -- 
> > 1.9.1
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Ville Syrjälä
Intel OTC
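
The clamping rule from the quoted patch can be restated as a small
standalone function. This is a sketch for illustration, not the kernel
code: clamp_vblank_diff(), ts_slot(), SKETCH_CALLED_FROM_VBLIRQ and
SKETCH_RBSIZE are hypothetical names, and the two-slot ring size is an
assumption about the timestamp cache rather than something stated in this
thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the DRM_CALLED_FROM_VBLIRQ flag. */
#define SKETCH_CALLED_FROM_VBLIRQ (1u << 0)

/* Assumed number of slots in the timestamp ring buffer. */
#define SKETCH_RBSIZE 2u

static_assert(SKETCH_RBSIZE > 0, "ring must have at least one slot");

/*
 * Mirror of the patch's condition: a bump larger than 1 is clamped to 1
 * when more than one reference is held or when called from vblank irq.
 * A refcount of 0 or 1 outside vblank irq passes through unchanged.
 */
static uint32_t clamp_vblank_diff(uint32_t diff, int refcount,
				  unsigned int flags)
{
	bool from_irq = (flags & SKETCH_CALLED_FROM_VBLIRQ) != 0;

	if (diff > 1 && (refcount > 1 || from_irq))
		return 1;

	return diff;
}

/* Slot of the timestamp ring that belongs to a given counter value. */
static unsigned int ts_slot(uint32_t count)
{
	return count % SKETCH_RBSIZE;
}
```

Under the two-slot assumption, ts_slot(10) equals ts_slot(12), so a bump
of 2 writes the very slot that a reader at count 10 may be reading, while
a bump of 1 writes the other slot. clamp_vblank_diff(3, 2, 0) and
clamp_vblank_diff(3, 0, SKETCH_CALLED_FROM_VBLIRQ) both return 1, whereas
clamp_vblank_diff(3, 1, 0) returns 3, matching the
drm_vblank_get() -> drm_vblank_enable() case that the patch comment lets
through.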