From: Peter Wu <peter-VTkQYDcBqhK7DlmcbJSQ7g@public.gmane.org>
To: Daniel Vetter <daniel-/w4YWyX8dFk@public.gmane.org>
Cc: Dave Airlie <airlied-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Chris Wilson
<chris-Y6uKTt2uX1cEflXRtASbqLVCufUGDwFn@public.gmane.org>,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Re: [PATCH] drm/nouveau/fbcon: fix deadlock with FBIOPUT_CON2FBMAP
Date: Fri, 15 Jul 2016 13:10:51 +0200 [thread overview]
Message-ID: <20160715111051.GA2632@al> (raw)
In-Reply-To: <20160713145718.GP23520-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
On Wed, Jul 13, 2016 at 04:57:19PM +0200, Daniel Vetter wrote:
> On Wed, Jul 13, 2016 at 02:40:50PM +0200, Peter Wu wrote:
> > On Wed, Jul 13, 2016 at 11:54:49AM +0200, Daniel Vetter wrote:
> > > On Tue, Jul 12, 2016 at 06:49:34PM +0200, Peter Wu wrote:
> > > > The FBIOPUT_CON2FBMAP ioctl takes a console_lock(). When this is called
> > > > while nouveau was runtime suspended, a deadlock would occur due to
> > > > nouveau_fbcon_set_suspend also trying to obtain console_lock().
> > > >
> > > > Fix this by delaying the drm_fb_helper_set_suspend call. Based on the
> > > > i915 code (which was done for performance reasons though).
> > > >
> > > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > Signed-off-by: Peter Wu <peter@lekensteyn.nl>
> > > > ---
> > > > Tested on top of v4.7-rc5, the deadlock is gone.
> > >
> > > If we bother with this, it should imo be moved into the drm_fb_helper.c
> > > function drm_fb_helper_set_suspend(). But this also smells like some kind
> > > of bad duct-tape. I think Lukas is working on some other rpm vs. fbdev
> > > deadlocks, maybe we could fix them all with one proper fix? I've made some
> > > comments on Lukas' last patch series.
> >
> > This patch is only needed for drivers that use console_lock (for
> > drm_fb_helper_set_suspend) in their runtime resume functions.
> > Lukas posted fixes for runtime PM reference leaks, those are different
> > from this deadlock (see
> > https://lists.freedesktop.org/archives/dri-devel/2016-July/113005.html
> > for a backtrace for this issue).
> >
> > The deadlock could also be avoided if the device backing the fbcon is
> > somehow runtime-resumed outside the lock, but that feels like a larger
> > hack that does not seem easy.
> >
> > The i915 patch was done to reduce resume time (due to console_lock
> > contention), that feature seems useful to all other drivers too even if
> > the deadlock is fixed in a different way.
>
> I might have imagined something, but I thought Lukas is already working on
> some rpm vs. vga_switcheroo inversions. But looking again this was on the
> audio side.
>
> I think the proper solution for fbcon would be for the fbdev emulation to
> also hold a runtime pm references when the console is in use.
nouveau does this by adding a fb_open and fb_release function that calls
pm_runtime_get and pm_runtime_put respectively (and is the only driver
doing this). So that is why it causes the console_lock issue for
nouveau, but not for others.
> This should already happen correctly for vblank, the more tricky part
> is fbdev mmap and fbcon:
>
> - I have no idea, but there should be a way to intercept fbdev userspace
> mmaps and make sure that as long as an app has the fbdev mmapped we
> don't runtime suspend. No one really should be doing that (at least for
> normal setups), hence this shouldn't result in unsafe.
mmap normally needs a fd right? When the chardev /dev/fbX is opened, the
fb_open function will be called. So nouveau should not have this issue
with mmap/read/write to a fb while the device is suspended.
(This RPM thing was added in f231976c2e89 ("drm/nouveau/fbcon: take
runpm reference when userspace has an open fd"), maybe it is not a bad
idea for other drivers with RPM support either.)
> - For fbcon, we could suspend it in the dpms off callbacks (maybe with a
> timer), and resume it only when enabling fbcon again - fbcon needs to
> redraw anyway on dpms on.
Would this guarantee that the fb is suspended before/during suspend (of
the graphics device) and resumed somewhere during/after resume?
Suspending the fb also has the effect that reads/writes to /dev/fbN
fail, maybe that is a bit too restricted since the framebuffer is just
accessible until the device is suspended.
(Hmm, fb_read/fb_write does not seem to do any locking...)
> Another solution for fbcon might be to untangle the suspend/resume stuff
> and protect it by something else than console_lock. But that means
> fixing up fbcon locking horror shows.
console_lock seems needed for some code down the call stack, removing it
risks some blow ups.
Some archaeology: this locking problem was introduced with 054430e773c9
("fbcon: fix locking harder"). In the past fb_set_suspend also took the
fb_info lock but that was removed in 9e769ff3f585 ("fb: avoid possible
deadlock caused by fb_set_suspend").
Peter
> > My current plan is to move stuff out of the lock and allow (just)
> > resuming the console to be delayed. Some drivers (nouveau,
> > radeon/amdgpu, i915) do unnecessary stuff under the console lock:
> >
> > - nouveau: I *think* that cleraing/setting FBINFO_HWACCEL_DISABLED
> > (nouveau_fbcon_accel_restore) is safe outside the lock as the fb is
> > already suspended before clearing/after setting the flag.
> > - radeon: since the console is suspended, I don't think that that all
> > of the code is radeon_resume_kms is really needed.
> > - amdgpu: same as radeon. Btw, console_lock is leaked on an error path.
> > - i915: I think that clearing the fb memory can be done outside the
> > lock too as the console is suspended.
> >
> > Please correct me if my assumptions are flawed.
>
> Yeah, fixing this independent issues should definitely help, irrespective
> of how we fix fb_set_suspend.
>
> > > Besides this, when fixing a deadlock pls provide more details about the
> > > precise callchain and the locks involved in the deadlock. If you
> > > discovered this using lockdep, then just add the entire lockdep splat to
> > > the commit message. Otherwise there's lots of guesswork involved here.
> > > -Daniel
> >
> > There was no lockdep splat, it was triggered via the ioctl in the commit
> > message. I'll include the verbose trace from the previous mail in the
> > next proposed patch to reduce hunting though.
>
> Sounds good too.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau
next prev parent reply other threads:[~2016-07-15 11:10 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-12 16:49 [PATCH] drm/nouveau/fbcon: fix deadlock with FBIOPUT_CON2FBMAP Peter Wu
[not found] ` <20160712164934.1390-1-peter-VTkQYDcBqhK7DlmcbJSQ7g@public.gmane.org>
2016-07-12 19:16 ` Lukas Wunner
[not found] ` <20160712191622.GA5476-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
2016-07-12 20:18 ` Peter Wu
2016-07-13 9:54 ` Daniel Vetter
[not found] ` <20160713095449.GZ23520-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
2016-07-13 12:40 ` Peter Wu
2016-07-13 14:57 ` Daniel Vetter
[not found] ` <20160713145718.GP23520-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
2016-07-15 11:10 ` Peter Wu [this message]
2016-07-18 6:48 ` Daniel Vetter
2016-07-13 17:17 ` Chris Wilson
[not found] ` <20160713171747.GM6157-aII6DKEyn0pWYbfKqPwjAkR8Iwp7RQ6xAL8bYrjMMd8@public.gmane.org>
2016-07-15 11:26 ` Peter Wu
2016-07-15 11:34 ` Chris Wilson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160715111051.GA2632@al \
--to=peter-vtkqydcbqhk7dlmcbjsq7g@public.gmane.org \
--cc=airlied-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=chris-Y6uKTt2uX1cEflXRtASbqLVCufUGDwFn@public.gmane.org \
--cc=daniel-/w4YWyX8dFk@public.gmane.org \
--cc=dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org \
--cc=nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.