public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
	LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Intel Graphics Development <intel-gfx@lists.freedesktop.org>,
	DRI Development <dri-devel@lists.freedesktop.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>
Subject: Re: [Intel-gfx] [PATCH] console: implement lockdep support for console_lock
Date: Mon, 24 Sep 2012 13:36:46 +0200	[thread overview]
Message-ID: <20120924113646.GB3824@bremse> (raw)
In-Reply-To: <20120922200629.GC14004@kroah.com>

On Sat, Sep 22, 2012 at 01:06:29PM -0700, Greg KH wrote:
> On Sat, Sep 22, 2012 at 07:52:11PM +0200, Daniel Vetter wrote:
> > Dave Airlie recently discovered a locking bug in the fbcon layer,
> > where a timer_del_sync (for the blinking cursor) deadlocks with the
> > timer itself, since both (want to) hold the console_lock:
> > 
> > https://lkml.org/lkml/2012/8/21/36
> > 
> > Unfortunately the console_lock isn't a plain mutex and hence has no
> > lockdep support. Which resulted in a few days wasted of tracking down
> > this bug (complicated by the fact that printk doesn't show anything
> > when the console is locked) instead of noticing the bug much earlier
> > with the lockdep splat.
> > 
> > Hence I've figured I need to fix that for the next deadlock involving
> > console_lock - and with kms/drm growing ever more complex locking
> > that'll eventually happen.
> > 
> > Now the console_lock has rather funky semantics, so after a quick irc
> > discussion with Thomas Gleixner and Dave Airlie I've quickly ditched
> > the original idead of switching to a real mutex (since it won't work)
> > and instead opted to annotate the console_lock with lockdep
> > information manually.
> > 
> > There are a few special cases:
> > - The console_lock state is protected by the console_sem, and usually
> >   grabbed/dropped at _lock/_unlock time. But the suspend/resume code
> >   drops the semaphore without dropping the console_lock (see
> >   suspend_console/resume_console). But since the same thread that did
> >   the suspend will do the resume, we don't need to fix up anything.
> > 
> > - In the printk code there's a special trylock, only used to kick off
> >   the logbuffer printk'ing in console_unlock. But all that happens
> >   while lockdep is disable (since printk does a few other evil
> >   tricks). So no issue there, either.
> > 
> > - The console_lock can also be acquired form irq context (but only
> >   with a trylock). lockdep already handles that.
> > 
> > This all leaves us with annotating the normal console_lock, _unlock
> > and _trylock functions.
> > 
> > And yes, it works - simply unloading a drm kms driver resulted in
> > lockdep complaining about the deadlock in fbcon_deinit:
> > 
> > ======================================================
> > [ INFO: possible circular locking dependency detected ]
> > 3.6.0-rc2+ #552 Not tainted
> > -------------------------------------------------------
> > kms-reload/3577 is trying to acquire lock:
> >  ((&info->queue)){+.+...}, at: [<ffffffff81058c70>] wait_on_work+0x0/0xa7
> > 
> > but task is already holding lock:
> >  (console_lock){+.+.+.}, at: [<ffffffff81264686>] bind_con_driver+0x38/0x263
> > 
> > which lock already depends on the new lock.
> > 
> > the existing dependency chain (in reverse order) is:
> > 
> > -> #1 (console_lock){+.+.+.}:
> >        [<ffffffff81087440>] lock_acquire+0x95/0x105
> >        [<ffffffff81040190>] console_lock+0x59/0x5b
> >        [<ffffffff81209cb6>] fb_flashcursor+0x2e/0x12c
> >        [<ffffffff81057c3e>] process_one_work+0x1d9/0x3b4
> >        [<ffffffff810584a2>] worker_thread+0x1a7/0x24b
> >        [<ffffffff8105ca29>] kthread+0x7f/0x87
> >        [<ffffffff813b1204>] kernel_thread_helper+0x4/0x10
> > 
> > -> #0 ((&info->queue)){+.+...}:
> >        [<ffffffff81086cb3>] __lock_acquire+0x999/0xcf6
> >        [<ffffffff81087440>] lock_acquire+0x95/0x105
> >        [<ffffffff81058cab>] wait_on_work+0x3b/0xa7
> >        [<ffffffff81058dd6>] __cancel_work_timer+0xbf/0x102
> >        [<ffffffff81058e33>] cancel_work_sync+0xb/0xd
> >        [<ffffffff8120a3b3>] fbcon_deinit+0x11c/0x1dc
> >        [<ffffffff81264793>] bind_con_driver+0x145/0x263
> >        [<ffffffff81264a45>] unbind_con_driver+0x14f/0x195
> >        [<ffffffff8126540c>] store_bind+0x1ad/0x1c1
> >        [<ffffffff8127cbb7>] dev_attr_store+0x13/0x1f
> >        [<ffffffff8116d884>] sysfs_write_file+0xe9/0x121
> >        [<ffffffff811145b2>] vfs_write+0x9b/0xfd
> >        [<ffffffff811147b7>] sys_write+0x3e/0x6b
> >        [<ffffffff813b0039>] system_call_fastpath+0x16/0x1b
> > 
> > other info that might help us debug this:
> > 
> >  Possible unsafe locking scenario:
> > 
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(console_lock);
> >                                lock((&info->queue));
> >                                lock(console_lock);
> >   lock((&info->queue));
> > 
> >  *** DEADLOCK ***
> > 
> > v2: Mark the lockdep_map static, noticed by Jani Nikula.
> > 
> > Cc: Dave Airlie <airlied@gmail.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
> > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> >  kernel/printk.c |    9 +++++++++
> >  1 file changed, 9 insertions(+)
> 
> So I'm guessing I should take this through the tty tree, right?  Any
> objections to that for 3.7?

I didn't know who would be the relevant maintainer, so just spammed a few
people. Would be awesome if you could merge these patches for 3.7, and at
least Alan Cox seems to like them:

http://marc.info/?l=linux-fbdev&m=134564125601147&w=1

Thanks, Daniel
> 
> thanks,
> 
> greg k-h

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

  reply	other threads:[~2012-09-24 11:36 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-17 23:03 [PATCH 1/2] [RESEND] console: use might_sleep in console_lock Daniel Vetter
2012-09-17 23:03 ` [PATCH 2/2] [RESEND] console: implement lockdep support for console_lock Daniel Vetter
2012-09-18  7:33   ` Jani Nikula
2012-09-19  7:30     ` Daniel Vetter
2012-09-22 17:52     ` [PATCH] " Daniel Vetter
2012-09-22 20:06       ` [Intel-gfx] " Greg KH
2012-09-24 11:36         ` Daniel Vetter [this message]
2012-10-02 12:56         ` Daniel Vetter
2012-10-02 13:28           ` Greg KH
2012-10-02 13:31             ` Daniel Vetter
2012-09-24 12:17   ` [PATCH 2/2] [RESEND] " Peter Zijlstra
2012-09-24 12:24     ` Peter Zijlstra
2012-09-24 12:54       ` Daniel Vetter
2012-09-24 13:10         ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120924113646.GB3824@bremse \
    --to=daniel@ffwll.ch \
    --cc=a.p.zijlstra@chello.nl \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox