[ 27/27] tty: dont deadlock while flushing workqueue

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Alan Cox <alan@linux.intel.com>,
	Peter Hurley <peter@hurleysoftware.com>,
	Bryan ODonoghue <bryan.odonoghue.lkml@nexus-software.ie>
Subject: [ 27/27] tty: dont deadlock while flushing workqueue
Date: Sun, 14 Apr 2013 19:43:26 -0700	[thread overview]
Message-ID: <20130415024233.376898638@linuxfoundation.org> (raw)
In-Reply-To: <20130415024231.351969241@linuxfoundation.org>

3.8-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

commit 852e4a8152b427c3f318bb0e1b5e938d64dcdc32 upstream.

Since commit 89c8d91e31f2 ("tty: localise the lock") I see a dead lock
in one of my dummy_hcd + g_nokia test cases. The first run was usually
okay, the second often resulted in a splat by lockdep and the third was
usually a dead lock.
Lockdep complained about tty->hangup_work and tty->legacy_mutex taken
both ways:
| ======================================================
| [ INFO: possible circular locking dependency detected ]
| 3.7.0-rc6+ #204 Not tainted
| -------------------------------------------------------
| kworker/2:1/35 is trying to acquire lock:
|  (&tty->legacy_mutex){+.+.+.}, at: [<c14051e6>] tty_lock_nested+0x36/0x80
|
| but task is already holding lock:
|  ((&tty->hangup_work)){+.+...}, at: [<c104f6e4>] process_one_work+0x124/0x5e0
|
| which lock already depends on the new lock.
|
| the existing dependency chain (in reverse order) is:
|
| -> #2 ((&tty->hangup_work)){+.+...}:
|        [<c107fe74>] lock_acquire+0x84/0x190
|        [<c104d82d>] flush_work+0x3d/0x240
|        [<c12e6986>] tty_ldisc_flush_works+0x16/0x30
|        [<c12e7861>] tty_ldisc_release+0x21/0x70
|        [<c12e0dfc>] tty_release+0x35c/0x470
|        [<c1105e28>] __fput+0xd8/0x270
|        [<c1105fcd>] ____fput+0xd/0x10
|        [<c1051dd9>] task_work_run+0xb9/0xf0
|        [<c1002a51>] do_notify_resume+0x51/0x80
|        [<c140550a>] work_notifysig+0x35/0x3b
|
| -> #1 (&tty->legacy_mutex/1){+.+...}:
|        [<c107fe74>] lock_acquire+0x84/0x190
|        [<c140276c>] mutex_lock_nested+0x6c/0x2f0
|        [<c14051e6>] tty_lock_nested+0x36/0x80
|        [<c1405279>] tty_lock_pair+0x29/0x70
|        [<c12e0bb8>] tty_release+0x118/0x470
|        [<c1105e28>] __fput+0xd8/0x270
|        [<c1105fcd>] ____fput+0xd/0x10
|        [<c1051dd9>] task_work_run+0xb9/0xf0
|        [<c1002a51>] do_notify_resume+0x51/0x80
|        [<c140550a>] work_notifysig+0x35/0x3b
|
| -> #0 (&tty->legacy_mutex){+.+.+.}:
|        [<c107f3c9>] __lock_acquire+0x1189/0x16a0
|        [<c107fe74>] lock_acquire+0x84/0x190
|        [<c140276c>] mutex_lock_nested+0x6c/0x2f0
|        [<c14051e6>] tty_lock_nested+0x36/0x80
|        [<c140523f>] tty_lock+0xf/0x20
|        [<c12df8e4>] __tty_hangup+0x54/0x410
|        [<c12dfcb2>] do_tty_hangup+0x12/0x20
|        [<c104f763>] process_one_work+0x1a3/0x5e0
|        [<c104fec9>] worker_thread+0x119/0x3a0
|        [<c1055084>] kthread+0x94/0xa0
|        [<c140ca37>] ret_from_kernel_thread+0x1b/0x28
|
|other info that might help us debug this:
|
|Chain exists of:
|  &tty->legacy_mutex --> &tty->legacy_mutex/1 --> (&tty->hangup_work)
|
| Possible unsafe locking scenario:
|
|       CPU0                    CPU1
|       ----                    ----
|  lock((&tty->hangup_work));
|                               lock(&tty->legacy_mutex/1);
|                               lock((&tty->hangup_work));
|  lock(&tty->legacy_mutex);
|
| *** DEADLOCK ***

Before the path mentioned tty_ldisc_release() look like this:

|	tty_ldisc_halt(tty);
|	tty_ldisc_flush_works(tty);
|	tty_lock();

As it can be seen, it first flushes the workqueue and then grabs the
tty_lock. Now we grab the lock first:

|	tty_lock_pair(tty, o_tty);
|	tty_ldisc_halt(tty);
|	tty_ldisc_flush_works(tty);

so lockdep's complaint seems valid.

The earlier version of this patch took the ldisc_mutex since the other
user of tty_ldisc_flush_works() (tty_set_ldisc()) did this.
Peter Hurley then said that it is should not be requried. Since it
wasn't done earlier, I dropped this part.
The code under tty_ldisc_kill() was executed earlier with the tty lock
taken so it is taken again.

I was able to reproduce the deadlock on v3.8-rc1, this patch fixes the
problem in my testcase. I didn't notice any problems so far.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Bryan O'Donoghue <bryan.odonoghue.lkml@nexus-software.ie>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/tty/tty_ldisc.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/drivers/tty/tty_ldisc.c
+++ b/drivers/tty/tty_ldisc.c
@@ -934,17 +934,17 @@ void tty_ldisc_release(struct tty_struct
 	 * race with the set_ldisc code path.
 	 */
 
-	tty_lock_pair(tty, o_tty);
 	tty_ldisc_halt(tty);
-	tty_ldisc_flush_works(tty);
-	if (o_tty) {
+	if (o_tty)
 		tty_ldisc_halt(o_tty);
+
+	tty_ldisc_flush_works(tty);
+	if (o_tty)
 		tty_ldisc_flush_works(o_tty);
-	}
 
+	tty_lock_pair(tty, o_tty);
 	/* This will need doing differently if we need to lock */
 	tty_ldisc_kill(tty);
-
 	if (o_tty)
 		tty_ldisc_kill(o_tty);

next prev parent reply	other threads:[~2013-04-15  2:44 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-15  2:42 [ 00/27] 3.8.8-stable review Greg Kroah-Hartman
2013-04-15  2:43 ` [ 01/27] ALSA: usb-audio: fix endianness bug in snd_nativeinstruments_* Greg Kroah-Hartman
2013-04-15  2:43 ` [ 02/27] ASoC: core: Fix to check return value of snd_soc_update_bits_locked() Greg Kroah-Hartman
2013-04-15  2:43 ` [ 03/27] ASoC: wm5102: Correct lookup of arizona struct in SYSCLK event Greg Kroah-Hartman
2013-04-15  2:43 ` [ 04/27] ASoC: wm8903: Fix the bypass to HP/LINEOUT when no DAC or ADC is running Greg Kroah-Hartman
2013-04-15  2:43 ` [ 05/27] tracing: Fix double free when function profile init failed Greg Kroah-Hartman
2013-04-15  2:43 ` [ 06/27] ARM: Kirkwood: Fix typo in the definition of ix2-200 rebuild LED Greg Kroah-Hartman
2013-04-15  2:43 ` [ 07/27] ARM: imx35 Bugfix admux clock Greg Kroah-Hartman
2013-04-15  2:43 ` [ 08/27] dmaengine: omap-dma: Start DMA without delay for cyclic channels Greg Kroah-Hartman
2013-04-15  2:43 ` [ 09/27] PM / reboot: call syscore_shutdown() after disable_nonboot_cpus() Greg Kroah-Hartman
2013-04-15  2:43 ` [ 10/27] Revert "brcmsmac: support 4313iPA" Greg Kroah-Hartman
2013-04-15  2:43 ` [ 11/27] ipc: set msg back to -EAGAIN if copy wasnt performed Greg Kroah-Hartman
2013-04-15  2:43 ` [ 12/27] GFS2: Fix unlock of fcntl locks during withdrawn state Greg Kroah-Hartman
2013-04-15  2:43 ` [ 13/27] GFS2: return error if malloc failed in gfs2_rs_alloc() Greg Kroah-Hartman
2013-04-15  2:43 ` [ 14/27] SCSI: libsas: fix handling vacant phy in sas_set_ex_phy() Greg Kroah-Hartman
2013-04-15  2:43 ` [ 15/27] cifs: Allow passwords which begin with a delimitor Greg Kroah-Hartman
2013-04-15  2:43 ` [ 16/27] target: Fix incorrect fallthrough of ALUA Standby/Offline/Transition CDBs Greg Kroah-Hartman
2013-04-15  2:43 ` [ 17/27] vfs: Revert spurious fix to spinning prevention in prune_icache_sb Greg Kroah-Hartman
2013-04-15  2:43 ` [ 18/27] kobject: fix kset_find_obj() race with concurrent last kobject_put() Greg Kroah-Hartman
2013-04-15  2:43 ` [ 19/27] gpio: fix wrong checking condition for gpio range Greg Kroah-Hartman
2013-04-15  2:43 ` [ 20/27] x86-32: Fix possible incomplete TLB invalidate with PAE pagetables Greg Kroah-Hartman
2013-04-15  2:43 ` [ 21/27] tracing: Fix possible NULL pointer dereferences Greg Kroah-Hartman
2013-04-15  2:43 ` [ 22/27] udl: handle EDID failure properly Greg Kroah-Hartman
2013-04-15  2:43 ` [ 23/27] ftrace: Move ftrace_filter_lseek out of CONFIG_DYNAMIC_FTRACE section Greg Kroah-Hartman
2013-04-15  2:43 ` [ 24/27] sched_clock: Prevent 64bit inatomicity on 32bit systems Greg Kroah-Hartman
2013-04-15  2:43 ` [ 25/27] x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates Greg Kroah-Hartman
2013-04-15  2:43 ` [ 26/27] x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal Greg Kroah-Hartman
2013-04-15  2:43 ` Greg Kroah-Hartman [this message]
2013-04-15 19:37   ` [ 27/27] tty: dont deadlock while flushing workqueue Peter Hurley
2013-04-15 19:50     ` Greg Kroah-Hartman
2013-04-15 14:05 ` [ 00/27] 3.8.8-stable review Shuah Khan
2013-04-15 16:07   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130415024233.376898638@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=alan@linux.intel.com \
    --cc=bigeasy@linutronix.de \
    --cc=bryan.odonoghue.lkml@nexus-software.ie \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peter@hurleysoftware.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox