All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Galbraith <efault@gmx.de>
To: Dave Jones <davej@codemonkey.org.uk>,
	Linux Kernel <linux-kernel@vger.kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Jiri Slaby <jslaby@suse.com>,
	Vegard Nossum <vegard.nossum@oracle.com>
Subject: [bisected] Re: tty lockdep trace
Date: Sat, 03 Jun 2017 08:33:52 +0200	[thread overview]
Message-ID: <1496471632.11868.1.camel@gmx.de> (raw)
In-Reply-To: <20170531172117.nj5fwgqspibg7rs6@codemonkey.org.uk>

On Wed, 2017-05-31 at 13:21 -0400, Dave Jones wrote:
> Just hit this during a trinity run.

925bb1ce47f429f69aad35876df7ecd8c53deb7e is the first bad commit
commit 925bb1ce47f429f69aad35876df7ecd8c53deb7e
Author: Vegard Nossum <vegard.nossum@oracle.com>
Date:   Thu May 11 12:18:52 2017 +0200

    tty: fix port buffer locking
    
    tty_insert_flip_string_fixed_flag() is racy against itself when called
    from the ioctl(TCXONC, TCION/TCIOFF) path [1] and the flush_to_ldisc()
    workqueue path [2].
    
    The problem is that port->buf.tail->used is modified without consistent
    locking; the ioctl path takes tty->atomic_write_lock, whereas the workqueue
    path takes ldata->output_lock.
    
    We cannot simply take ldata->output_lock, since that is specific to the
    N_TTY line discipline.
    
    It might seem natural to try to take port->buf.lock inside
    tty_insert_flip_string_fixed_flag() and friends (where port->buf is
    actually used/modified), but this creates problems for flush_to_ldisc()
    which takes it before grabbing tty->ldisc_sem, o_tty->termios_rwsem,
    and ldata->output_lock.
    
    Therefore, the simplest solution for now seems to be to take
    tty->atomic_write_lock inside tty_port_default_receive_buf(). This lock
    is also used in the write path [3] with a consistent ordering.
    
    [1]: Call Trace:
     tty_insert_flip_string_fixed_flag
     pty_write
     tty_send_xchar                     // down_read(&o_tty->termios_rwsem)
                                        // mutex_lock(&tty->atomic_write_lock)
     n_tty_ioctl_helper
     n_tty_ioctl
     tty_ioctl                          // down_read(&tty->ldisc_sem)
     do_vfs_ioctl
     SyS_ioctl
    
    [2]: Workqueue: events_unbound flush_to_ldisc
    Call Trace:
     tty_insert_flip_string_fixed_flag
     pty_write
     tty_put_char
     __process_echoes
     commit_echoes                      // mutex_lock(&ldata->output_lock)
     n_tty_receive_buf_common
     n_tty_receive_buf2
     tty_ldisc_receive_buf              // down_read(&o_tty->termios_rwsem)
     tty_port_default_receive_buf       // down_read(&tty->ldisc_sem)
     flush_to_ldisc                     // mutex_lock(&port->buf.lock)
     process_one_work
    
    [3]: Call Trace:
     tty_insert_flip_string_fixed_flag
     pty_write
     n_tty_write                        // mutex_lock(&ldata->output_lock)
                                        // down_read(&tty->termios_rwsem)
     do_tty_write (inline)              // mutex_lock(&tty->atomic_write_lock)
     tty_write                          // down_read(&tty->ldisc_sem)
     __vfs_write
     vfs_write
     SyS_write
    
    The bug can result in about a dozen different crashes depending on what
    exactly gets corrupted when port->buf.tail->used points outside the
    buffer.
    
    The patch passes my LOCKDEP/PROVE_LOCKING testing but more testing is
    always welcome.
    
    Found using syzkaller.
    
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

:040000 040000 6df9d7d2f22ff6ac1ad265869c1cb0d15621a0eb 56e5e66a5e451bf55857025899f4a092b51dca6e M      drivers

git bisect start
# good: [a351e9b9fc24e982ec2f0e76379a49826036da12] Linux 4.11
git bisect good a351e9b9fc24e982ec2f0e76379a49826036da12
# bad: [104c08ba8e921ef97abfdc10408d54921a6d9003] Merge tag 'acpi-4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect bad 104c08ba8e921ef97abfdc10408d54921a6d9003
# good: [16a12fa9aed176444fc795b09e796be41902bb08] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
git bisect good 16a12fa9aed176444fc795b09e796be41902bb08
# good: [2d3e4866dea96b0506395b47bfefb234f2088dac] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
git bisect good 2d3e4866dea96b0506395b47bfefb234f2088dac
# good: [f94c128eefcce2e3448d543f13cd7d7b8aa660a5] Merge tag 'metag-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag
git bisect good f94c128eefcce2e3448d543f13cd7d7b8aa660a5
# good: [18365225f0440d09708ad9daade2ec11275c3df9] hwpoison, memcg: forcibly uncharge LRU pages
git bisect good 18365225f0440d09708ad9daade2ec11275c3df9
# good: [5bbecdbc8e7ffaaf47ac1f02014bf3bedda3fd11] nvme_fc: Support ctrl_loss_tmo
git bisect good 5bbecdbc8e7ffaaf47ac1f02014bf3bedda3fd11
# good: [d024baa58a4a7e5eb6058017771d15b9e47b56db] Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good d024baa58a4a7e5eb6058017771d15b9e47b56db
# bad: [393bcfaeb8be7f46a4cd7d673e33541ebee76b12] Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
git bisect bad 393bcfaeb8be7f46a4cd7d673e33541ebee76b12
# bad: [b0f5a8f32e8bbdaae1abb8abe2d3cbafaba57e08] kthread: fix boot hang (regression) on MIPS/OpenRISC
git bisect bad b0f5a8f32e8bbdaae1abb8abe2d3cbafaba57e08
# good: [6f68a6ae1f0ea2fd87e26dc345866e928b5850a8] Merge tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
git bisect good 6f68a6ae1f0ea2fd87e26dc345866e928b5850a8
# bad: [da4c279942b05727088774df224c0734688b4cbc] serial: enable serdev support
git bisect bad da4c279942b05727088774df224c0734688b4cbc
# good: [5c9d6abed9e0a061de252a53ab687a1171502e81] serial: altera_jtaguart: adding iounmap()
git bisect good 5c9d6abed9e0a061de252a53ab687a1171502e81
# bad: [925bb1ce47f429f69aad35876df7ecd8c53deb7e] tty: fix port buffer locking
git bisect bad 925bb1ce47f429f69aad35876df7ecd8c53deb7e
# good: [11d4d32158eeaf36fe1073a4a260193d7a19ccf1] tty: ehv_bytechan: clean up init error handling
git bisect good 11d4d32158eeaf36fe1073a4a260193d7a19ccf1
# first bad commit: [925bb1ce47f429f69aad35876df7ecd8c53deb7e] tty: fix port buffer locking

[   56.727594] ======================================================
[   56.728482] WARNING: possible circular locking dependency detected
[   56.729318] 4.12.0-master #108 Tainted: G            E  
[   56.730125] ------------------------------------------------------
[   56.731024] kworker/u8:3/329 is trying to acquire lock:
[   56.731774]  (&buf->lock){+.+...}, at: [<ffffffff81423e97>] tty_buffer_flush+0x37/0xa0
[   56.732773] 
               but task is already holding lock:
[   56.733444]  (&o_tty->termios_rwsem/1){++++..}, at: [<ffffffff8141fb91>] isig+0x41/0xf0
[   56.734303] 
               which lock already depends on the new lock.

[   56.734868] 
               the existing dependency chain (in reverse order) is:
[   56.735377] 
               -> #2 (&o_tty->termios_rwsem/1){++++..}:
[   56.735799]        lock_acquire+0xbd/0x200
[   56.736071]        down_read+0x4c/0x70
[   56.736323]        n_tty_write+0x90/0x450
[   56.736584]        tty_write+0x159/0x2e0
[   56.736844]        __vfs_write+0x28/0x140
[   56.737119]        vfs_write+0xc4/0x1c0
[   56.737376]        redirected_tty_write+0x5c/0xa0
[   56.737681]        __do_readv_writev+0x2b4/0x350
[   56.737980]        do_readv_writev+0x63/0x80
[   56.738263]        vfs_writev+0x3c/0x50
[   56.738513]        do_writev+0x55/0xd0
[   56.738758]        SyS_writev+0x10/0x20
[   56.739009]        entry_SYSCALL_64_fastpath+0x1f/0xbe
[   56.739345] 
               -> #1 (&tty->atomic_write_lock){+.+.+.}:
[   56.739760]        lock_acquire+0xbd/0x200
[   56.740026]        __mutex_lock+0x75/0x8e0
[   56.740297]        mutex_lock_nested+0x1b/0x20
[   56.740585]        tty_port_default_receive_buf+0x48/0x90
[   56.740932]        flush_to_ldisc+0x7f/0xa0
[   56.741233]        process_one_work+0x1c5/0x5c0
[   56.741528]        worker_thread+0x4d/0x3c0
[   56.741800]        kthread+0x114/0x150
[   56.742046]        ret_from_fork+0x2a/0x40
[   56.742318] 
               -> #0 (&buf->lock){+.+...}:
[   56.742664]        __lock_acquire+0x13e1/0x1400
[   56.742957]        lock_acquire+0xbd/0x200
[   56.743228]        __mutex_lock+0x75/0x8e0
[   56.743496]        mutex_lock_nested+0x1b/0x20
[   56.743784]        tty_buffer_flush+0x37/0xa0
[   56.744068]        pty_flush_buffer+0x27/0x80
[   56.744462]        tty_driver_flush_buffer+0x1b/0x20
[   56.744776]        isig+0x95/0xf0
[   56.744999]        n_tty_receive_signal_char+0x1c/0x70
[   56.745338]        n_tty_receive_char_special+0x863/0xb00
[   56.745660]        n_tty_receive_buf_common+0x7ae/0xac0
[   56.746013]        n_tty_receive_buf2+0x14/0x20
[   56.746311]        tty_ldisc_receive_buf+0x23/0x50
[   56.746598]        tty_port_default_receive_buf+0x59/0x90
[   56.746960]        flush_to_ldisc+0x7f/0xa0
[   56.747237]        process_one_work+0x1c5/0x5c0
[   56.747509]        worker_thread+0x4d/0x3c0
[   56.747761]        kthread+0x114/0x150
[   56.748016]        ret_from_fork+0x2a/0x40
[   56.748300] 
               other info that might help us debug this:

[   56.748835] Chain exists of:
                 &buf->lock --> &tty->atomic_write_lock --> &o_tty->termios_rwsem/1

[   56.749581]  Possible unsafe locking scenario:

[   56.749994]        CPU0                    CPU1
[   56.750294]        ----                    ----
[   56.750564]   lock(&o_tty->termios_rwsem/1);
[   56.750855]                                lock(&tty->atomic_write_lock);
[   56.751314]                                lock(&o_tty->termios_rwsem/1);
[   56.751808]   lock(&buf->lock);
[   56.752021] 
                *** DEADLOCK ***

[   56.752380] 6 locks held by kworker/u8:3/329:
[   56.752641]  #0:  ("events_unbound"){.+.+.+}, at: [<ffffffff8108861f>] process_one_work+0x13f/0x5c0
[   56.753203]  #1:  ((&buf->work)){+.+...}, at: [<ffffffff8108861f>] process_one_work+0x13f/0x5c0
[   56.753717]  #2:  (&port->buf.lock/1){+.+...}, at: [<ffffffff81423935>] flush_to_ldisc+0x25/0xa0
[   56.754243]  #3:  (&tty->ldisc_sem){++++.+}, at: [<ffffffff81422ccf>] tty_ldisc_ref+0x1f/0x60
[   56.754748]  #4:  (&tty->atomic_write_lock){+.+.+.}, at: [<ffffffff814242b8>] tty_port_default_receive_buf+0x48/0x90
[   56.755373]  #5:  (&o_tty->termios_rwsem/1){++++..}, at: [<ffffffff8141fb91>] isig+0x41/0xf0
[   56.755875] 
               stack backtrace:
[   56.756139] CPU: 2 PID: 329 Comm: kworker/u8:3 Tainted: G            E   4.12.0-master #108
[   56.756637] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
[   56.757327] Workqueue: events_unbound flush_to_ldisc
[   56.757629] Call Trace:
[   56.757830]  dump_stack+0x85/0xc9
[   56.758048]  print_circular_bug+0x1f9/0x207
[   56.758315]  __lock_acquire+0x13e1/0x1400
[   56.758564]  ? check_irq_usage+0x83/0xb0
[   56.758882]  lock_acquire+0xbd/0x200
[   56.759113]  ? tty_buffer_flush+0x37/0xa0
[   56.759463]  ? tty_buffer_flush+0x37/0xa0
[   56.759737]  __mutex_lock+0x75/0x8e0
[   56.759998]  ? tty_buffer_flush+0x37/0xa0
[   56.760250]  ? isig+0x5e/0xf0
[   56.760433]  ? tty_buffer_flush+0x37/0xa0
[   56.760677]  ? find_held_lock+0x35/0xa0
[   56.760957]  ? isig+0x8d/0xf0
[   56.761140]  mutex_lock_nested+0x1b/0x20
[   56.761383]  tty_buffer_flush+0x37/0xa0
[   56.761617]  pty_flush_buffer+0x27/0x80
[   56.761870]  tty_driver_flush_buffer+0x1b/0x20
[   56.762162]  isig+0x95/0xf0
[   56.762340]  n_tty_receive_signal_char+0x1c/0x70
[   56.762619]  n_tty_receive_char_special+0x863/0xb00
[   56.762960]  n_tty_receive_buf_common+0x7ae/0xac0
[   56.763250]  ? tty_ldisc_ref+0x1f/0x60
[   56.763478]  n_tty_receive_buf2+0x14/0x20
[   56.763721]  tty_ldisc_receive_buf+0x23/0x50
[   56.763983]  ? mutex_lock_nested+0x1b/0x20
[   56.764238]  tty_port_default_receive_buf+0x59/0x90
[   56.764533]  flush_to_ldisc+0x7f/0xa0
[   56.764756]  process_one_work+0x1c5/0x5c0
[   56.765075]  ? process_one_work+0x13f/0x5c0
[   56.765366]  worker_thread+0x4d/0x3c0
[   56.765590]  kthread+0x114/0x150
[   56.765787]  ? process_one_work+0x5c0/0x5c0
[   56.766094]  ? kthread_create_on_node+0x40/0x40
[   56.766374]  ret_from_fork+0x2a/0x40

  reply	other threads:[~2017-06-03  6:34 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-31 17:21 tty lockdep trace Dave Jones
2017-06-03  6:33 ` Mike Galbraith [this message]
2017-06-04  8:32   ` [bisected] " Greg Kroah-Hartman
2017-06-04  9:02     ` Mike Galbraith
2017-06-04 10:00       ` Vegard Nossum
2017-06-04 11:34         ` Mike Galbraith
2017-06-05 15:09           ` Alan Cox
2017-06-05 15:41             ` Mike Galbraith
2017-06-05  4:52         ` [patch] tty: fix port buffer locking V2 Mike Galbraith
2017-06-05  6:48           ` Greg Kroah-Hartman
2017-06-05  7:16             ` Mike Galbraith
2017-06-07  2:07           ` [lkp-robot] [tty] 52b03e644f: calltrace:flush_to_ldisc kernel test robot
2017-06-07  2:07             ` kernel test robot
2017-06-07  2:25             ` Mike Galbraith

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1496471632.11868.1.camel@gmx.de \
    --to=efault@gmx.de \
    --cc=davej@codemonkey.org.uk \
    --cc=gregkh@linuxfoundation.org \
    --cc=jslaby@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=vegard.nossum@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.