From: lkml@pengaru.com
To: lkml@pengaru.com
Cc: linux-kernel <linux-kernel@vger.kernel.org>, robh@kernel.org
Subject: Re: [BUG] 4.11.0-rc3 xterm hung in D state on exit, wchan is tty_release_struct
Date: Thu, 23 Mar 2017 00:30:18 -0700
Message-ID: <20170323073018.GI802@shells.gnugeneration.com>
In-Reply-To: <20170323064418.GH802@shells.gnugeneration.com>
On Wed, Mar 22, 2017 at 11:44:18PM -0700, lkml@pengaru.com wrote:
> On Wed, Mar 22, 2017 at 07:08:46PM -0700, lkml@pengaru.com wrote:
> > Hello list,
> >
> > After approximately one day of running 4.11.0-rc3 with 7e54d9d reverted to
> > enable regular use, this happened upon destroying an xterm:
> >
> > [80817.525112] BUG: unable to handle kernel paging request at 0000000000002260
> > [80817.525239] IP: n_tty_receive_buf_common+0x68/0xab0
> > [80817.525312] PGD 0
> >
> > [80817.525387] Oops: 0000 [#1] PREEMPT SMP
> > [80817.525452] CPU: 0 PID: 9532 Comm: kworker/u4:3 Not tainted 4.11.0-rc3-00001-gc56a355 #53
> > [80817.525564] Hardware name: LENOVO 7668CTO/7668CTO, BIOS 7NETC2WW (2.22 ) 03/22/2011
> > [80817.525673] Workqueue: events_unbound flush_to_ldisc
> > [80817.525752] task: ffff967d91d80000 task.stack: ffff9add81f40000
> > [80817.525839] RIP: 0010:n_tty_receive_buf_common+0x68/0xab0
> > [80817.525917] RSP: 0018:ffff9add81f43d38 EFLAGS: 00010297
> > [80817.525992] RAX: 0000000000000000 RBX: ffff967d91c98c00 RCX: 0000000000000001
> > [80817.526035] RDX: ffff967e73bba58d RSI: ffff967e73bba48d RDI: ffff967d91c98cc0
> > [80817.526035] RBP: ffff9add81f43dd0 R08: 0000000000000001 R09: 0000000000000000
> > [80817.526035] R10: 00004980cbe001e0 R11: 0000000000000000 R12: ffff967d87aacf20
> > [80817.526035] R13: ffff967e73bba58d R14: 0000000000000001 R15: ffff967e74aa8008
> > [80817.526035] FS: 0000000000000000(0000) GS:ffff967e7bc00000(0000) knlGS:0000000000000000
> > [80817.526035] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [80817.526035] CR2: 0000000000002260 CR3: 0000000099009000 CR4: 00000000000006f0
> > [80817.526035] Call Trace:
> > [80817.526035] ? update_curr+0xbb/0x1a0
> > [80817.526035] n_tty_receive_buf2+0xf/0x20
> > [80817.526035] tty_ldisc_receive_buf+0x1d/0x50
> > [80817.526035] tty_port_default_receive_buf+0x40/0x60
> > [80817.526035] flush_to_ldisc+0x94/0xa0
> > [80817.526035] process_one_work+0x13b/0x3e0
> > [80817.526035] worker_thread+0x64/0x4a0
> > [80817.526035] kthread+0x10f/0x150
> > [80817.526035] ? process_one_work+0x3e0/0x3e0
> > [80817.526035] ? __kthread_create_on_node+0x150/0x150
> > [80817.526035] ret_from_fork+0x29/0x40
> > [80817.526035] Code: 85 70 ff ff ff e8 59 75 57 00 48 8d 83 00 02 00 00 c7 45 c8 00 00 00 00 48 89 45 98 48 8d 83 28 02 00 00 48 89 45 90 48 8b 45 b8 <48> 8b b0 60 22 00 00 48 8b 08 89 f0 29 c8 f6 83 10 01 00 00 08
> > [80817.526035] RIP: n_tty_receive_buf_common+0x68/0xab0 RSP: ffff9add81f43d38
> > [80817.526035] CR2: 0000000000002260
> > [80817.526035] ---[ end trace 640aec4765d350f2 ]---
> >
> >
> > That xterm process is stuck, and I am unable to start any new xterms.
> > Switching to virtual consoles proves useless; presumably an important lock
> > is held.
> >
> <snip>
>
> At a casual glance at the v4.10..v4.11-rc3 changes affecting drivers/tty,
> commit c3485ee looks suspicious to me, these hunks in particular:
>
> @@ -465,16 +465,6 @@ static void flush_to_ldisc(struct work_struct *work)
>  {
>  	struct tty_port *port = container_of(work, struct tty_port, buf.work);
>  	struct tty_bufhead *buf = &port->buf;
> -	struct tty_struct *tty;
> -	struct tty_ldisc *disc;
> -
> -	tty = READ_ONCE(port->itty);
> -	if (tty == NULL)
> -		return;
> -
> -	disc = tty_ldisc_ref(tty);
> -	if (disc == NULL)
> -		return;
>
>  	mutex_lock(&buf->lock);
>
> @@ -504,7 +494,7 @@ static void flush_to_ldisc(struct work_struct *work)
>  			continue;
>  		}
>
> -		count = receive_buf(disc, head, count);
> +		count = receive_buf(port, head, count);
>  		if (!count)
>  			break;
>  		head->read += count;
> @@ -512,7 +502,6 @@ static void flush_to_ldisc(struct work_struct *work)
>
>  	mutex_unlock(&buf->lock);
>
> -	tty_ldisc_deref(disc);
>  }
>
>  /**
>
> <snip>
>
> I'm not familiar with this code at all, but port->buf is part of port, and if
> the port is destroyed as part of the tty, then perhaps port->buf (and
> port->buf.lock) may become invalid on us without these:
>
> -	tty = READ_ONCE(port->itty);
> -	if (tty == NULL)
> -		return;
> -
> -	disc = tty_ldisc_ref(tty);
> -	if (disc == NULL)
> -		return;
>
> Added Rob Herring, author of c3485ee to CC list.
>
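To make the lifetime concern quoted above concrete, here is a minimal
userspace sketch, not kernel code. It only shows that when the buffer head is
embedded in the port, the work item, the lock, and the port share one
allocation, so a worker that recovers the port from its work pointer is only
safe while the port is alive. All model_* names are hypothetical stand-ins,
and the macro is a simplified version of container_of().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct model_work { int queued; };      /* stands in for struct work_struct */

struct model_bufhead {
	struct model_work work;         /* like buf.work */
	int lock;                       /* stands in for the buf mutex */
};

struct model_port {
	int id;
	struct model_bufhead buf;       /* embedded in the port, not a pointer */
};

/* Recover the owning port from the work item, as flush_to_ldisc() does. */
static struct model_port *model_port_from_work(struct model_work *work)
{
	return model_container_of(work, struct model_port, buf.work);
}

/* Returns 1 if the lock's storage lies inside the port object itself. */
static int model_lock_is_inside_port(struct model_port *port)
{
	char *start = (char *)port;
	char *lock = (char *)&port->buf.lock;

	return lock >= start &&
	       lock + sizeof(port->buf.lock) <= start + sizeof(*port);
}

/* Small self-check tying the two helpers together. */
static void model_demo(void)
{
	struct model_port port = { .id = 7 };

	assert(model_port_from_work(&port.buf.work) == &port);
	assert(model_lock_is_inside_port(&port));
}
```

Calling model_demo() from a main() exercises both helpers. Whether the real
port can actually go away while the work is still queued depends on how the
driver manages the port's lifetime, which this sketch does not model.
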
I suspect this part was a mistake:

-	tty = READ_ONCE(port->itty);
-	if (tty == NULL)
-		return;

Note that in release_tty(), tty->port->itty is assigned NULL before calling
tty_buffer_cancel_work():

static void release_tty(struct tty_struct *tty, int idx)
{
	/* This should always be true but check for the moment */
	WARN_ON(tty->index != idx);
	WARN_ON(!mutex_is_locked(&tty_mutex));
	if (tty->ops->shutdown)
		tty->ops->shutdown(tty);
	tty_free_termios(tty);
	tty_driver_remove_tty(tty->driver, tty);
	tty->port->itty = NULL;
	if (tty->link)
		tty->link->port->itty = NULL;
	tty_buffer_cancel_work(tty->port);
	tty_kref_put(tty->link);
	tty_kref_put(tty);
}

I'm also unfamiliar with the kernel work queues, but this looks like an
intentional barrier of sorts, with the READ_ONCE atomic read of port->itty.
Maybe just an oversight while shuffling the ldisc stuff around?
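
The ordering described above can be modelled in userspace roughly as follows.
This is only a sketch under stated assumptions, not the kernel implementation:
the model_* names are made up, a C11 atomic load stands in for READ_ONCE(),
and the work queue and tty_buffer_cancel_work() are not modelled at all. It
shows just the one property in question: a flush that samples itty after the
release path has cleared it returns early and never touches the tty.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the fields needed here are modelled. */
struct model_tty {
	int received;                       /* data delivered to the "ldisc" */
};

struct model_port {
	_Atomic(struct model_tty *) itty;   /* models port->itty */
	int pending;                        /* models queued buffer data */
};

/* Models the removed guard: sample itty once and bail out if it is NULL. */
static int model_flush_guarded(struct model_port *port)
{
	struct model_tty *tty = atomic_load(&port->itty);  /* ~ READ_ONCE() */

	if (tty == NULL)
		return 0;                   /* tty is being released */

	tty->received += port->pending;
	port->pending = 0;
	return 1;
}

/* Models the first step of the release path: clear itty before anything
 * else is torn down (the kernel then cancels the pending work). */
static void model_release(struct model_port *port)
{
	atomic_store(&port->itty, (struct model_tty *)NULL);
}

/* Runs flush, release, flush and returns how much data was delivered. */
static int model_demo(void)
{
	struct model_tty tty = { 0 };
	struct model_port port;

	atomic_init(&port.itty, &tty);
	port.pending = 3;
	assert(model_flush_guarded(&port) == 1);   /* delivered normally */

	port.pending = 2;
	model_release(&port);
	assert(model_flush_guarded(&port) == 0);   /* skipped after release */

	return tty.received;
}
```

In this model the second flush leaves the two pending units undelivered
instead of dereferencing a tty that is on its way out, which is the behaviour
the removed check appears to have provided.
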

Regards,
Vito Caputo