Message-ID: <1535560580.23560.65.camel@arista.com>
Subject: Re: [PATCH 2/4] tty: Hold tty_ldisc_lock() during tty_reopen()
From: Dmitry Safonov
To: Jiri Slaby, linux-kernel@vger.kernel.org
Cc: Daniel Axtens, Dmitry Safonov <0x7f454c46@gmail.com>,
    Sergey Senozhatsky, Dmitry Vyukov, Tan Xiaojun, Peter Hurley,
    Pasi Kärkkäinen, Greg Kroah-Hartman, Michael Neuling,
    Mikulas Patocka, stable@vger.kernel.org
Date: Wed, 29 Aug 2018 17:36:20 +0100
In-Reply-To: <914d8184-d5e6-519c-b355-7f1360cfa6a0@suse.cz>
References: <20180829022353.23568-1-dima@arista.com>
    <20180829022353.23568-3-dima@arista.com>
    <914d8184-d5e6-519c-b355-7f1360cfa6a0@suse.cz>

On Wed, 2018-08-29 at 16:40 +0200, Jiri Slaby wrote:
> On 08/29/2018, 04:23 AM, Dmitry Safonov wrote:
> > tty_ldisc_reinit() doesn't race with neither tty_ldisc_hangup()
> > nor set_ldisc() nor tty_ldisc_release() as they use tty lock.
> > But it races with anyone who expects line discipline to be the same
> > after holding read semaphore in tty_ldisc_ref().
> >
> > We've seen the following crash on v4.9.108 stable:
> >
> > BUG: unable to handle kernel paging request at 0000000000002260
> > IP: [..] n_tty_receive_buf_common+0x5f/0x86d
> > Workqueue: events_unbound flush_to_ldisc
> > Call Trace:
> >  [..] n_tty_receive_buf2
> >  [..] tty_ldisc_receive_buf
> >  [..] flush_to_ldisc
> >  [..] process_one_work
> >  [..] worker_thread
> >  [..] kthread
> >  [..] ret_from_fork
> >
> > I think, tty_ldisc_reinit() should be called with ldisc_sem held for
> > writing, which will protect any reader against line discipline
> > changes.
> >
> > Note: I failed to reproduce the described crash, so obviously can't
> > guarantee that this is the place where line discipline was
> > switched.
> >
> > Cc: Greg Kroah-Hartman
> > Cc: Jiri Slaby
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Dmitry Safonov
> > ---
> >  drivers/tty/tty_io.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> > index 5e5da9acaf0a..3ef8b977b167 100644
> > --- a/drivers/tty/tty_io.c
> > +++ b/drivers/tty/tty_io.c
> > @@ -1267,15 +1267,20 @@ static int tty_reopen(struct tty_struct *tty)
> >  	if (test_bit(TTY_EXCLUSIVE, &tty->flags) && !capable(CAP_SYS_ADMIN))
> >  		return -EBUSY;
> >
> > -	tty->count++;
> > +	retval = tty_ldisc_lock(tty, 5 * HZ);
>
> Why 5 secs? This would cause random errors on machines under heavy
> load.

Yeah, I think MAX_SCHEDULE_TIMEOUT makes more sense here. I'm not sure
why I went with 5 * HZ instead. I will resend with the new timeout if
everything else looks good to you (bearing in mind my argument for
count++ in 1/4).

>
> > +	if (retval)
> > +		return retval;
> >
> > +	tty->count++;
> >  	if (tty->ldisc)
> > -		return 0;
> > +		goto out_unlock;
> >
> >  	retval = tty_ldisc_reinit(tty, tty->termios.c_line);
> >  	if (retval)
> >  		tty->count--;
> >
> > +out_unlock:
> > +	tty_ldisc_unlock(tty);
> >  	return retval;
>
> So what about:
> 	tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT);
> 	if (!tty->ldisc)
> 		ret = tty_ldisc_reinit(tty, tty->termios.c_line);
> 	tty_ldisc_unlock(tty);
>
> 	if (!ret)
> 		tty->count++;
>
> 	return ret;
>

--
Thanks,
             Dmitry