public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Dmitry Safonov <dima@arista.com>
To: Jiri Slaby <jslaby@suse.cz>, kernel test robot <rong.a.chen@intel.com>
Cc: lkp@01.org, "Daniel Axtens" <dja@axtens.net>,
	"Dmitry Safonov" <0x7f454c46@gmail.com>,
	"Sergey Senozhatsky" <sergey.senozhatsky.work@gmail.com>,
	"Dmitry Vyukov" <dvyukov@google.com>,
	"Tan Xiaojun" <tanxiaojun@huawei.com>,
	"Peter Hurley" <peter@hurleysoftware.com>,
	"Pasi Kärkkäinen" <pasik@iki.fi>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Michael Neuling" <mikey@neuling.org>,
	"Mikulas Patocka" <mpatocka@redhat.com>,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [LKP] [tty] 0b4f83d510: INFO:task_blocked_for_more_than#seconds
Date: Fri, 07 Sep 2018 12:12:08 +0100	[thread overview]
Message-ID: <1536318728.23560.98.camel@arista.com> (raw)
In-Reply-To: <d5d23bc5-0e11-5cee-6d66-514dfc4fc25b@suse.cz>

On Fri, 2018-09-07 at 08:39 +0200, Jiri Slaby wrote:
> On 09/07/2018, 06:50 AM, kernel test robot wrote:
> > FYI, we noticed the following commit (built with gcc-7):
> > 
> > commit: 0b4f83d510f6fef6bb9da25f122c8d733d50516f ("[PATCH 2/4] tty:
> > Hold tty_ldisc_lock() during tty_reopen()")
> > url: https://github.com/0day-ci/linux/commits/Dmitry-Safonov/tty-Ho
> > ld-write-ldisc-sem-in-tty_reopen/20180829-165618
> > base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/tty.git
> > tty-testing
> > 
> > in testcase: trinity
> > with following parameters:
> > 
> > 	runtime: 300s
> > 
> > test-description: Trinity is a linux system call fuzz tester.
> > test-url: http://codemonkey.org.uk/projects/trinity/
> > 
> > 
> > on test machine: qemu-system-x86_64 -enable-kvm -m 256M
> > 
> > caused below changes (please refer to attached dmesg/kmsg for
> > entire log/backtrace):
> > 
> > 
> > +--------------------------------------------------+------------+
> > ------------+
> > >                                                  | 58dd163974 |
> > > 0b4f83d510 |
> > 
> > +--------------------------------------------------+------------+
> > ------------+
> > > boot_successes                                   | 14         |
> > > 4          |
> > > boot_failures                                    | 0          |
> > > 6          |
> > > INFO:task_blocked_for_more_than#seconds          | 0          |
> > > 6          |
> > > Kernel_panic-not_syncing:hung_task:blocked_tasks | 0          |
> > > 6          |
> > 
> > +--------------------------------------------------+------------+
> > ------------+
> > 
> > 
> > 
> > [  244.816801] INFO: task validate_data:655 blocked for more than
> > 120 seconds.
> > [  244.818833]       Not tainted 4.18.0-11684-g0b4f83d #1
> > [  244.820028] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  244.826965] validate_data   D    0   655    623 0x20020000
> > [  244.828279] Call Trace:
> > [  244.828958]  ? __schedule+0x843/0x950
> > [  244.830173]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [  244.834903]  schedule+0x31/0x70
> > [  244.835665]  schedule_timeout+0x34/0x760
> > [  244.836613]  ? ftrace_likely_update+0x35/0x60
> > [  244.837683]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [  244.838818]  ? ftrace_likely_update+0x35/0x60
> > [  244.840127]  ? ftrace_likely_update+0x35/0x60
> > [  244.845947]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [  244.847882]  __ldsem_down_read_nested+0x23a/0x3b0
> > [  244.849886]  ? tty_ldisc_ref_wait+0x25/0x50
> > [  244.853807]  tty_ldisc_ref_wait+0x25/0x50
> > [  244.854946]  tty_compat_ioctl+0x8a/0x120
> > [  244.855928]  ? this_tty+0x80/0x80
> > [  244.856742]  __ia32_compat_sys_ioctl+0xc28/0x1ce0
> > [  244.857981]  do_int80_syscall_32+0x1d2/0x5f0
> > [  244.859003]  entry_INT80_compat+0x88/0xa0
> > [  244.859972] INFO: task dnsmasq:668 blocked for more than 120
> > seconds.
> > [  244.868315]       Not tainted 4.18.0-11684-g0b4f83d #1
> > [  244.869583] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  244.871744] dnsmasq         D    0   668      1 0x20020000
> > [  244.873063] Call Trace:
> > [  244.873697]  ? __schedule+0x843/0x950
> > [  244.874572]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [  244.875725]  schedule+0x31/0x70
> > [  244.876576]  schedule_timeout+0x34/0x760
> > [  244.877573]  ? ftrace_likely_update+0x35/0x60
> > [  244.878660]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [  244.879872]  ? ftrace_likely_update+0x35/0x60
> > [  244.890522]  ? ftrace_likely_update+0x35/0x60
> > [  244.891572]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [  244.892746]  __ldsem_down_read_nested+0x23a/0x3b0
> > [  244.893861]  ? tty_ldisc_ref_wait+0x25/0x50
> > [  244.894841]  tty_ldisc_ref_wait+0x25/0x50
> > [  244.895911]  tty_compat_ioctl+0x8a/0x120
> > [  244.896916]  ? this_tty+0x80/0x80
> > [  244.897717]  __ia32_compat_sys_ioctl+0xc28/0x1ce0
> > [  244.898821]  do_int80_syscall_32+0x1d2/0x5f0
> > [  244.899830]  entry_INT80_compat+0x88/0xa0
> > [  244.909466] INFO: task dropbear:734 blocked for more than 120
> > seconds.
> > [  244.911173]       Not tainted 4.18.0-11684-g0b4f83d #1
> > [  244.912394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [  244.914176] dropbear        D    0   734      1 0x20020000
> > [  244.915446] Call Trace:
> > [  244.916068]  ? __schedule+0x843/0x950
> > [  244.916945]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [  244.918076]  schedule+0x31/0x70
> > [  244.918832]  schedule_timeout+0x34/0x760
> > [  244.919781]  ? ftrace_likely_update+0x35/0x60
> > [  244.921104]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [  244.922304]  ? ftrace_likely_update+0x35/0x60
> > [  244.923347]  ? ftrace_likely_update+0x35/0x60
> > [  244.924369]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [  244.925496]  __ldsem_down_read_nested+0x23a/0x3b0
> > [  244.926598]  ? tty_ldisc_ref_wait+0x25/0x50
> > [  244.927578]  tty_ldisc_ref_wait+0x25/0x50
> > [  244.928526]  tty_compat_ioctl+0x8a/0x120
> > [  244.929449]  ? this_tty+0x80/0x80
> > [  244.930240]  __ia32_compat_sys_ioctl+0xc28/0x1ce0
> > [  244.940083]  do_int80_syscall_32+0x1d2/0x5f0
> > [  244.941310]  entry_INT80_compat+0x88/0xa0
> > [  244.944070] 
> > [  244.944070] Showing all locks held in the system:
> > [  244.945558] 1 lock held by khungtaskd/18:
> > [  244.946495]  #0: (____ptrval____) (rcu_read_lock){....}, at:
> > debug_show_all_locks+0x16/0x190
> > [  244.948503] 2 locks held by askfirst/235:
> > [  244.949439]  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at:
> > tty_ldisc_ref_wait+0x25/0x50
> > [  244.951762]  #1: (____ptrval____) (&ldata-
> > >atomic_read_lock){+.+.}, at: n_tty_read+0x13d/0xa00
> 
> Here, it just seems to wait for input from the user.
> 
> > [  244.953799] 1 lock held by validate_data/655:
> > [  244.954814]  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at:
> > tty_ldisc_ref_wait+0x25/0x50
> > [  244.956764] 1 lock held by dnsmasq/668:
> > [  244.957649]  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at:
> > tty_ldisc_ref_wait+0x25/0x50
> > [  244.959598] 1 lock held by dropbear/734:
> > [  244.967564]  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at:
> > tty_ldisc_ref_wait+0x25/0x50
> 
> Hmm, I assume there is another task waiting for write_ldsem and that
> one
> prevents these readers to proceed. Unfortunately, due to the defunct
> __ptrval__ pointer hashing crap, we do not see who is waiting for
> what.
> But I am guessing all are the same locks.

Hmm, and the writer is not able to grab it because there is already
reader who uses console, I suppose.
I wonder if we can add re-acquiring for the reader side..
That will make write acquire less likely to fail under the load.
I'll try to reproduce it.

> 
> So I think, we are forced to limit the waiting to 5 seconds in reopen
> in
> the end too (the same as we do for new open in tty_init_dev).
> 
> Dmitry, could you add the limit and handle the return value of
> tty_ldisc_lock now?

Yeah, will do.

-- 
Thanks,
             Dmitry

  reply	other threads:[~2018-09-07 11:12 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-29  2:23 [PATCH 0/4] tty: Hold write ldisc sem in tty_reopen() Dmitry Safonov
2018-08-29  2:23 ` [PATCH 1/4] tty: Drop tty->count on tty_reopen() failure Dmitry Safonov
2018-08-29 14:38   ` Jiri Slaby
2018-08-29 16:13     ` Dmitry Safonov
2018-08-31  6:47       ` Jiri Slaby
2018-08-31 11:54         ` Dmitry Safonov
2018-08-29  2:23 ` [PATCH 2/4] tty: Hold tty_ldisc_lock() during tty_reopen() Dmitry Safonov
2018-08-29  4:34   ` Sergey Senozhatsky
2018-08-29 14:30     ` Dmitry Safonov
2018-08-30  5:16     ` Benjamin Herrenschmidt
2018-08-29 14:40   ` Jiri Slaby
2018-08-29 14:45     ` Jiri Slaby
2018-08-29 16:36     ` Dmitry Safonov
2018-08-29 15:19   ` Tetsuo Handa
2018-08-31  6:51     ` Jiri Slaby
2018-08-31 11:17       ` Tetsuo Handa
2018-08-31 11:21         ` Jiri Slaby
2018-08-31 12:12           ` Dmitry Safonov
2018-09-07  4:50   ` [LKP] [tty] 0b4f83d510: INFO:task_blocked_for_more_than#seconds kernel test robot
2018-09-07  6:39     ` Jiri Slaby
2018-09-07 11:12       ` Dmitry Safonov [this message]
2018-09-10  5:14       ` Sergey Senozhatsky
2018-09-10 18:50         ` Dmitry Safonov
2018-08-29  2:23 ` [PATCH 3/4] tty: Lock tty pair in tty_init_dev() Dmitry Safonov
2018-08-29 14:46   ` Jiri Slaby
2018-08-29 16:28     ` Dmitry Safonov
2018-08-31  6:54       ` Jiri Slaby
2018-08-31 12:22         ` Dmitry Safonov
2018-08-29  2:23 ` [PATCH 4/4] tty/lockdep: Add ldisc_sem asserts Dmitry Safonov
2018-08-30  7:03 ` [PATCH 0/4] tty: Hold write ldisc sem in tty_reopen() Pasi Kärkkäinen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1536318728.23560.98.camel@arista.com \
    --to=dima@arista.com \
    --cc=0x7f454c46@gmail.com \
    --cc=dja@axtens.net \
    --cc=dvyukov@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jslaby@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@01.org \
    --cc=mikey@neuling.org \
    --cc=mpatocka@redhat.com \
    --cc=pasik@iki.fi \
    --cc=peter@hurleysoftware.com \
    --cc=rong.a.chen@intel.com \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=stable@vger.kernel.org \
    --cc=tanxiaojun@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox