From: Dmitry Safonov <dima@arista.com>
To: Jiri Slaby <jslaby@suse.cz>, kernel test robot <rong.a.chen@intel.com>
Cc: lkp@01.org, "Daniel Axtens" <dja@axtens.net>,
"Dmitry Safonov" <0x7f454c46@gmail.com>,
"Sergey Senozhatsky" <sergey.senozhatsky.work@gmail.com>,
"Dmitry Vyukov" <dvyukov@google.com>,
"Tan Xiaojun" <tanxiaojun@huawei.com>,
"Peter Hurley" <peter@hurleysoftware.com>,
"Pasi Kärkkäinen" <pasik@iki.fi>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Michael Neuling" <mikey@neuling.org>,
"Mikulas Patocka" <mpatocka@redhat.com>,
linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [LKP] [tty] 0b4f83d510: INFO:task_blocked_for_more_than#seconds
Date: Fri, 07 Sep 2018 12:12:08 +0100 [thread overview]
Message-ID: <1536318728.23560.98.camel@arista.com> (raw)
In-Reply-To: <d5d23bc5-0e11-5cee-6d66-514dfc4fc25b@suse.cz>
On Fri, 2018-09-07 at 08:39 +0200, Jiri Slaby wrote:
> On 09/07/2018, 06:50 AM, kernel test robot wrote:
> > FYI, we noticed the following commit (built with gcc-7):
> >
> > commit: 0b4f83d510f6fef6bb9da25f122c8d733d50516f ("[PATCH 2/4] tty:
> > Hold tty_ldisc_lock() during tty_reopen()")
> > url: https://github.com/0day-ci/linux/commits/Dmitry-Safonov/tty-Ho
> > ld-write-ldisc-sem-in-tty_reopen/20180829-165618
> > base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/tty.git
> > tty-testing
> >
> > in testcase: trinity
> > with following parameters:
> >
> > runtime: 300s
> >
> > test-description: Trinity is a linux system call fuzz tester.
> > test-url: http://codemonkey.org.uk/projects/trinity/
> >
> >
> > on test machine: qemu-system-x86_64 -enable-kvm -m 256M
> >
> > caused below changes (please refer to attached dmesg/kmsg for
> > entire log/backtrace):
> >
> >
> > +--------------------------------------------------+------------+
> > ------------+
> > > | 58dd163974 |
> > > 0b4f83d510 |
> >
> > +--------------------------------------------------+------------+
> > ------------+
> > > boot_successes | 14 |
> > > 4 |
> > > boot_failures | 0 |
> > > 6 |
> > > INFO:task_blocked_for_more_than#seconds | 0 |
> > > 6 |
> > > Kernel_panic-not_syncing:hung_task:blocked_tasks | 0 |
> > > 6 |
> >
> > +--------------------------------------------------+------------+
> > ------------+
> >
> >
> >
> > [ 244.816801] INFO: task validate_data:655 blocked for more than
> > 120 seconds.
> > [ 244.818833] Not tainted 4.18.0-11684-g0b4f83d #1
> > [ 244.820028] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 244.826965] validate_data D 0 655 623 0x20020000
> > [ 244.828279] Call Trace:
> > [ 244.828958] ? __schedule+0x843/0x950
> > [ 244.830173] ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [ 244.834903] schedule+0x31/0x70
> > [ 244.835665] schedule_timeout+0x34/0x760
> > [ 244.836613] ? ftrace_likely_update+0x35/0x60
> > [ 244.837683] ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [ 244.838818] ? ftrace_likely_update+0x35/0x60
> > [ 244.840127] ? ftrace_likely_update+0x35/0x60
> > [ 244.845947] ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [ 244.847882] __ldsem_down_read_nested+0x23a/0x3b0
> > [ 244.849886] ? tty_ldisc_ref_wait+0x25/0x50
> > [ 244.853807] tty_ldisc_ref_wait+0x25/0x50
> > [ 244.854946] tty_compat_ioctl+0x8a/0x120
> > [ 244.855928] ? this_tty+0x80/0x80
> > [ 244.856742] __ia32_compat_sys_ioctl+0xc28/0x1ce0
> > [ 244.857981] do_int80_syscall_32+0x1d2/0x5f0
> > [ 244.859003] entry_INT80_compat+0x88/0xa0
> > [ 244.859972] INFO: task dnsmasq:668 blocked for more than 120
> > seconds.
> > [ 244.868315] Not tainted 4.18.0-11684-g0b4f83d #1
> > [ 244.869583] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 244.871744] dnsmasq D 0 668 1 0x20020000
> > [ 244.873063] Call Trace:
> > [ 244.873697] ? __schedule+0x843/0x950
> > [ 244.874572] ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [ 244.875725] schedule+0x31/0x70
> > [ 244.876576] schedule_timeout+0x34/0x760
> > [ 244.877573] ? ftrace_likely_update+0x35/0x60
> > [ 244.878660] ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [ 244.879872] ? ftrace_likely_update+0x35/0x60
> > [ 244.890522] ? ftrace_likely_update+0x35/0x60
> > [ 244.891572] ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [ 244.892746] __ldsem_down_read_nested+0x23a/0x3b0
> > [ 244.893861] ? tty_ldisc_ref_wait+0x25/0x50
> > [ 244.894841] tty_ldisc_ref_wait+0x25/0x50
> > [ 244.895911] tty_compat_ioctl+0x8a/0x120
> > [ 244.896916] ? this_tty+0x80/0x80
> > [ 244.897717] __ia32_compat_sys_ioctl+0xc28/0x1ce0
> > [ 244.898821] do_int80_syscall_32+0x1d2/0x5f0
> > [ 244.899830] entry_INT80_compat+0x88/0xa0
> > [ 244.909466] INFO: task dropbear:734 blocked for more than 120
> > seconds.
> > [ 244.911173] Not tainted 4.18.0-11684-g0b4f83d #1
> > [ 244.912394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 244.914176] dropbear D 0 734 1 0x20020000
> > [ 244.915446] Call Trace:
> > [ 244.916068] ? __schedule+0x843/0x950
> > [ 244.916945] ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [ 244.918076] schedule+0x31/0x70
> > [ 244.918832] schedule_timeout+0x34/0x760
> > [ 244.919781] ? ftrace_likely_update+0x35/0x60
> > [ 244.921104] ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [ 244.922304] ? ftrace_likely_update+0x35/0x60
> > [ 244.923347] ? ftrace_likely_update+0x35/0x60
> > [ 244.924369] ? __ldsem_down_read_nested+0x1c4/0x3b0
> > [ 244.925496] __ldsem_down_read_nested+0x23a/0x3b0
> > [ 244.926598] ? tty_ldisc_ref_wait+0x25/0x50
> > [ 244.927578] tty_ldisc_ref_wait+0x25/0x50
> > [ 244.928526] tty_compat_ioctl+0x8a/0x120
> > [ 244.929449] ? this_tty+0x80/0x80
> > [ 244.930240] __ia32_compat_sys_ioctl+0xc28/0x1ce0
> > [ 244.940083] do_int80_syscall_32+0x1d2/0x5f0
> > [ 244.941310] entry_INT80_compat+0x88/0xa0
> > [ 244.944070]
> > [ 244.944070] Showing all locks held in the system:
> > [ 244.945558] 1 lock held by khungtaskd/18:
> > [ 244.946495] #0: (____ptrval____) (rcu_read_lock){....}, at:
> > debug_show_all_locks+0x16/0x190
> > [ 244.948503] 2 locks held by askfirst/235:
> > [ 244.949439] #0: (____ptrval____) (&tty->ldisc_sem){++++}, at:
> > tty_ldisc_ref_wait+0x25/0x50
> > [ 244.951762] #1: (____ptrval____) (&ldata-
> > >atomic_read_lock){+.+.}, at: n_tty_read+0x13d/0xa00
>
> Here, it just seems to wait for input from the user.
>
> > [ 244.953799] 1 lock held by validate_data/655:
> > [ 244.954814] #0: (____ptrval____) (&tty->ldisc_sem){++++}, at:
> > tty_ldisc_ref_wait+0x25/0x50
> > [ 244.956764] 1 lock held by dnsmasq/668:
> > [ 244.957649] #0: (____ptrval____) (&tty->ldisc_sem){++++}, at:
> > tty_ldisc_ref_wait+0x25/0x50
> > [ 244.959598] 1 lock held by dropbear/734:
> > [ 244.967564] #0: (____ptrval____) (&tty->ldisc_sem){++++}, at:
> > tty_ldisc_ref_wait+0x25/0x50
>
> Hmm, I assume there is another task waiting for write_ldsem and that
> one
> prevents these readers to proceed. Unfortunately, due to the defunct
> __ptrval__ pointer hashing crap, we do not see who is waiting for
> what.
> But I am guessing all are the same locks.
Hmm, and the writer is not able to grab it because there is already
reader who uses console, I suppose.
I wonder if we can add re-acquiring for the reader side..
That will make write acquire less likely to fail under the load.
I'll try to reproduce it.
>
> So I think, we are forced to limit the waiting to 5 seconds in reopen
> in
> the end too (the same as we do for new open in tty_init_dev).
>
> Dmitry, could you add the limit and handle the return value of
> tty_ldisc_lock now?
Yeah, will do.
--
Thanks,
Dmitry
next prev parent reply other threads:[~2018-09-07 11:12 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-29 2:23 [PATCH 0/4] tty: Hold write ldisc sem in tty_reopen() Dmitry Safonov
2018-08-29 2:23 ` [PATCH 1/4] tty: Drop tty->count on tty_reopen() failure Dmitry Safonov
2018-08-29 14:38 ` Jiri Slaby
2018-08-29 16:13 ` Dmitry Safonov
2018-08-31 6:47 ` Jiri Slaby
2018-08-31 11:54 ` Dmitry Safonov
2018-08-29 2:23 ` [PATCH 2/4] tty: Hold tty_ldisc_lock() during tty_reopen() Dmitry Safonov
2018-08-29 4:34 ` Sergey Senozhatsky
2018-08-29 14:30 ` Dmitry Safonov
2018-08-30 5:16 ` Benjamin Herrenschmidt
2018-08-29 14:40 ` Jiri Slaby
2018-08-29 14:45 ` Jiri Slaby
2018-08-29 16:36 ` Dmitry Safonov
2018-08-29 15:19 ` Tetsuo Handa
2018-08-31 6:51 ` Jiri Slaby
2018-08-31 11:17 ` Tetsuo Handa
2018-08-31 11:21 ` Jiri Slaby
2018-08-31 12:12 ` Dmitry Safonov
2018-09-07 4:50 ` [LKP] [tty] 0b4f83d510: INFO:task_blocked_for_more_than#seconds kernel test robot
2018-09-07 6:39 ` Jiri Slaby
2018-09-07 11:12 ` Dmitry Safonov [this message]
2018-09-10 5:14 ` Sergey Senozhatsky
2018-09-10 18:50 ` Dmitry Safonov
2018-08-29 2:23 ` [PATCH 3/4] tty: Lock tty pair in tty_init_dev() Dmitry Safonov
2018-08-29 14:46 ` Jiri Slaby
2018-08-29 16:28 ` Dmitry Safonov
2018-08-31 6:54 ` Jiri Slaby
2018-08-31 12:22 ` Dmitry Safonov
2018-08-29 2:23 ` [PATCH 4/4] tty/lockdep: Add ldisc_sem asserts Dmitry Safonov
2018-08-30 7:03 ` [PATCH 0/4] tty: Hold write ldisc sem in tty_reopen() Pasi Kärkkäinen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1536318728.23560.98.camel@arista.com \
--to=dima@arista.com \
--cc=0x7f454c46@gmail.com \
--cc=dja@axtens.net \
--cc=dvyukov@google.com \
--cc=gregkh@linuxfoundation.org \
--cc=jslaby@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=lkp@01.org \
--cc=mikey@neuling.org \
--cc=mpatocka@redhat.com \
--cc=pasik@iki.fi \
--cc=peter@hurleysoftware.com \
--cc=rong.a.chen@intel.com \
--cc=sergey.senozhatsky.work@gmail.com \
--cc=stable@vger.kernel.org \
--cc=tanxiaojun@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox