From: Dmitry Vyukov
Date: Wed, 18 Jul 2018 16:11:38 +0200
Subject: Re: INFO: task hung in grab_super
To: Tetsuo Handa
Cc: Eric Van Hensbergen, Ron Minnich, Latchesar Ionkov, v9fs-developer@lists.sourceforge.net, syzbot, linux-fsdevel, LKML, syzkaller-bugs, Al Viro
References: <0000000000002f5541057143a85e@google.com> <0adc592b-d4a3-f6da-3c5c-22490f641eb9@i-love.sakura.ne.jp>

On Wed, Jul 18, 2018 at 3:35 PM, Tetsuo Handa wrote:
>>>> This seems to be related to 9p. After rerunning the log I got:
>>>>
>>>> root@syzkaller:~# ps afxu | grep syz
>>>> root 18253 0.0 0.0 0 0 ttyS0 Zl 10:16 0:00 \_ [syz-executor]
>>>> root@syzkaller:~# cat /proc/18253/task/*/stack
>>>> [<0>] p9_client_rpc+0x3a2/0x1400
>>>> [<0>] p9_client_flush+0x134/0x2a0
>>>> [<0>] p9_client_rpc+0x122c/0x1400
>>>> [<0>] p9_client_create+0xc56/0x16af
>>>> [<0>] v9fs_session_init+0x21a/0x1a80
>>>> [<0>] v9fs_mount+0x7c/0x900
>>>> [<0>] mount_fs+0xae/0x328
>>>> [<0>] vfs_kern_mount.part.34+0xdc/0x4e0
>>>> [<0>] do_mount+0x581/0x30e0
>>>> [<0>] ksys_mount+0x12d/0x140
>>>> [<0>] __x64_sys_mount+0xbe/0x150
>>>> [<0>] do_syscall_64+0x1b9/0x820
>>>> [<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>> [<0>] 0xffffffffffffffff
>>>>
>>>> There is a bunch of hangs in 9p, so let's do:
>>>>
>>>> #syz dup: INFO: task hung in flush_work
>>>>
>>> Then, is dumping all threads when khungtaskd fires a candidate
>>> for CONFIG_DEBUG_AID_FOR_SYZBOT=y path?
>>
>> Perhaps would be useful. But maybe only tasks that are blocked for
>> more than timeout/2? and/or unkillable tasks? killable tasks are not a
>> problem.
>
> TASK_KILLABLE waiters are not reported by khungtaskd, are they?
>
>	/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
>	if (t->state == TASK_UNINTERRUPTIBLE)
>		check_hung_task(t, timeout);
>
> And TASK_KILLABLE waiters can become a problem because
>
>>
>> Btw, I see that p9_client_rpc uses wait_event_killable, why wasn't it
>> killed along with the whole process?
>>
>
> wait_event_killable() would return -ERESTARTSYS if got SIGKILL.
> But if (c->status == Connected) && (type == P9_TFLUSH) is also true,
> it ignores SIGKILL by retrying the loop...
>
> again:
>	err = wait_event_killable(*req->wq, req->status >= REQ_STATUS_RCVD);
>	if ((err == -ERESTARTSYS) && (c->status == Connected) && (type == P9_TFLUSH)) {
>		sigpending = 1;
>		clear_thread_flag(TIF_SIGPENDING);
>		goto again;
>	}
>
> I wish they don't ignore SIGKILL (by e.g. offloading operations to a kernel thread).

I guess that's the problem, right? A SIGKILL-ed task must not ignore
SIGKILL and hang in an infinite loop. This would explain a bunch of
hangs in 9p.
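
To make the difference concrete, here is a rough userspace model of the two policies, not kernel code and not a proposed patch. Everything in it is hypothetical and exists only for illustration: struct p9_model, model_wait_killable(), MODEL_WOULD_BLOCK, rpc_wait_retry() and rpc_wait_bail(). The only thing taken from the kernel is the value of ERESTARTSYS. It assumes the server never answers the flush, which is the situation in the trace above.

```c
#include <stdbool.h>

#define ERESTARTSYS       512  /* same value the kernel uses internally */
#define MODEL_WOULD_BLOCK 1    /* hypothetical: "task sleeps forever here" */

/* Hypothetical stand-in for the state the quoted loop looks at. */
struct p9_model {
	bool sigkill_pending;  /* models TIF_SIGPENDING with a fatal signal */
	bool connected;        /* models c->status == Connected */
	bool reply_arrived;    /* models req->status >= REQ_STATUS_RCVD */
};

/* Model of wait_event_killable(): 0 on reply, -ERESTARTSYS on SIGKILL. */
static int model_wait_killable(const struct p9_model *m)
{
	if (m->reply_arrived)
		return 0;
	if (m->sigkill_pending)
		return -ERESTARTSYS;
	return MODEL_WOULD_BLOCK;  /* real code would sleep with no wakeup */
}

/* Policy from the quoted loop: swallow the signal and wait again. */
static int rpc_wait_retry(struct p9_model *m, bool is_flush)
{
	int err;
again:
	err = model_wait_killable(m);
	if (err == -ERESTARTSYS && m->connected && is_flush) {
		m->sigkill_pending = false;  /* models clear_thread_flag() */
		goto again;
	}
	return err;
}

/* Alternative policy: let a fatal signal end the wait. */
static int rpc_wait_bail(struct p9_model *m)
{
	return model_wait_killable(m);
}
```

In this model the retry policy ends in MODEL_WOULD_BLOCK once the pending flag has been cleared, i.e. the task is asleep and SIGKILL can no longer wake it, while the bail-out policy hands -ERESTARTSYS back to the caller. What the real caller would then have to do with the outstanding request is outside the scope of this sketch.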