From: Alan Post <adp@prgmr.com>
To: linux-nfs <linux-nfs@vger.kernel.org>
Subject: User process NFS write hang in wait_on_commit with kworker
Date: Mon, 17 Jun 2019 18:06:13 -0600
Message-ID: <20190618000613.GR4158@turtle.email>
On May 20th I reported "User process NFS write hang followed
by automount hang requiring reboot" to this list. There I
had a process that would hang on NFS write, followed by sync
hanging, eventually forcing me to reboot the host.
On June 4th, after upgrading to Linux 4.19.44, I reported
the issue resolved. Since then, as I've rolled out
Linux 4.19.44, the issue has come back--sort of.
I have begun once again getting sync hangs following a
hung NFS write. The hung write has a different stack trace
than any I previously reported:
[<0>] wait_on_commit+0x60/0x90 [nfs]
[<0>] __nfs_commit_inode+0x146/0x1a0 [nfs]
[<0>] nfs_file_fsync+0xa7/0x1d0 [nfs]
[<0>] filp_close+0x25/0x70
[<0>] put_files_struct+0x66/0xb0
[<0>] do_exit+0x2af/0xbb0
[<0>] do_group_exit+0x35/0xa0
[<0>] __x64_sys_exit_group+0xf/0x10
[<0>] do_syscall_64+0x45/0x100
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] 0xffffffffffffffff
And there is an attendant kworker thread:
[<0>] wait_on_commit+0x60/0x90 [nfs]
[<0>] __nfs_commit_inode+0x146/0x1a0 [nfs]
[<0>] nfs_write_inode+0x5c/0x90 [nfs]
[<0>] nfs4_write_inode+0xd/0x30 [nfsv4]
[<0>] __writeback_single_inode+0x27a/0x320
[<0>] writeback_sb_inodes+0x19a/0x460
[<0>] wb_writeback+0x102/0x2f0
[<0>] wb_workfn+0xa3/0x400
[<0>] process_one_work+0x1e3/0x3d0
[<0>] worker_thread+0x28/0x3c0
[<0>] kthread+0x10e/0x130
[<0>] ret_from_fork+0x35/0x40
[<0>] 0xffffffffffffffff
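For anyone who wants to check other tasks for the same signature, here is a minimal sketch of a parser for traces in the format above. It assumes the text came from something like /proc/<pid>/stack (which needs root to read); the names parse_stack and is_commit_wait are hypothetical helpers for illustration, not part of any existing tool.

```python
import re

# One frame as printed in the traces above, e.g.
#   [<0>] wait_on_commit+0x60/0x90 [nfs]
# The trailing "[module]" is absent for core kernel symbols.
FRAME_RE = re.compile(
    r"^\[<[0-9a-f]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?$"
)


def parse_stack(text):
    """Return [(symbol, module_or_None), ...], innermost frame first.

    Lines that are not symbol+offset frames (such as the final
    0xffffffffffffffff marker) are skipped.
    """
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((match.group(1), match.group(2)))
    return frames


def is_commit_wait(frames):
    """True when the innermost frame is wait_on_commit in the nfs module."""
    return bool(frames) and frames[0] == ("wait_on_commit", "nfs")
```

Both traces above would satisfy is_commit_wait; the frames further down the list (nfs_file_fsync versus nfs_write_inode) distinguish the user process from the kworker.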
Oddly enough, I can clear the problem without rebooting the host.
I use iptables to block all traffic between the NFS server and the
NFS client for long enough that any open TCP connections time out.
After that, the connection apparently reestablishes and the hung
process unblocks.
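A minimal sketch of that workaround, run on the client, for illustration only. The exact rules are an assumption (plain DROP rules on INPUT and OUTPUT matching the server address), and the server address and hold time are placeholders to fill in; the function names are hypothetical.

```python
import subprocess
import time


def block_rules(server_ip):
    """iptables commands that drop all traffic to and from server_ip."""
    return [
        ["iptables", "-I", "INPUT", "-s", server_ip, "-j", "DROP"],
        ["iptables", "-I", "OUTPUT", "-d", server_ip, "-j", "DROP"],
    ]


def unblock_rules(server_ip):
    """The same rules with -D (delete) in place of -I (insert)."""
    return [
        ["-D" if arg == "-I" else arg for arg in cmd]
        for cmd in block_rules(server_ip)
    ]


def run_block_cycle(server_ip, hold_seconds, dry_run=True):
    """Block, hold, unblock. Returns the commands in the order issued.

    With dry_run=True the commands are only printed and no sleep occurs.
    With dry_run=False this needs root and really cuts off the server.
    """
    issued = []

    def issue(cmd):
        issued.append(cmd)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)

    for cmd in block_rules(server_ip):
        issue(cmd)
    try:
        if not dry_run:
            # Hold long enough for open TCP connections to time out.
            time.sleep(hold_seconds)
    finally:
        # Remove the rules even if the wait is interrupted.
        for cmd in unblock_rules(server_ip):
            issue(cmd)
    return issued
```

Calling run_block_cycle("192.0.2.10", 0) with the default dry_run=True just prints the four commands, which is a safe way to review them before running anything as root.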
I can't explain what's keeping the connection alive but apparently
stalled--requiring my manual intervention. Do any of you have
ideas or speculation? I'm happy to poke around in a packet capture
if the information provided isn't sufficient.
-A
--
Alan Post | Xen VPS hosting for the technically adept
PO Box 61688 | Sunnyvale, CA 94088-1681 | https://prgmr.com/
email: adp@prgmr.com