From: Nick Alcock <nix@esperi.org.uk>
To: NFS list <linux-nfs@vger.kernel.org>
Subject: Re: steam-associated reproducible hard NFSv4.2 client hang (5.9, 5.10)
Date: Sat, 13 Feb 2021 15:21:35 +0000
Message-ID: <87pn14c7sw.fsf@esperi.org.uk>
(I can't get References: right on this mail because the original has aged
out of my mailbox. Archive URL:
https://www.spinics.net/lists/linux-nfs/msg81430.html).
I now have a little lockdep info from this hang (and reports from at
least two other people who have seen similar-looking hangs dating back to
4.19, though much harder to reproduce, taking many hours rather than
five minutes; one of them reports no longer using NFS in production
because of this).
Unfortunately the lockdep info isn't much use:
Feb 13 14:13:12 silk warning: : [ 888.834464] Showing all locks held in the system:
Feb 13 14:13:12 silk warning: : [ 888.834501] 1 lock held by dmesg/1152:
Feb 13 14:13:12 silk warning: : [ 888.834508] #0: ffff980c3b7200d0 (&user->lock){+.+.}-{3:3}, at: devkmsg_read+0x49/0x2d1
Feb 13 14:13:12 silk warning: : [ 888.834540] 2 locks held by tee/1322:
Feb 13 14:13:12 silk warning: : [ 888.834546] #0: ffff980c0809a430 (sb_writers#12){.+.+}-{0:0}, at: ksys_write+0x6a/0xdc
Feb 13 14:13:12 silk warning: : [ 888.834573] #1: ffff980c3ca7b5e8 (&sb->s_type->i_mutex_key#16){++++}-{3:3}, at: nfs_start_io_write+0x1a/0x45
Feb 13 14:13:12 silk warning: : [ 888.834632] 1 lock held by 192.168.16.8-ma/2302:
Feb 13 14:13:12 silk warning: : [ 888.834638] #0: ffff980c0fe6b700 (&acct->lock#2){+.+.}-{3:3}, at: acct_process+0x102/0x2bc
The first of those is my ongoing dmesg -w. The last is process
accounting. The middle one is an ongoing, always-active Xsession-errors
tee over the same NFSv4 connection, which says nothing more than that
writes to this NFS server from this client have hung, which we already
know. There are no signs of locks held by the Steam client which has
hung in the middle of installation.
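
For anyone wanting to triage a dump like the one above mechanically rather
than by eye, here is a minimal sketch that groups held locks by task. The
function name parse_held_locks and both regular expressions are my own
assumptions, derived only from the line shapes shown above; other kernel
versions or syslog prefixes may need adjustments.

```python
import re
from collections import defaultdict

# Matches e.g. "2 locks held by tee/1322:" -> lock count, task name, PID.
HOLDER_RE = re.compile(r"(\d+) locks? held by (.+)/(\d+):\s*$")

# Matches e.g. "#1: ffff... (&name){++++}-{3:3}, at: func+0x1a/0x45"
# -> index, address, lock class name, usage bits, call site.
LOCK_RE = re.compile(
    r"#(\d+):\s+([0-9a-f]+)\s+\((.+?)\)\{([^}]*)\}(?:-\{[^}]*\})?, at: (\S+)"
)


def parse_held_locks(text):
    """Return {(task_name, pid): [(lock_class, function), ...]}.

    Hypothetical helper: it assumes the "Showing all locks held" layout
    seen in the dump above and ignores every line it does not recognise.
    """
    held = defaultdict(list)
    current = None
    for line in text.splitlines():
        holder = HOLDER_RE.search(line)
        if holder:
            current = (holder.group(2), int(holder.group(3)))
            held[current]  # register the task even if no lock lines follow
            continue
        lock = LOCK_RE.search(line)
        if lock and current is not None:
            # Drop the "+offset/size" suffix from the call site.
            function = lock.group(5).split("+", 1)[0]
            held[current].append((lock.group(3), function))
    return dict(held)
```

Feeding it the saved dmesg text and looking for tasks whose call sites start
with nfs_ is enough to confirm that only the tee shows up in NFS code here.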
So whateverthehell this is, it's not blocked on a lock. The NFS client
is hanging all on its own. (I have no idea how clients can block in the
middle of writing if a lock is *not* involved somehow, but that is what
it looks like from the lockdep output.)
Does anyone know how I might start debugging this sod?