From: "'J. Bruce Fields'" <bfields@fieldses.org>
To: "Jäkel, Guido" <G.Jaekel@dnb.de>
Cc: 'Jeff Layton' <jlayton@kernel.org>,
"'linux-nfs@vger.kernel.org'" <linux-nfs@vger.kernel.org>
Subject: Re: NFS3 subsystem hung, Kernel alive
Date: Mon, 24 Sep 2018 17:58:54 -0400
Message-ID: <20180924215854.GA9559@fieldses.org>
In-Reply-To: <d4da0e3ad43b4a4da7a6a6d2c0615939@dnb.de>
On Thu, Sep 20, 2018 at 10:52:17AM +0000, Jäkel, Guido wrote:
> Hi all,
>
> Today, at about "the event time", production kept running, but I discovered that one of the hosts in the Test stage (bladerunner10) had become very "stuttering" in reacting to commands.
>
> From https://utcc.utoronto.ca/~cks/space/blog/linux/NFSMountstatsXprt I got some information about this, and I started to run
>
> watch -n 1 "sed -n '/^device .* on \/ with/,/^$/ p' /proc/self/mountstats"
>
> on the hosts to watch the root mount. On bladerunner10 I noticed a very high value in the 8th field of xprt ('bad XIDs'), which is identical to the difference between filed 6 and 7 (TX-RX). Does that mean that there was a high number of bad answers to requests? Or is this the number of replies that arrived too late?
I don't know what you mean by "filed 6 and 7". Oh, wait, I guess you're
talking about the 6th and 7th fields of the "xprt" line in mountstats.
bad_xids means the client got a response but couldn't find a matching
request. I'm not sure why that would happen--maybe a response came after
the client gave up waiting for it?
--b.
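
To make the "response with no matching request" case concrete, here is a
minimal toy model in Python. It is only a sketch of the hypothesis above (a
reply arriving after the client stopped waiting), not the kernel's sunrpc
code. The class ToyRpcClient, its methods, and the sends/recvs counters are
hypothetical names invented for this illustration; only bad_xids is a name
taken from this thread.

```python
class ToyRpcClient:
    """Toy model of matching RPC replies to requests by XID (not kernel code)."""

    def __init__(self):
        self.next_xid = 1
        self.pending = {}   # xid -> name of the outstanding request
        self.sends = 0      # requests transmitted
        self.recvs = 0      # replies matched to an outstanding request
        self.bad_xids = 0   # replies whose XID matched nothing outstanding

    def send(self, op):
        """Transmit a request and remember it under a fresh XID."""
        xid = self.next_xid
        self.next_xid += 1
        self.pending[xid] = op
        self.sends += 1
        return xid

    def give_up(self, xid):
        """Stop waiting for a reply, for example after a timeout."""
        self.pending.pop(xid, None)

    def receive(self, xid):
        """Handle a reply; return the matched request, or None if none matches."""
        op = self.pending.pop(xid, None)
        if op is None:
            self.bad_xids += 1   # reply arrived, but nobody is waiting for it
            return None
        self.recvs += 1
        return op


client = ToyRpcClient()
first = client.send("GETATTR")
second = client.send("READ")
client.receive(first)    # normal case: the reply matches a pending request
client.give_up(second)   # the client stops waiting for the second request
client.receive(second)   # the late reply is counted as a bad XID
print(client.sends, client.recvs, client.bad_xids)   # prints: 2 1 1
```

In this toy model the late reply leaves sends - recvs equal to bad_xids.
Whether that is what actually happened on bladerunner10 is not established
by the model.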
>
> If I watch TX-RX-BAD, this is near zero on all hosts. But on bladerunner10, it sometimes rises to enormous values (>100000), and at that moment all file I/O is frozen. For example, I don't get a new prompt if I simply hit Enter on a bash command line.
>
>
>
> device 10.69.63.196:/02/q/diskless/roots/bladerunner10 mounted on / with fstype nfs statvers=1.1
> opts: rw,vers=3,rsize=1024,wsize=1024,namlen=255,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.69.63.196,mountvers=3,mountport=0,mountproto=tcp,local_lock=all
> age: 9939702
> caps: caps=0x3fc7,wtmult=512,dtsize=1024,bsize=0,namlen=255
> sec: flavor=1,pseudoflavor=1
> events: 269343924 134739087308 20734 140915 232195524 79262 134886538148 21804722 104 16067 0 293341786 222190 75356 177067969 35796 2826 231908027 0 411 21783902 199 0 0 0 0 0
> bytes: 128654830696 20320953759 0 0 219517679 20415228955 63772 5008821
> RPC iostats version: 1.0 p/v: 100003/3 (nfs)
> xprt: tcp 837 1 1 0 0 21448220350 21448165066 55284 576287654630121 0 34712 845220323041 514256914035
> per-op statistics
> NULL: 0 0 0 0 0 0 0 0
> GETATTR: 269343899 269343899 0 36809071916 30166513552 3034498 71578350 78080492
> SETATTR: 75721 75721 0 15972628 10903824 1855 70284 73720
> LOOKUP: 80296 80296 0 15825484 18814360 7312 135951 144678
> ACCESS: 39274 39274 0 7048052 4712880 4241 26485 31274
> READLINK: 995 995 0 170796 139564 72 479 567
> READ: 223945 223945 0 40327228 248198116 130225 1437810 1583172
> WRITE: 19958985 19958985 0 24406783848 3193437600 167421458404 27086586679 194511012992
> CREATE: 5281 5281 0 1126060 1542052 132 21698 21989
> MKDIR: 127 127 0 29160 36740 10 12307 12321
> SYMLINK: 3 3 0 716 876 0 1 1
> MKNOD: 3 3 0 636 876 0 2 2
> REMOVE: 3400 3400 0 663604 489600 52 12164 12312
> RMDIR: 122 122 0 24624 17520 15 463 483
> RENAME: 2074 2074 0 491352 539240 67 11433 11529
> LINK: 0 0 0 0 0 0 0 0
> READDIR: 31882 31882 0 6376400 32311036 2707 64806 68379
> READDIRPLUS: 273882 273882 0 55807876 140884360 14257 509826 530894
> FSSTAT: 538 538 0 95212 90384 61 445 519
> FSINFO: 2 2 0 272 328 0 0 0
> PATHCONF: 1 1 0 136 140 0 0 0
> COMMIT: 0 0 0 0 0 0 0 0
>
>
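
The TX-RX-BAD figure discussed in this message can be computed from the
"xprt: tcp" line with a short Python sketch like the one below. The field
positions (6th, 7th and 8th value after "tcp") follow the counting used in
this thread. The names sends and recvs, and the helpers parse_tcp_xprt and
unanswered, are assumptions chosen for this example. The sample line is the
one from the quoted mountstats output.

```python
from typing import NamedTuple


class TcpXprt(NamedTuple):
    sends: int      # 6th value after "tcp" (TX in the thread)
    recvs: int      # 7th value after "tcp" (RX in the thread)
    bad_xids: int   # 8th value after "tcp"


def parse_tcp_xprt(line: str) -> TcpXprt:
    """Parse an 'xprt: tcp ...' line as found in /proc/self/mountstats."""
    # Tolerate leading whitespace and email quote markers.
    parts = line.lstrip("> ").split()
    if parts[:2] != ["xprt:", "tcp"]:
        raise ValueError("not an 'xprt: tcp' line")
    values = [int(p) for p in parts[2:]]
    return TcpXprt(sends=values[5], recvs=values[6], bad_xids=values[7])


def unanswered(x: TcpXprt) -> int:
    """TX - RX - BAD: sent requests not accounted for by any reply."""
    return x.sends - x.recvs - x.bad_xids


SAMPLE = ("xprt: tcp 837 1 1 0 0 21448220350 21448165066 55284 "
          "576287654630121 0 34712 845220323041 514256914035")

x = parse_tcp_xprt(SAMPLE)
print(x.sends - x.recvs, x.bad_xids, unanswered(x))   # prints: 55284 55284 0
```

For the quoted sample, TX-RX equals the bad XID count exactly, so TX-RX-BAD
is zero at the moment the snapshot was taken.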