From: Jeff Layton <jlayton@kernel.org>
To: "Florian Möller" <fmoeller@mathematik.uni-wuerzburg.de>,
linux-nfs@vger.kernel.org
Cc: Andreas Seeg <andreas.seeg@mathematik.uni-wuerzburg.de>
Subject: Re: Reoccurring 5 second delays during NFS calls
Date: Tue, 07 Feb 2023 10:21:48 -0500
Message-ID: <8a02c86882bc47c1c1387dba8c7d756237cb3f3f.camel@kernel.org>
In-Reply-To: <59682160-a246-395a-9486-9bbf11686740@mathematik.uni-wuerzburg.de>

On Tue, 2023-02-07 at 11:58 +0100, Florian Möller wrote:
> Hi all,
>
> we are currently in the process of migrating our file server infrastructure to
> NFS. In our test environments, the following problem has now occurred several
> times in certain situations:
>
> A previously very fast NFS file operation suddenly takes 5 seconds longer - per
> file. This leads to applications running very slowly and severely delayed file
> operations.
>
> Here are the details:
>
> NFS server:
> OS: Ubuntu 22.04.1, all patches installed
> Kernel: Ubuntu Mainline, Versions
> 6.1.7-060107-generic_6.1.7-060107.202301181200
> 6.1.8-060108_6.1.8-060108.202301240742
> 6.1.9-060109_6.1.9-060109.202302010835
> Security options: all Kerberos security options are affected
> (The bug does not seem to occur without Kerberos security.)
> Output of exportfs -v:
> /export
> gss/krb5p(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
> /export
> gss/krb5(async,wdelay,hide,crossmnt,no_subtree_check,fsid=0,sec=sys,rw,secure,root_squash,no_all_squash)
>

I see you're using the async export option. Note that you may end up
with corrupt data on a server reboot (see the exports(5) manpage).

Assuming you're aware of this and want to keep it anyway, the patch I
just posted to the mailing list may help you, if the stalls are
occurring during CLOSE operations:

https://lore.kernel.org/linux-nfs/9137413986ba9c2e83c030513fa9ae3358f30a85.camel@kernel.org/T/#mcb88f091263d07d8b9c13e6cc5ce0a0413d3f761

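
One rough way to check on the client whether the time is going into
close(2) is to trace a single slow cat with strace -T and keep only the
calls that took a second or more. This is only a sketch and not part of
the patch: slow_syscalls is a made-up helper name, and a slow close(2)
is just a hint that the NFSv4 CLOSE is where the stall sits. The
rpcdebug logs or a packet capture would still be needed to confirm it.

```shell
# Hypothetical helper: read `strace -T` output on stdin and print only
# the syscalls whose elapsed time (the trailing <seconds> field) is at
# least $1 seconds.
slow_syscalls() {
    awk -v min="$1" '
        match($0, /<[0-9]+\.[0-9]+>$/) {
            # Strip the surrounding angle brackets to get the duration.
            t = substr($0, RSTART + 1, RLENGTH - 2)
            if (t + 0 >= min + 0) print
        }'
}
```

Possible usage on one of the affected files (the file names are the
ones from the reproducer):

    strace -T -o trace.out -e trace=openat,read,close cat 1.txt >/dev/null
    slow_syscalls 1 < trace.out
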
>
> Client 1:
> OS: Arch Linux, all patches installed
> Kernel: 6.1.9-arch1-2, 6.1.9-arch1-1
> Mount-Line: servername:/ on /nfs type nfs4
> (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=XX.XX.XX.XX,local_lock=none,addr=YY.YY.YY.YY,_netdev)
> krb5: 1.20.1-1
> libevent: 2.1.12-4
> nfs-utils: 2.6.2-1
> util-linux: 2.38.1-1
>
>
> Client 2:
> OS: openSuSE 15.4, all patches installed
> Kernel: 5.14.21-150400.24.41-default
> Mount-Line: servername:/ on /nfs type nfs4
> (rw,relatime,sync,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,softerr,softreval,noac,noresvport,proto=tcp,timeo=1,retrans=2,sec=krb5,clientaddr=XX.XX.XX.XX,lookupcache=none,local_lock=none,addr=YY.YY.YY.YY)
>
> libgssapi3: 7.8.0-bp154.2.4.1
> libevent: 2.1.8-2.23
> nfs-client: 2.1.1-150100.10.27.1
> util-linux: 2.37.2-140400.8.14.1
>
>
> The error occurs for example if a file is touched twice:
>
> touch testfile && echo "done" && touch testfile && echo "and again"
>
> However, touching a large number of files (about 10000) with (pairwise)
> different filenames works fast.
>
>
> Here is another example that triggers the error:
>
> 1st step: create many files (shell code in Z-shell syntax):
>
> for i in {1..10000}; do
> echo "test" > $i.txt
> done
>
> This is fast.
>
> 2nd step:
>
> for i in {1..10000}; do
> echo $i
> cat $i.txt
> done
>
> This takes 5 seconds per cat(1) call.
> After unmounting and mounting, the 2nd step also runs quickly at first. But
> after executing the 2nd step several times in a row, the error occurs again
> (quite soon, after the 2nd or 3rd execution).
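
To pin down at which file and in which pass the stalls begin, it might
help to time every call in the 2nd step rather than watching it by eye.
A minimal sketch, assuming a date(1) that supports +%s; the name timed
and the 3 second threshold are arbitrary choices for illustration:

```shell
# Hypothetical helper: run a command with its output discarded and print
# a "SLOW" line if it took at least $1 whole seconds. Resolution is one
# second, which is enough to spot a 5 second stall.
timed() {
    threshold=$1
    shift
    start=$(date +%s)
    "$@" >/dev/null
    status=$?
    elapsed=$(( $(date +%s) - start ))
    if [ "$elapsed" -ge "$threshold" ]; then
        echo "SLOW ${elapsed}s: $*"
    fi
    # Pass the exit status of the timed command back to the caller.
    return "$status"
}
```

The 2nd step would then become something like:

    for i in {1..10000}; do
        timed 3 cat $i.txt
    done
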
>
> We were not able to reproduce the error without a Kerberos security type set.
>
>
> Attached are a log from the server and from the client. In both cases
>
> rpcdebug -m nfs -s all
> rpcdebug -m nfsd -s all
> rpcdebug -m rpc -s all
>
> was set.
>
>
> Best regards,
> Florian Möller
>
>
>
--
Jeff Layton <jlayton@kernel.org>