From: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>
To: Jeff Layton <jlayton@kernel.org>
Cc: Piyush Sachdeva <s.piyush1024@gmail.com>,
linux-nfs <linux-nfs@vger.kernel.org>,
Chuck Lever <cel@kernel.org>, trondmy <trondmy@kernel.org>,
sfrench@samba.org, sprasad@microsoft.com,
vaibsharma@microsoft.com
Subject: Re: NFS delegations behavior analysis
Date: Tue, 23 Jun 2026 13:04:31 +0200 (CEST)
Message-ID: <455619640.1622514.1782212671358.JavaMail.zimbra@desy.de>
In-Reply-To: <0b39c1e01a92f99fe456c76523ec7f3aa5dc1a81.camel@kernel.org>
----- Original Message -----
> From: "Jeff Layton" <jlayton@kernel.org>
> To: "Piyush Sachdeva" <s.piyush1024@gmail.com>, "linux-nfs" <linux-nfs@vger.kernel.org>, "Chuck Lever" <cel@kernel.org>,
> "trondmy" <trondmy@kernel.org>, sfrench@samba.org, sprasad@microsoft.com
> Cc: vaibsharma@microsoft.com
> Sent: Tuesday, 23 June, 2026 12:50:16
> Subject: Re: NFS delegations behavior analysis
> On Tue, 2026-06-23 at 15:31 +0530, Piyush Sachdeva wrote:
>> Hi,
>> Lately I have been running micro benchmarks around the `ls` command and
>> reading through the code documentation of the NFS client to better
>> understand the client side caching behavior with and without
>> delegations.
>>
>> Understanding so far:
>> Delegations (both file and directory) are granted by the server to the
>> client, indefinitely (until revoked or under the watermark) to cache
>> attributes. The caching of data is a result of the attribute
>> cache. Hence, a directory delegation will cache the directory
>> attributes and the names of the files in the directory, and a file
>> delegation will cache the attributes of the file and the file data.
>>
>> Workload run:
>> I focused on the 2 workloads below, doing 2 passes of a large flat
>> directory (with close to 100K files) -
>> a cold pass, and warm pass using the cache from the cold pass:
>> - lslr - ls -lR on both runs
>> - lsmix - ls -R (cold) and then ls -lR (warm)
>>
>> I also played with the rdirplus behavior using both the default
>> heuristic behavior and the `rdirplus=force` set at mount time.
>>
>> Numbers:
>> actimeo=5s, rdirplus=force, ACLs off, flat_dir
>> ==================================================================
>>
>>                  |          LSLR           |         LSMIX
>>                  |  (ls -lR cold / warm)   | (p1 ls -R / p2 ls -lR)
>> Operation        | flat cold   | flat warm | flat p1     | flat p2
>> -----------------+-------------+-----------+-------------+---------
>> READDIR calls    | 27          | 0         | 27          | 0
>> READDIR recv B   | 23,603,024  | 0         | 23,603,024  | 0
>> call type        | readdirplus | --        | readdirplus | --
>> LOOKUP           | 1           | 0         | 1           | 0
>> GETATTR          | 3           | 100,000   | 2           | 100,001
>> ACCESS           | 2           | 0         | 2           | 0
>> -----------------+-------------+-----------+-------------+---------
>> Elapsed (age)    | ~14 s       | ~62 s     | ~16 s       | ~63 s
>>
>>
>> Observations:
>> When doing `ls` or `ls -l` on a directory, due to the open(2) on the
>> directory, the client gets a directory delegation - caching the
>> directory attributes and file names. However, we don't have file
>> delegations, because there are no open(2) calls on any of the files.
>> Hence, the cache of file attributes is governed by `actimeo`.
>> Now here is the interesting bit, if the next `ls -l` is issued after
>> the `actimeo`, a massive GETATTR storm hits the server, doing stat()
>> calls for every file in the directory. As a result, the performance of
>> this warm `ls -l` run ends up being worse than the cold pass. I am
>> guessing this is most likely due to the compounded "rdirplus" being more
>> efficient than stat() calls.
>>
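For a rough sense of scale, the figures in the table can be turned into
per-entry costs. This is only back-of-the-envelope arithmetic: it assumes
exactly 100,000 entries (the report says "close to 100K"), and the elapsed
times cover the whole `ls` run, not just RPC latency, so the per-entry
numbers are upper bounds on the wire cost.

```python
# Figures copied from the LSLR columns of the table above.
READDIRPLUS_CALLS = 27
READDIRPLUS_BYTES = 23_603_024
ENTRIES = 100_000            # assumption: "close to 100K" taken as exactly 100K
COLD_ELAPSED_S = 14.0        # approximate, as reported
WARM_GETATTRS = 100_000
WARM_ELAPSED_S = 62.0        # approximate, as reported

# How much work each READDIRPLUS reply carries on the cold pass.
entries_per_call = ENTRIES / READDIRPLUS_CALLS
bytes_per_entry = READDIRPLUS_BYTES / ENTRIES

# Wall-clock time per directory entry on each pass, in microseconds.
cold_us_per_entry = COLD_ELAPSED_S / ENTRIES * 1e6
warm_us_per_entry = WARM_ELAPSED_S / WARM_GETATTRS * 1e6

# How much slower the warm pass is than the cold one.
slowdown = WARM_ELAPSED_S / COLD_ELAPSED_S

print(f"entries per READDIRPLUS call: {entries_per_call:.0f}")
print(f"bytes per entry:              {bytes_per_entry:.0f}")
print(f"cold pass, us per entry:      {cold_us_per_entry:.0f}")
print(f"warm pass, us per GETATTR:    {warm_us_per_entry:.0f}")
print(f"warm/cold slowdown:           {slowdown:.1f}x")
```

That works out to roughly 3,700 entries and about 236 bytes per entry in
each READDIRPLUS reply, against one round trip per file (about 0.62 ms
each) on the warm pass. That is consistent with the guess that the batched
call is simply cheaper than 100,000 individual GETATTRs.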
>>
>> Proposal:
>> For large directories, this ends up being a massive problem, taking 1-2
>> minutes when enumerating a directory on the warm passes.
>> - An easier way to tackle this could be to do a rdirplus=[auto | forced]
>> instead of issuing the stat(2) storm to the server: When the client
>> notices that there are cache misses, which would be the case of file
>> attributes, instead of fetching file names from the directory-delegation
>> cache and attributes from GETATTR, the client does a READDIRPLUS to
>> the server, nonetheless.
>> - A more tedious option would be to cache file attributes as well, as
>> part of the directory delegation. This would end up requiring a change
>> in the NFS protocol spec though.
>> - Bulk GETATTR calls: I am uncertain of the feasibility of this, but
>> what if the client could do 1 GETATTR call to get attributes
>> for multiple files?
>
>
> ls is such a hard workload to get right, because we don't really get an
> indication in the kernel of what userland's intentions are. It's
> basically a readdir() call followed by a bunch of stat()'s, but at the
> point where we're getting the readdir() call, we don't know if userland
> intends to stat() those files or not. We have to make a guess about
> that intention.

100% agree. And there were a couple of attempts to address this issue
(second ls that is slow).

>
> In this case, it sounds like the directory cache was valid, so the
> client decided it didn't need to do a READDIR at all, but the
> individual files had caches that timed out.
>
> So imagine you're the kernel client and have been given that second
> readdir() call: Why should you decide to do a READDIRPLUS at that point
> instead of a regular READDIR?

Maybe we need some kind of client-side heuristic, like the one on the
server side for open delegations: after seeing some stat() calls for
files in the same directory, the client would decide to switch to
READDIR (v4) to get all attributes in one go.
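
To make the idea concrete, here is a rough Python model of such a
heuristic. It is only a sketch and not how the Linux client works today:
`DirStatTracker`, `simulate_ls_l`, the threshold of 16 and the 3,704
entries per READDIR reply (100,000 entries / 27 calls from the table) are
all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DirStatTracker:
    """Hypothetical per-directory state; not an existing kernel structure."""
    threshold: int = 16          # assumed tunable: misses before switching
    misses: int = 0
    want_refresh: bool = False   # True once a bulk refresh looks worthwhile

    def note_attr_miss(self) -> None:
        """Record a stat() on a child whose cached attributes had expired."""
        self.misses += 1
        if self.misses >= self.threshold:
            self.want_refresh = True


def simulate_ls_l(num_files: int, threshold: int, entries_per_readdir: int) -> dict:
    """Model a warm `ls -l`: names are cached, all file attributes expired.

    Returns the number of GETATTR and attribute-bearing READDIR calls that
    the modelled client would send.
    """
    tracker = DirStatTracker(threshold=threshold)
    fresh = set()                          # indices whose attributes are valid
    counts = {"GETATTR": 0, "READDIR": 0}
    for i in range(num_files):
        if i in fresh:
            continue                       # served from the attribute cache
        if tracker.want_refresh:
            # One READDIR with attributes revalidates the next chunk of entries.
            fresh.update(range(i, min(i + entries_per_readdir, num_files)))
            counts["READDIR"] += 1
        else:
            # Below the threshold: fall back to a per-file GETATTR.
            counts["GETATTR"] += 1
            fresh.add(i)
            tracker.note_attr_miss()
    return counts


if __name__ == "__main__":
    print(simulate_ls_l(num_files=100_000, threshold=16, entries_per_readdir=3704))
```

With those assumed numbers the model sends 16 GETATTRs and then 27
READDIRs, instead of 100,000 GETATTRs. What the threshold should be, and
how the client would resume the READDIR at the right cookie, are exactly
the parts this sketch leaves open.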
Best regards,
Tigran.
> --
> Jeff Layton <jlayton@kernel.org>