* [PATCH v2] NFSv4: Always ask for type with READDIR
@ 2023-08-30 19:42 Benjamin Coddington
2023-08-30 20:10 ` Jeff Layton
2023-09-07 12:43 ` Benjamin Coddington
0 siblings, 2 replies; 13+ messages in thread
From: Benjamin Coddington @ 2023-08-30 19:42 UTC (permalink / raw)
To: trond.myklebust, anna; +Cc: linux-nfs, jlayton
Again we have claimed regressions for walking a directory tree, this time
with the "find" utility which always tries to optimize away asking for any
attributes until it has a complete list of entries. This behavior makes
the readdir plus heuristic do the wrong thing, which causes a storm of
GETATTRs to determine each entry's type in order to continue the walk.
For v4 add the type attribute to each READDIR request to include it no
matter the heuristic. This allows a simple `find` command to proceed
quickly through a directory tree.
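For illustration only, a userspace sketch of why the entry type matters to a tree walker. When the directory read already supplies d_type, the walker can decide whether to descend without touching the entry; when it sees DT_UNKNOWN it has to fall back to a per-entry lstat(), which on NFS can turn into a GETATTR. entry_is_dir() and its stat_calls counter are hypothetical names, not code from find(1) or the kernel.

```c
#define _DEFAULT_SOURCE		/* for DT_* and lstat() on glibc */
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper, not taken from find(1) or the kernel.
 * Returns 1 if the entry is a directory, 0 if not, -1 on error.
 * *stat_calls counts how often the per-entry fallback was needed.
 */
int entry_is_dir(const char *parent, const struct dirent *de,
		 unsigned long *stat_calls)
{
	char path[4096];
	struct stat st;
	int len;

	/* Fast path: the directory read already told us the type. */
	if (de->d_type != DT_UNKNOWN)
		return de->d_type == DT_DIR;

	/* Slow path: one attribute lookup per entry (potentially a GETATTR on NFS). */
	len = snprintf(path, sizeof(path), "%s/%s", parent, de->d_name);
	if (len < 0 || (size_t)len >= sizeof(path))
		return -1;
	(*stat_calls)++;
	if (lstat(path, &st) != 0)
		return -1;
	return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

With the type always present in the READDIR reply, a walker written this way should stay on the fast path for every entry.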
Suggested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
--
On v2: Don't add the type attribute twice
---
fs/nfs/nfs4xdr.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index deec76cf5afe..7200d6f7cd7b 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -1602,7 +1602,7 @@ static void encode_read(struct xdr_stream *xdr, const struct nfs_pgio_args *args
 static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg *readdir, struct rpc_rqst *req, struct compound_hdr *hdr)
 {
 	uint32_t attrs[3] = {
-		FATTR4_WORD0_RDATTR_ERROR,
+		FATTR4_WORD0_TYPE|FATTR4_WORD0_RDATTR_ERROR,
 		FATTR4_WORD1_MOUNTED_ON_FILEID,
 	};
 	uint32_t dircount = readdir->count;
@@ -1612,7 +1612,7 @@ static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg
 	unsigned int i;
 
 	if (readdir->plus) {
-		attrs[0] |= FATTR4_WORD0_TYPE|FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE|
+		attrs[0] |= FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE|
 			FATTR4_WORD0_FSID|FATTR4_WORD0_FILEHANDLE|FATTR4_WORD0_FILEID;
 		attrs[1] |= FATTR4_WORD1_MODE|FATTR4_WORD1_NUMLINKS|FATTR4_WORD1_OWNER|
 			FATTR4_WORD1_OWNER_GROUP|FATTR4_WORD1_RAWDEV|
--
2.40.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2] NFSv4: Always ask for type with READDIR
2023-08-30 19:42 [PATCH v2] NFSv4: Always ask for type with READDIR Benjamin Coddington
@ 2023-08-30 20:10 ` Jeff Layton
2023-08-30 20:20 ` Trond Myklebust
2023-09-07 12:43 ` Benjamin Coddington
1 sibling, 1 reply; 13+ messages in thread
From: Jeff Layton @ 2023-08-30 20:10 UTC (permalink / raw)
To: Benjamin Coddington, trond.myklebust, anna; +Cc: linux-nfs
On Wed, 2023-08-30 at 15:42 -0400, Benjamin Coddington wrote:
> Again we have claimed regressions for walking a directory tree, this time
> with the "find" utility which always tries to optimize away asking for any
> attributes until it has a complete list of entries. This behavior makes
> the readdir plus heuristic do the wrong thing, which causes a storm of
> GETATTRs to determine each entry's type in order to continue the walk.
>
> For v4 add the type attribute to each READDIR request to include it no
> matter the heuristic. This allows a simple `find` command to proceed
> quickly through a directory tree.
>
The important bit here is that with v4, we can fill out d_type even when
"plus" is false, at little cost. The downside is that non-plus READDIR
replies will now be a bit larger on the wire. I think it's a worthwhile
tradeoff though.
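To make the effect concrete, here is a standalone sketch of how the requested attribute bitmap looks after this change. It is not the kernel code: build_readdir_bitmap() is a made-up name, the FATTR4_* values are derived from the NFSv4 attribute numbers in RFC 7530 rather than copied from the kernel headers, and the extra word 1 bits for the plus case are left out because the quoted hunk is truncated there.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions follow the NFSv4 attribute numbers (assumed from RFC 7530). */
#define FATTR4_WORD0_TYPE		(1u << 1)
#define FATTR4_WORD0_CHANGE		(1u << 3)
#define FATTR4_WORD0_SIZE		(1u << 4)
#define FATTR4_WORD0_FSID		(1u << 8)
#define FATTR4_WORD0_RDATTR_ERROR	(1u << 11)
#define FATTR4_WORD0_FILEHANDLE		(1u << 19)
#define FATTR4_WORD0_FILEID		(1u << 20)
#define FATTR4_WORD1_MOUNTED_ON_FILEID	(1u << (55 - 32))

/* Hypothetical helper mirroring the bitmap selection in the patched encoder. */
void build_readdir_bitmap(bool plus, uint32_t attrs[2])
{
	/* Requested for every READDIR, plus or not: type is now in this set. */
	attrs[0] = FATTR4_WORD0_TYPE | FATTR4_WORD0_RDATTR_ERROR;
	attrs[1] = FATTR4_WORD1_MOUNTED_ON_FILEID;

	if (plus) {
		/* Word 0 additions as shown in the hunk; word 1 additions omitted. */
		attrs[0] |= FATTR4_WORD0_CHANGE | FATTR4_WORD0_SIZE |
			    FATTR4_WORD0_FSID | FATTR4_WORD0_FILEHANDLE |
			    FATTR4_WORD0_FILEID;
	}
}
```

The non-plus request grows by a single bit in word 0, so the extra wire cost is whatever the server encodes for the type of each entry.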
Reviewed-by: Jeff Layton <jlayton@kernel.org>
> Suggested-by: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
>
> --
> On v2: Don't add the type attribute twice
> ---
> fs/nfs/nfs4xdr.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index deec76cf5afe..7200d6f7cd7b 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -1602,7 +1602,7 @@ static void encode_read(struct xdr_stream *xdr, const struct nfs_pgio_args *args
>  static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg *readdir, struct rpc_rqst *req, struct compound_hdr *hdr)
>  {
>  	uint32_t attrs[3] = {
> -		FATTR4_WORD0_RDATTR_ERROR,
> +		FATTR4_WORD0_TYPE|FATTR4_WORD0_RDATTR_ERROR,
>  		FATTR4_WORD1_MOUNTED_ON_FILEID,
>  	};
>  	uint32_t dircount = readdir->count;
> @@ -1612,7 +1612,7 @@ static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg
>  	unsigned int i;
>
>  	if (readdir->plus) {
> -		attrs[0] |= FATTR4_WORD0_TYPE|FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE|
> +		attrs[0] |= FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE|
>  			FATTR4_WORD0_FSID|FATTR4_WORD0_FILEHANDLE|FATTR4_WORD0_FILEID;
>  		attrs[1] |= FATTR4_WORD1_MODE|FATTR4_WORD1_NUMLINKS|FATTR4_WORD1_OWNER|
>  			FATTR4_WORD1_OWNER_GROUP|FATTR4_WORD1_RAWDEV|
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] NFSv4: Always ask for type with READDIR
2023-08-30 20:10 ` Jeff Layton
@ 2023-08-30 20:20 ` Trond Myklebust
2023-08-30 21:14 ` Jeff Layton
0 siblings, 1 reply; 13+ messages in thread
From: Trond Myklebust @ 2023-08-30 20:20 UTC (permalink / raw)
To: anna@kernel.org, jlayton@kernel.org, bcodding@redhat.com
Cc: linux-nfs@vger.kernel.org
On Wed, 2023-08-30 at 16:10 -0400, Jeff Layton wrote:
> On Wed, 2023-08-30 at 15:42 -0400, Benjamin Coddington wrote:
> > Again we have claimed regressions for walking a directory tree,
> > this time
> > with the "find" utility which always tries to optimize away asking
> > for any
> > attributes until it has a complete list of entries. This behavior
> > makes
> > the readdir plus heuristic do the wrong thing, which causes a storm
> > of
> > GETATTRs to determine each entry's type in order to continue the
> > walk.
> >
> > For v4 add the type attribute to each READDIR request to include it
> > no
> > matter the heuristic. This allows a simple `find` command to
> > proceed
> > quickly through a directory tree.
> >
>
> The important bit here is that with v4, we can fill out d_type even
> when
> "plus" is false, at little cost. The downside is that non-plus
> READDIR
> replies will now be a bit larger on the wire. I think it's a
> worthwhile
> tradeoff though.
The reason why we never did it before is that for many servers, it
forces them to go to the inode in order to retrieve the information.
IOW: You might as well just do readdirplus.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] NFSv4: Always ask for type with READDIR
2023-08-30 20:20 ` Trond Myklebust
@ 2023-08-30 21:14 ` Jeff Layton
2023-08-31 15:17 ` Benjamin Coddington
2023-08-31 18:41 ` Cedric Blancher
0 siblings, 2 replies; 13+ messages in thread
From: Jeff Layton @ 2023-08-30 21:14 UTC (permalink / raw)
To: Trond Myklebust, anna@kernel.org, bcodding@redhat.com
Cc: linux-nfs@vger.kernel.org
On Wed, 2023-08-30 at 20:20 +0000, Trond Myklebust wrote:
> On Wed, 2023-08-30 at 16:10 -0400, Jeff Layton wrote:
> > On Wed, 2023-08-30 at 15:42 -0400, Benjamin Coddington wrote:
> > > Again we have claimed regressions for walking a directory tree,
> > > this time
> > > with the "find" utility which always tries to optimize away asking
> > > for any
> > > attributes until it has a complete list of entries. This behavior
> > > makes
> > > the readdir plus heuristic do the wrong thing, which causes a storm
> > > of
> > > GETATTRs to determine each entry's type in order to continue the
> > > walk.
> > >
> > > For v4 add the type attribute to each READDIR request to include it
> > > no
> > > matter the heuristic. This allows a simple `find` command to
> > > proceed
> > > quickly through a directory tree.
> > >
> >
> > The important bit here is that with v4, we can fill out d_type even
> > when
> > "plus" is false, at little cost. The downside is that non-plus
> > READDIR
> > replies will now be a bit larger on the wire. I think it's a
> > worthwhile
> > tradeoff though.
>
> The reason why we never did it before is that for many servers, it
> forces them to go to the inode in order to retrieve the information.
>
> IOW: You might as well just do readdirplus.
>
That makes total sense, given how this code has evolved.
FWIW, the Linux NFS server already calls vfs_getattr for every dentry in
a v4 READDIR reply regardless of what the client requests. It has to in
order to detect junctions, so we're bringing in the inode no matter
what. Fetching the type is trivial, so I don't see this as costing
anything extra there.
Mileage could vary on other servers with more synthetic filesystems, but
one would hope that most of them can also return the type cheaply.
--
Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] NFSv4: Always ask for type with READDIR
2023-08-30 21:14 ` Jeff Layton
@ 2023-08-31 15:17 ` Benjamin Coddington
2023-08-31 15:24 ` Jeff Layton
2023-08-31 18:41 ` Cedric Blancher
1 sibling, 1 reply; 13+ messages in thread
From: Benjamin Coddington @ 2023-08-31 15:17 UTC (permalink / raw)
To: Jeff Layton; +Cc: Trond Myklebust, anna, linux-nfs
On 30 Aug 2023, at 17:14, Jeff Layton wrote:
> On Wed, 2023-08-30 at 20:20 +0000, Trond Myklebust wrote:
>> On Wed, 2023-08-30 at 16:10 -0400, Jeff Layton wrote:
>>> On Wed, 2023-08-30 at 15:42 -0400, Benjamin Coddington wrote:
>>>> Again we have claimed regressions for walking a directory tree,
>>>> this time
>>>> with the "find" utility which always tries to optimize away asking
>>>> for any
>>>> attributes until it has a complete list of entries. This behavior
>>>> makes
>>>> the readdir plus heuristic do the wrong thing, which causes a storm
>>>> of
>>>> GETATTRs to determine each entry's type in order to continue the
>>>> walk.
>>>>
>>>> For v4 add the type attribute to each READDIR request to include it
>>>> no
>>>> matter the heuristic. This allows a simple `find` command to
>>>> proceed
>>>> quickly through a directory tree.
>>>>
>>>
>>> The important bit here is that with v4, we can fill out d_type even
>>> when
>>> "plus" is false, at little cost. The downside is that non-plus
>>> READDIR
>>> replies will now be a bit larger on the wire. I think it's a
>>> worthwhile
>>> tradeoff though.
>>
>> The reason why we never did it before is that for many servers, it
>> forces them to go to the inode in order to retrieve the information.
>>
>> IOW: You might as well just do readdirplus.
>>
>
> That makes total sense, given how this code has evolved.
>
> FWIW, the Linux NFS server already calls vfs_getattr for every dentry in
> a v4 READDIR reply regardless of what the client requests. It has to in
> order to detect junctions, so we're bringing in the inode no matter
> what. Fetching the type is trivial, so I don't see this as costing
> anything extra there.
>
> Mileage could vary on other servers with more synthetic filesystems, but
> one would hope that most of them can also return the type cheaply.
It occurred to me that we could let those other server folks ask for
whatever attributes they wanted if we make it tunable at runtime:
https://lore.kernel.org/linux-nfs/8f752f70daf73016e20c49508f825e8c2c94f5e7.1693494824.git.bcodding@redhat.com/T/#u
Ben
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] NFSv4: Always ask for type with READDIR
2023-08-31 15:17 ` Benjamin Coddington
@ 2023-08-31 15:24 ` Jeff Layton
0 siblings, 0 replies; 13+ messages in thread
From: Jeff Layton @ 2023-08-31 15:24 UTC (permalink / raw)
To: Benjamin Coddington; +Cc: Trond Myklebust, anna, linux-nfs
On Thu, 2023-08-31 at 11:17 -0400, Benjamin Coddington wrote:
> On 30 Aug 2023, at 17:14, Jeff Layton wrote:
>
> > On Wed, 2023-08-30 at 20:20 +0000, Trond Myklebust wrote:
> > > On Wed, 2023-08-30 at 16:10 -0400, Jeff Layton wrote:
> > > > On Wed, 2023-08-30 at 15:42 -0400, Benjamin Coddington wrote:
> > > > > Again we have claimed regressions for walking a directory tree,
> > > > > this time
> > > > > with the "find" utility which always tries to optimize away asking
> > > > > for any
> > > > > attributes until it has a complete list of entries. This behavior
> > > > > makes
> > > > > the readdir plus heuristic do the wrong thing, which causes a storm
> > > > > of
> > > > > GETATTRs to determine each entry's type in order to continue the
> > > > > walk.
> > > > >
> > > > > For v4 add the type attribute to each READDIR request to include it
> > > > > no
> > > > > matter the heuristic. This allows a simple `find` command to
> > > > > proceed
> > > > > quickly through a directory tree.
> > > > >
> > > >
> > > > The important bit here is that with v4, we can fill out d_type even
> > > > when
> > > > "plus" is false, at little cost. The downside is that non-plus
> > > > READDIR
> > > > replies will now be a bit larger on the wire. I think it's a
> > > > worthwhile
> > > > tradeoff though.
> > >
> > > The reason why we never did it before is that for many servers, it
> > > forces them to go to the inode in order to retrieve the information.
> > >
> > > IOW: You might as well just do readdirplus.
> > >
> >
> > That makes total sense, given how this code has evolved.
> >
> > FWIW, the Linux NFS server already calls vfs_getattr for every dentry in
> > a v4 READDIR reply regardless of what the client requests. It has to in
> > order to detect junctions, so we're bringing in the inode no matter
> > what. Fetching the type is trivial, so I don't see this as costing
> > anything extra there.
> >
> > Mileage could vary on other servers with more synthetic filesystems, but
> > one would hope that most of them can also return the type cheaply.
>
> It occurred to me that we could let those other server folks ask for
> whatever attributes they wanted if we make it tunable at runtime:
>
> https://lore.kernel.org/linux-nfs/8f752f70daf73016e20c49508f825e8c2c94f5e7.1693494824.git.bcodding@redhat.com/T/#u
>
That's a possibility, but I probably wouldn't add tunables for this
until the need was more clear.
--
Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] NFSv4: Always ask for type with READDIR
2023-08-30 21:14 ` Jeff Layton
2023-08-31 15:17 ` Benjamin Coddington
@ 2023-08-31 18:41 ` Cedric Blancher
2023-08-31 18:53 ` Jeff Layton
1 sibling, 1 reply; 13+ messages in thread
From: Cedric Blancher @ 2023-08-31 18:41 UTC (permalink / raw)
To: linux-nfs@vger.kernel.org
On Thu, 31 Aug 2023 at 02:17, Jeff Layton <jlayton@kernel.org> wrote:
>
> On Wed, 2023-08-30 at 20:20 +0000, Trond Myklebust wrote:
> > On Wed, 2023-08-30 at 16:10 -0400, Jeff Layton wrote:
> > > On Wed, 2023-08-30 at 15:42 -0400, Benjamin Coddington wrote:
> > > > Again we have claimed regressions for walking a directory tree,
> > > > this time
> > > > with the "find" utility which always tries to optimize away asking
> > > > for any
> > > > attributes until it has a complete list of entries. This behavior
> > > > makes
> > > > the readdir plus heuristic do the wrong thing, which causes a storm
> > > > of
> > > > GETATTRs to determine each entry's type in order to continue the
> > > > walk.
> > > >
> > > > For v4 add the type attribute to each READDIR request to include it
> > > > no
> > > > matter the heuristic. This allows a simple `find` command to
> > > > proceed
> > > > quickly through a directory tree.
> > > >
> > >
> > > The important bit here is that with v4, we can fill out d_type even
> > > when
> > > "plus" is false, at little cost. The downside is that non-plus
> > > READDIR
> > > replies will now be a bit larger on the wire. I think it's a
> > > worthwhile
> > > tradeoff though.
> >
> > The reason why we never did it before is that for many servers, it
> > forces them to go to the inode in order to retrieve the information.
> >
> > IOW: You might as well just do readdirplus.
> >
>
> That makes total sense, given how this code has evolved.
>
> FWIW, the Linux NFS server already calls vfs_getattr for every dentry in
> a v4 READDIR reply regardless of what the client requests. It has to in
> order to detect junctions, so we're bringing in the inode no matter
> what. Fetching the type is trivial, so I don't see this as costing
> anything extra there.
>
> Mileage could vary on other servers with more synthetic filesystems, but
> one would hope that most of them can also return the type cheaply.
Do you have examples for such synthetic filesystems?
Ced
--
Cedric Blancher <cedric.blancher@gmail.com>
[https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] NFSv4: Always ask for type with READDIR
2023-08-31 18:41 ` Cedric Blancher
@ 2023-08-31 18:53 ` Jeff Layton
2023-08-31 20:08 ` Rick Macklem
0 siblings, 1 reply; 13+ messages in thread
From: Jeff Layton @ 2023-08-31 18:53 UTC (permalink / raw)
To: Cedric Blancher, linux-nfs@vger.kernel.org
On Thu, 2023-08-31 at 20:41 +0200, Cedric Blancher wrote:
> On Thu, 31 Aug 2023 at 02:17, Jeff Layton <jlayton@kernel.org> wrote:
> >
> > On Wed, 2023-08-30 at 20:20 +0000, Trond Myklebust wrote:
> > > On Wed, 2023-08-30 at 16:10 -0400, Jeff Layton wrote:
> > > > On Wed, 2023-08-30 at 15:42 -0400, Benjamin Coddington wrote:
> > > > > Again we have claimed regressions for walking a directory tree,
> > > > > this time
> > > > > with the "find" utility which always tries to optimize away asking
> > > > > for any
> > > > > attributes until it has a complete list of entries. This behavior
> > > > > makes
> > > > > the readdir plus heuristic do the wrong thing, which causes a storm
> > > > > of
> > > > > GETATTRs to determine each entry's type in order to continue the
> > > > > walk.
> > > > >
> > > > > For v4 add the type attribute to each READDIR request to include it
> > > > > no
> > > > > matter the heuristic. This allows a simple `find` command to
> > > > > proceed
> > > > > quickly through a directory tree.
> > > > >
> > > >
> > > > The important bit here is that with v4, we can fill out d_type even
> > > > when
> > > > "plus" is false, at little cost. The downside is that non-plus
> > > > READDIR
> > > > replies will now be a bit larger on the wire. I think it's a
> > > > worthwhile
> > > > tradeoff though.
> > >
> > > The reason why we never did it before is that for many servers, it
> > > forces them to go to the inode in order to retrieve the information.
> > >
> > > IOW: You might as well just do readdirplus.
> > >
> >
> > That makes total sense, given how this code has evolved.
> >
> > FWIW, the Linux NFS server already calls vfs_getattr for every dentry in
> > a v4 READDIR reply regardless of what the client requests. It has to in
> > order to detect junctions, so we're bringing in the inode no matter
> > what. Fetching the type is trivial, so I don't see this as costing
> > anything extra there.
> >
> > Mileage could vary on other servers with more synthetic filesystems, but
> > one would hope that most of them can also return the type cheaply.
>
> Do you have examples for such synthetic filesystems?
>
Synthetic is probably the wrong distinction here, actually.
If looking up the inode type info is expensive, then you'll feel it here
more with this change. That's true regardless of whether this is a
"normal" or "synthetic" fs.
I wouldn't expect a big performance hit from the Linux NFS server given
that we'll almost certainly have that info in-core, but other servers
(ganesha? some commercial servers?) could take a hit here.
--
Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] NFSv4: Always ask for type with READDIR
2023-08-31 18:53 ` Jeff Layton
@ 2023-08-31 20:08 ` Rick Macklem
2023-08-31 21:33 ` Jeff Layton
0 siblings, 1 reply; 13+ messages in thread
From: Rick Macklem @ 2023-08-31 20:08 UTC (permalink / raw)
To: Jeff Layton; +Cc: Cedric Blancher, linux-nfs@vger.kernel.org
On Thu, Aug 31, 2023 at 11:53 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Thu, 2023-08-31 at 20:41 +0200, Cedric Blancher wrote:
> > On Thu, 31 Aug 2023 at 02:17, Jeff Layton <jlayton@kernel.org> wrote:
> > >
> > > On Wed, 2023-08-30 at 20:20 +0000, Trond Myklebust wrote:
> > > > On Wed, 2023-08-30 at 16:10 -0400, Jeff Layton wrote:
> > > > > On Wed, 2023-08-30 at 15:42 -0400, Benjamin Coddington wrote:
> > > > > > Again we have claimed regressions for walking a directory tree,
> > > > > > this time
> > > > > > with the "find" utility which always tries to optimize away asking
> > > > > > for any
> > > > > > attributes until it has a complete list of entries. This behavior
> > > > > > makes
> > > > > > the readdir plus heuristic do the wrong thing, which causes a storm
> > > > > > of
> > > > > > GETATTRs to determine each entry's type in order to continue the
> > > > > > walk.
> > > > > >
> > > > > > For v4 add the type attribute to each READDIR request to include it
> > > > > > no
> > > > > > matter the heuristic. This allows a simple `find` command to
> > > > > > proceed
> > > > > > quickly through a directory tree.
> > > > > >
> > > > >
> > > > > The important bit here is that with v4, we can fill out d_type even
> > > > > when
> > > > > "plus" is false, at little cost. The downside is that non-plus
> > > > > READDIR
> > > > > replies will now be a bit larger on the wire. I think it's a
> > > > > worthwhile
> > > > > tradeoff though.
> > > >
> > > > The reason why we never did it before is that for many servers, it
> > > > forces them to go to the inode in order to retrieve the information.
> > > >
> > > > IOW: You might as well just do readdirplus.
> > > >
> > >
> > > That makes total sense, given how this code has evolved.
> > >
> > > FWIW, the Linux NFS server already calls vfs_getattr for every dentry in
> > > a v4 READDIR reply regardless of what the client requests. It has to in
> > > order to detect junctions, so we're bringing in the inode no matter
> > > what. Fetching the type is trivial, so I don't see this as costing
> > > anything extra there.
> > >
> > > Mileage could vary on other servers with more synthetic filesystems, but
> > > one would hope that most of them can also return the type cheaply.
> >
> > Do you have examples for such synthetic filesystems?
> >
>
> Synthetic is probably the wrong distinction here, actually.
>
> If looking up the inode type info is expensive, then you'll feel it here
> more with this change. That's true regardless of whether this is a
> "normal" or "synthetic" fs.
In case you are interested in an outsider's perspective...
I recently patched the FreeBSD server so that it did not need to
acquire a vnode to generate a Readdir reply if only the following
attributes are requested and the entry is not a directory.
(FreeBSD has a d_type field in its "struct dirent".)
RDAttr_error, Mounted_on_FileID, FileID, Type
--> Adding a requirement for Type to nordirplus would not
have any negative effect on the FreeBSD server.
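A rough sketch of the kind of check described here, assuming the server can answer the four listed attributes from the directory entry alone and only needs the vnode for directories or for any other attribute. This is not the FreeBSD code: readdir_needs_vnode() and the EX_* constants are hypothetical names, with bit positions assumed from the NFSv4 attribute numbers in RFC 7530.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants; bit positions assumed from RFC 7530. */
#define EX_WORD0_TYPE			(1u << 1)
#define EX_WORD0_RDATTR_ERROR		(1u << 11)
#define EX_WORD0_FILEID			(1u << 20)
#define EX_WORD1_MOUNTED_ON_FILEID	(1u << (55 - 32))

/*
 * Return true if the server must acquire the vnode for this entry,
 * false if the reply can be built from the directory entry alone.
 */
bool readdir_needs_vnode(const uint32_t requested[2], bool is_dir)
{
	const uint32_t cheap0 = EX_WORD0_TYPE | EX_WORD0_RDATTR_ERROR |
				EX_WORD0_FILEID;
	const uint32_t cheap1 = EX_WORD1_MOUNTED_ON_FILEID;

	/* Directories still need the vnode to check for mount points. */
	if (is_dir)
		return true;

	/* Any attribute outside the cheap set forces a vnode lookup. */
	return (requested[0] & ~cheap0) != 0 || (requested[1] & ~cheap1) != 0;
}
```

Under that assumption, a non-plus READDIR that also asks for Type stays inside the cheap set for every non-directory entry.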
This patch resulted in about a 5% improvement on Readdir RPC
response time for Readdirs only asking for the above attributes,
for some simple measurements I did using the FreeBSD client.
I still need to acquire the vnode for directories, to check for
server file system mount points. I do not know if what you
refer to as "junctions" are directory specific?
rick
>
> I wouldn't expect a big performance hit from the Linux NFS server given
> that we'll almost certainly have that info in-core, but other servers
> (ganesha? some commercial servers?) could take a hit here.
> --
> Jeff Layton <jlayton@kernel.org>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] NFSv4: Always ask for type with READDIR
2023-08-31 20:08 ` Rick Macklem
@ 2023-08-31 21:33 ` Jeff Layton
2023-09-01 16:03 ` Chuck Lever III
0 siblings, 1 reply; 13+ messages in thread
From: Jeff Layton @ 2023-08-31 21:33 UTC (permalink / raw)
To: Rick Macklem, Chuck Lever; +Cc: Cedric Blancher, linux-nfs@vger.kernel.org
On Thu, 2023-08-31 at 13:08 -0700, Rick Macklem wrote:
> On Thu, Aug 31, 2023 at 11:53 AM Jeff Layton <jlayton@kernel.org> wrote:
> >
> > On Thu, 2023-08-31 at 20:41 +0200, Cedric Blancher wrote:
> > > On Thu, 31 Aug 2023 at 02:17, Jeff Layton <jlayton@kernel.org> wrote:
> > > >
> > > > On Wed, 2023-08-30 at 20:20 +0000, Trond Myklebust wrote:
> > > > > On Wed, 2023-08-30 at 16:10 -0400, Jeff Layton wrote:
> > > > > > On Wed, 2023-08-30 at 15:42 -0400, Benjamin Coddington wrote:
> > > > > > > Again we have claimed regressions for walking a directory tree,
> > > > > > > this time
> > > > > > > with the "find" utility which always tries to optimize away asking
> > > > > > > for any
> > > > > > > attributes until it has a complete list of entries. This behavior
> > > > > > > makes
> > > > > > > the readdir plus heuristic do the wrong thing, which causes a storm
> > > > > > > of
> > > > > > > GETATTRs to determine each entry's type in order to continue the
> > > > > > > walk.
> > > > > > >
> > > > > > > For v4 add the type attribute to each READDIR request to include it
> > > > > > > no
> > > > > > > matter the heuristic. This allows a simple `find` command to
> > > > > > > proceed
> > > > > > > quickly through a directory tree.
> > > > > > >
> > > > > >
> > > > > > The important bit here is that with v4, we can fill out d_type even
> > > > > > when
> > > > > > "plus" is false, at little cost. The downside is that non-plus
> > > > > > READDIR
> > > > > > replies will now be a bit larger on the wire. I think it's a
> > > > > > worthwhile
> > > > > > tradeoff though.
> > > > >
> > > > > The reason why we never did it before is that for many servers, it
> > > > > forces them to go to the inode in order to retrieve the information.
> > > > >
> > > > > IOW: You might as well just do readdirplus.
> > > > >
> > > >
> > > > That makes total sense, given how this code has evolved.
> > > >
> > > > FWIW, the Linux NFS server already calls vfs_getattr for every dentry in
> > > > a v4 READDIR reply regardless of what the client requests. It has to in
> > > > order to detect junctions, so we're bringing in the inode no matter
> > > > what. Fetching the type is trivial, so I don't see this as costing
> > > > anything extra there.
> > > >
> > > > Mileage could vary on other servers with more synthetic filesystems, but
> > > > one would hope that most of them can also return the type cheaply.
> > >
> > > Do you have examples for such synthetic filesystems?
> > >
> >
> > Synthetic is probably the wrong distinction here, actually.
> >
> > If looking up the inode type info is expensive, then you'll feel it here
> > more with this change. That's true regardless of whether this is a
> > "normal" or "synthetic" fs.
> In case you are interested in an outsider's perspective...
> I recently patched the FreeBSD server so that it did not need to
> acquire a vnode to generate a Readdir reply if only the following
> attributes are requested and the entry is not a directory.
> (FreeBSD has a d_type field in its "struct dirent".)
> RDAttr_error, Mounted_on_FileID, FileID, Type
> --> Adding a requirement for Type to nordirplus would not
> have any negative effect on the FreeBSD server.
>
> This patch resulted in about a 5% improvement on Readdir RPC
> response time for Readdirs only asking for the above attributes,
> for some simple measurements I did using the FreeBSD client.
Very nice!
> I still need to acquire the vnode for directories, to check for
> server file system mount points. I do not know if what you
> refer to as "junctions" are directory specific?
>
The nfsref command looks like it only works on directories, but in the
kernel code, I don't see where it enforces that it be a directory. You
can have a file mountpoint in Linux, after all...
Chuck (cc'ed) would know for sure... ;)
> >
> > I wouldn't expect a big performance hit from the Linux NFS server given
> > that we'll almost certainly have that info in-core, but other servers
> > (ganesha? some commercial servers?) could take a hit here.
> > --
> > Jeff Layton <jlayton@kernel.org>
> >
--
Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] NFSv4: Always ask for type with READDIR
2023-08-31 21:33 ` Jeff Layton
@ 2023-09-01 16:03 ` Chuck Lever III
0 siblings, 0 replies; 13+ messages in thread
From: Chuck Lever III @ 2023-09-01 16:03 UTC (permalink / raw)
To: Jeff Layton; +Cc: Rick Macklem, Cedric Blancher, Linux NFS Mailing List
> On Aug 31, 2023, at 5:33 PM, Jeff Layton <jlayton@kernel.org> wrote:
>
> On Thu, 2023-08-31 at 13:08 -0700, Rick Macklem wrote:
>> On Thu, Aug 31, 2023 at 11:53 AM Jeff Layton <jlayton@kernel.org> wrote:
>>>
>>> On Thu, 2023-08-31 at 20:41 +0200, Cedric Blancher wrote:
>>>> On Thu, 31 Aug 2023 at 02:17, Jeff Layton <jlayton@kernel.org> wrote:
>>>>>
>>>>> On Wed, 2023-08-30 at 20:20 +0000, Trond Myklebust wrote:
>>>>>> On Wed, 2023-08-30 at 16:10 -0400, Jeff Layton wrote:
>>>>>>> On Wed, 2023-08-30 at 15:42 -0400, Benjamin Coddington wrote:
>>>>>>>> Again we have claimed regressions for walking a directory tree,
>>>>>>>> this time
>>>>>>>> with the "find" utility which always tries to optimize away asking
>>>>>>>> for any
>>>>>>>> attributes until it has a complete list of entries. This behavior
>>>>>>>> makes
>>>>>>>> the readdir plus heuristic do the wrong thing, which causes a storm
>>>>>>>> of
>>>>>>>> GETATTRs to determine each entry's type in order to continue the
>>>>>>>> walk.
>>>>>>>>
>>>>>>>> For v4 add the type attribute to each READDIR request to include it
>>>>>>>> no
>>>>>>>> matter the heuristic. This allows a simple `find` command to
>>>>>>>> proceed
>>>>>>>> quickly through a directory tree.
>>>>>>>>
>>>>>>>
>>>>>>> The important bit here is that with v4, we can fill out d_type even
>>>>>>> when
>>>>>>> "plus" is false, at little cost. The downside is that non-plus
>>>>>>> READDIR
>>>>>>> replies will now be a bit larger on the wire. I think it's a
>>>>>>> worthwhile
>>>>>>> tradeoff though.
>>>>>>
>>>>>> The reason why we never did it before is that for many servers, it
>>>>>> forces them to go to the inode in order to retrieve the information.
>>>>>>
>>>>>> IOW: You might as well just do readdirplus.
>>>>>>
>>>>>
>>>>> That makes total sense, given how this code has evolved.
>>>>>
>>>>> FWIW, the Linux NFS server already calls vfs_getattr for every dentry in
>>>>> a v4 READDIR reply regardless of what the client requests. It has to in
>>>>> order to detect junctions, so we're bringing in the inode no matter
>>>>> what. Fetching the type is trivial, so I don't see this as costing
>>>>> anything extra there.
>>>>>
>>>>> Mileage could vary on other servers with more synthetic filesystems, but
>>>>> one would hope that most of them can also return the type cheaply.
>>>>
>>>> Do you have examples for such synthetic filesystems?
>>>>
>>>
>>> Synthetic is probably the wrong distinction here, actually.
>>>
>>> If looking up the inode type info is expensive, then you'll feel it here
>>> more with this change. That's true regardless of whether this is a
>>> "normal" or "synthetic" fs.
>> In case you are interested in an outsider's perspective...
>> I recently patched the FreeBSD server so that it did not need to
>> acquire a vnode to generate a Readdir reply if only the following
>> attributes are requested and the entry is not a directory.
>> (FreeBSD has a d_type field in its "struct dirent".)
>> RDAttr_error, Mounted_on_FileID, FileID, Type
>> --> Adding a requirement for Type to nordirplus would not
>> have any negative effect on the FreeBSD server.
>>
>> This patch resulted in about a 5% improvement on Readdir RPC
>> response time for Readdirs only asking for the above attributes,
>> for some simple measurements I did using the FreeBSD client.
>
>
> Very nice!
>
>> I still need to acquire the vnode for directories, to check for
>> server file system mount points. I do not know whether what you
>> refer to as "junctions" is directory-specific.
>>
>
> The nfsref command looks like it only works on directories, but in the
> kernel code, I don't see where it enforces that it be a directory. You
> can have a file mountpoint in Linux, after all...
>
> Chuck (cc'ed) would know for sure... ;)
I did the junction work a decade ago; it has all leaked out of my head.
Junctions are marked with a special combination of mode bits. I'm not
sure there's any constraint on what type of file can be changed into
a junction, but we've only tested with directories.
--
Chuck Lever
* Re: [PATCH v2] NFSv4: Always ask for type with READDIR
2023-08-30 19:42 [PATCH v2] NFSv4: Always ask for type with READDIR Benjamin Coddington
2023-08-30 20:10 ` Jeff Layton
@ 2023-09-07 12:43 ` Benjamin Coddington
1 sibling, 0 replies; 13+ messages in thread
From: Benjamin Coddington @ 2023-09-07 12:43 UTC (permalink / raw)
To: trond.myklebust, anna; +Cc: linux-nfs, jlayton
Hello Trond and Anna - two questions:
Any chance of this going in this cycle, on the grounds that its simplicity outweighs its lateness?
If not, can we expect it in 6.7, or should I continue to look for another approach that doesn't potentially penalize some servers?
Ben
On 30 Aug 2023, at 15:42, Benjamin Coddington wrote:
> Again we have claimed regressions for walking a directory tree, this time
> with the "find" utility which always tries to optimize away asking for any
> attributes until it has a complete list of entries. This behavior makes
> the readdir plus heuristic do the wrong thing, which causes a storm of
> GETATTRs to determine each entry's type in order to continue the walk.
>
> For v4 add the type attribute to each READDIR request to include it no
> matter the heuristic. This allows a simple `find` command to proceed
> quickly through a directory tree.
>
> Suggested-by: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
>
> --
> On v2: Don't add the type attribute twice
> ---
> fs/nfs/nfs4xdr.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index deec76cf5afe..7200d6f7cd7b 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -1602,7 +1602,7 @@ static void encode_read(struct xdr_stream *xdr, const struct nfs_pgio_args *args
> static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg *readdir, struct rpc_rqst *req, struct compound_hdr *hdr)
> {
> uint32_t attrs[3] = {
> - FATTR4_WORD0_RDATTR_ERROR,
> + FATTR4_WORD0_TYPE|FATTR4_WORD0_RDATTR_ERROR,
> FATTR4_WORD1_MOUNTED_ON_FILEID,
> };
> uint32_t dircount = readdir->count;
> @@ -1612,7 +1612,7 @@ static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg
> unsigned int i;
>
> if (readdir->plus) {
> - attrs[0] |= FATTR4_WORD0_TYPE|FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE|
> + attrs[0] |= FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE|
> FATTR4_WORD0_FSID|FATTR4_WORD0_FILEHANDLE|FATTR4_WORD0_FILEID;
> attrs[1] |= FATTR4_WORD1_MODE|FATTR4_WORD1_NUMLINKS|FATTR4_WORD1_OWNER|
> FATTR4_WORD1_OWNER_GROUP|FATTR4_WORD1_RAWDEV|
> --
> 2.
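
The GETATTR storm described in the quoted commit message can be illustrated from the userspace side. The sketch below is not taken from find itself; it only shows the general pattern a directory walker is likely to follow: trust d_type when the filesystem filled it in, and fall back to lstat() (which on NFS turns into a GETATTR) only when d_type is DT_UNKNOWN. The function names entry_is_dir and count_subdirs are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for d_type and DT_* on glibc */
#endif
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Returns 1 if the entry is a directory, 0 if it is not, -1 on error.
 * *used_stat is set to 1 when the lstat() fallback was needed.
 */
int entry_is_dir(const char *dirpath, const struct dirent *de, int *used_stat)
{
	char path[4096];
	struct stat st;
	int n;

	*used_stat = 0;
	if (de->d_type != DT_UNKNOWN)
		return de->d_type == DT_DIR;	/* cheap path: no extra call */

	*used_stat = 1;				/* expensive path: one stat per entry */
	n = snprintf(path, sizeof(path), "%s/%s", dirpath, de->d_name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (lstat(path, &st) != 0)
		return -1;
	return S_ISDIR(st.st_mode) ? 1 : 0;
}

/* Counts subdirectories of dirpath; *stats counts the lstat() fallbacks. */
int count_subdirs(const char *dirpath, unsigned long *stats)
{
	DIR *d = opendir(dirpath);
	struct dirent *de;
	int count = 0, used;

	*stats = 0;
	if (!d)
		return -1;
	while ((de = readdir(d)) != NULL) {
		if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
			continue;
		if (entry_is_dir(dirpath, de, &used) == 1)
			count++;
		*stats += used;
	}
	closedir(d);
	return count;
}
```

With the type attribute always present in the READDIR reply, a walker of this shape should stay on the cheap path and the fallback counter should remain at zero for the directory being listed.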
* [PATCH v2] NFSv4: Always ask for type with READDIR
@ 2023-12-06 13:10 Benjamin Coddington
0 siblings, 0 replies; 13+ messages in thread
From: Benjamin Coddington @ 2023-12-06 13:10 UTC (permalink / raw)
To: trond.myklebust, anna; +Cc: linux-nfs, Christoph Hellwig
Again we have claimed regressions for walking a directory tree, this time
with the "find" utility which always tries to optimize away asking for any
attributes until it has a complete list of entries. This behavior makes
the readdir plus heuristic do the wrong thing, which causes a storm of
GETATTRs to determine each entry's type in order to continue the walk.
For v4 add the type attribute to each READDIR request to include it no
matter the heuristic. This allows a simple `find` command to proceed
quickly through a directory tree.
Suggested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/nfs/nfs4xdr.c | 23 ++++++++++++++++-------
1 file changed, 16 insertions(+), 7 deletions(-)
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index deec76cf5afe..69406e60f391 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -1602,7 +1602,8 @@ static void encode_read(struct xdr_stream *xdr, const struct nfs_pgio_args *args
static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg *readdir, struct rpc_rqst *req, struct compound_hdr *hdr)
{
uint32_t attrs[3] = {
- FATTR4_WORD0_RDATTR_ERROR,
+ FATTR4_WORD0_TYPE
+ | FATTR4_WORD0_RDATTR_ERROR,
FATTR4_WORD1_MOUNTED_ON_FILEID,
};
uint32_t dircount = readdir->count;
@@ -1612,12 +1613,20 @@ static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg
unsigned int i;
if (readdir->plus) {
- attrs[0] |= FATTR4_WORD0_TYPE|FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE|
- FATTR4_WORD0_FSID|FATTR4_WORD0_FILEHANDLE|FATTR4_WORD0_FILEID;
- attrs[1] |= FATTR4_WORD1_MODE|FATTR4_WORD1_NUMLINKS|FATTR4_WORD1_OWNER|
- FATTR4_WORD1_OWNER_GROUP|FATTR4_WORD1_RAWDEV|
- FATTR4_WORD1_SPACE_USED|FATTR4_WORD1_TIME_ACCESS|
- FATTR4_WORD1_TIME_METADATA|FATTR4_WORD1_TIME_MODIFY;
+ attrs[0] |= FATTR4_WORD0_CHANGE
+ | FATTR4_WORD0_SIZE
+ | FATTR4_WORD0_FSID
+ | FATTR4_WORD0_FILEHANDLE
+ | FATTR4_WORD0_FILEID;
+ attrs[1] |= FATTR4_WORD1_MODE
+ | FATTR4_WORD1_NUMLINKS
+ | FATTR4_WORD1_OWNER
+ | FATTR4_WORD1_OWNER_GROUP
+ | FATTR4_WORD1_RAWDEV
+ | FATTR4_WORD1_SPACE_USED
+ | FATTR4_WORD1_TIME_ACCESS
+ | FATTR4_WORD1_TIME_METADATA
+ | FATTR4_WORD1_TIME_MODIFY;
attrs[2] |= FATTR4_WORD2_SECURITY_LABEL;
}
/* Use mounted_on_fileid only if the server supports it */
--
2.43.0
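
As a reading aid for the patch above, the following standalone sketch mimics how the requested-attribute bitmap ends up looking with and without "plus". It is not kernel code: the ATTR_* names and the helper functions are stand-ins, the attribute numbers are the ones assigned in RFC 7530, and only the word 0 attributes of the "plus" case are modelled (the word 1 and word 2 additions are omitted for brevity).

```c
#include <stdint.h>

/* NFSv4 attribute numbers per RFC 7530; the names are stand-ins. */
enum {
	ATTR_TYPE = 1,
	ATTR_CHANGE = 3,
	ATTR_SIZE = 4,
	ATTR_FSID = 8,
	ATTR_RDATTR_ERROR = 11,
	ATTR_FILEHANDLE = 19,
	ATTR_FILEID = 20,
	ATTR_MOUNTED_ON_FILEID = 55,
};

/* Set the bit for attribute number attr in a three-word bitmap. */
void attr_set(uint32_t bm[3], unsigned int attr)
{
	bm[attr / 32] |= UINT32_C(1) << (attr % 32);
}

/* Build the bitmap the way the patched encode_readdir() is meant to. */
void build_readdir_bitmap(uint32_t bm[3], int plus)
{
	bm[0] = bm[1] = bm[2] = 0;

	/* Always requested, regardless of the readdirplus heuristic. */
	attr_set(bm, ATTR_TYPE);
	attr_set(bm, ATTR_RDATTR_ERROR);
	attr_set(bm, ATTR_MOUNTED_ON_FILEID);

	if (plus) {
		/* Word 0 part of the "plus" set; TYPE is no longer added here. */
		attr_set(bm, ATTR_CHANGE);
		attr_set(bm, ATTR_SIZE);
		attr_set(bm, ATTR_FSID);
		attr_set(bm, ATTR_FILEHANDLE);
		attr_set(bm, ATTR_FILEID);
	}
}
```

The point of the sketch is that the type bit is part of the base set, so a non-plus request differs from the old behaviour by exactly one bit in word 0.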
Thread overview: 13+ messages
2023-08-30 19:42 [PATCH v2] NFSv4: Always ask for type with READDIR Benjamin Coddington
2023-08-30 20:10 ` Jeff Layton
2023-08-30 20:20 ` Trond Myklebust
2023-08-30 21:14 ` Jeff Layton
2023-08-31 15:17 ` Benjamin Coddington
2023-08-31 15:24 ` Jeff Layton
2023-08-31 18:41 ` Cedric Blancher
2023-08-31 18:53 ` Jeff Layton
2023-08-31 20:08 ` Rick Macklem
2023-08-31 21:33 ` Jeff Layton
2023-09-01 16:03 ` Chuck Lever III
2023-09-07 12:43 ` Benjamin Coddington
-- strict thread matches above, loose matches on Subject: below --
2023-12-06 13:10 Benjamin Coddington