From: Trond Myklebust <trondmy@hammerspace.com>
To: "Nagendra.Tomar@microsoft.com" <Nagendra.Tomar@microsoft.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [EXTERNAL] [PATCH] NFS: Fix handling of cookie verifier in uncached_readdir()
Date: Wed, 17 Mar 2021 12:39:49 +0000 [thread overview]
Message-ID: <e9531fe55c7ccdc14ff7b9af3fe933d522c05e03.camel@hammerspace.com> (raw)
In-Reply-To: <KU1P153MB019740C5B86FFA134C7458DE9E6A9@KU1P153MB0197.APCP153.PROD.OUTLOOK.COM>
On Wed, 2021-03-17 at 10:39 +0000, Nagendra Tomar wrote:
> Ignore this, we do need the inode cookieverifier for the cached
> cookies in the absence
> of any open dir handle. I was clearly overthinking!
>
Right. When we're caching cookies, then we should also cache the
verifier that is needed to use them. That's why we cache in both the
inode and in the directory context.
> I've a question on your patch:
> If the server changes the cookieverifier for a directory, shouldn't
> we zap the
> cached directory pages since the cached cookies do not correspond to
> this new cookie
> verifier for this directory? Or, do we depend on the server to return
> BADCOOKIE when we
> present a bad cookieverifier+cookie combo, and then we invalidate. I
> guess that's fine too.
>
When the server changes the verifier it will start replying to requests
that use the older verifier with NFS4ERR_NOT_SAME (NFSv4), which we
convert into ENOTSYNC, or NFS3ERR_BAD_COOKIE (NFSv3), which we convert
into EBADCOOKIE.
Both errors are handled in the generic readdir code.
> Also my earlier comment about find_and_lock_cache_page() accessing
> nfsi->cookieverf
> w/o locking the inode. This one is not introduced by your current
> patch.
Yes. I'm aware of that issue. We can just add a check for desc-
>page_index == 0. Nothing should be changing the verifier unless the
cache is empty.
>
> Thanks,
> Tomar
>
> -----Original Message-----
> From: Nagendra Tomar
> Sent: 17 March 2021 14:59
> To: trondmy@kernel.org
> Cc: linux-nfs@vger.kernel.org
> Subject: RE: [EXTERNAL] [PATCH] NFS: Fix handling of cookie verifier
> in uncached_readdir()
>
> Hi Trond,
> I think it'll be better to remove cookieverifier from nfs_inode,
> as truly speaking
> it doesn't belong there. A cookie verifier (along with the cookie) is
> part of a readdir
> context, and context is per open dir instance and not per file.
> Ideally we should have
> the following sequence
>
> 1. Application opens directory, nfs_open_dir_context->verf is set to
> 0.
> nfs_open_dir_context->verf is used to populate
> nfs_readdir_descriptor->verf, which is our cursor.
> 2. The first readdir call using the above newly opened handle carries
> the 0 cookie verifier (with cookie==0).
> 3. A cookie verifier received as a response of #2 is saved in
> nfs_readdir_descriptor->verf, which
> is later saved in nfs_open_dir_context->verf. Thus subsequent
> readdir calls carry the cookie
> verifier (and cookie) returned by the server.
>
> What do you think about this patch?
> It includes the patch that I sent y'day so only this should be
> enough.
>
> PS: Currently find_and_lock_cache_page() seems to be accessing
> nfs_inode->cookieverifier
> w/o locking the inode.
>
> Signed-off-by: Nagendra S Tomar <natomar@xxxxxxxxxxxxx>
> ---
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index fc4f490f2d78..a52917800e37 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -76,6 +76,7 @@ static struct nfs_open_dir_context
> *alloc_nfs_open_dir_context(struct inode *dir
> if (ctx != NULL) {
> ctx->duped = 0;
> ctx->attr_gencount = nfsi->attr_gencount;
> + memset(ctx->verf, 0, sizeof(ctx->verf));
> ctx->dir_cookie = 0;
> ctx->dup_cookie = 0;
> spin_lock(&dir->i_lock);
> @@ -820,7 +821,6 @@ static struct page
> **nfs_readdir_alloc_pages(size_t npages)
> }
>
> static int nfs_readdir_xdr_to_array(struct nfs_readdir_descriptor
> *desc,
> - __be32 *verf_arg, __be32
> *verf_res,
> struct page **arrays, size_t
> narrays)
> {
> struct page **pages;
> @@ -829,6 +829,7 @@ static int nfs_readdir_xdr_to_array(struct
> nfs_readdir_descriptor *desc,
> size_t array_size;
> struct inode *inode = file_inode(desc->file);
> size_t dtsize = NFS_SERVER(inode)->dtsize;
> + __be32 verf[NFS_DIR_VERIFIER_SIZE];
> int status = -ENOMEM;
>
> entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> @@ -854,9 +855,9 @@ static int nfs_readdir_xdr_to_array(struct
> nfs_readdir_descriptor *desc,
>
> do {
> unsigned int pglen;
> - status = nfs_readdir_xdr_filler(desc, verf_arg,
> entry->cookie,
> + status = nfs_readdir_xdr_filler(desc, desc->verf,
> entry->cookie,
> pages, dtsize,
> - verf_res);
> + verf);
> if (status < 0)
> break;
>
> @@ -866,6 +867,8 @@ static int nfs_readdir_xdr_to_array(struct
> nfs_readdir_descriptor *desc,
> break;
> }
>
> + memcpy(desc->verf, verf, sizeof(desc->verf));
> +
> status = nfs_readdir_page_filler(desc, entry, pages,
> pglen,
> arrays, narrays);
> } while (!status && nfs_readdir_page_needs_filling(page));
> @@ -909,15 +912,13 @@ static int find_and_lock_cache_page(struct
> nfs_readdir_descriptor *desc)
> {
> struct inode *inode = file_inode(desc->file);
> struct nfs_inode *nfsi = NFS_I(inode);
> - __be32 verf[NFS_DIR_VERIFIER_SIZE];
> int res;
>
> desc->page = nfs_readdir_page_get_cached(desc);
> if (!desc->page)
> return -ENOMEM;
> if (nfs_readdir_page_needs_filling(desc->page)) {
> - res = nfs_readdir_xdr_to_array(desc, nfsi-
> >cookieverf, verf,
> - &desc->page, 1);
> + res = nfs_readdir_xdr_to_array(desc, &desc->page, 1);
> if (res < 0) {
> nfs_readdir_page_unlock_and_put_cached(desc);
> if (res == -EBADCOOKIE || res == -ENOTSYNC) {
> @@ -927,7 +928,6 @@ static int find_and_lock_cache_page(struct
> nfs_readdir_descriptor *desc)
> }
> return res;
> }
> - memcpy(nfsi->cookieverf, verf, sizeof(nfsi-
> >cookieverf));
> }
> res = nfs_readdir_search_array(desc);
> if (res == 0) {
> @@ -977,7 +977,6 @@ static int readdir_search_pagecache(struct
> nfs_readdir_descriptor *desc)
> static void nfs_do_filldir(struct nfs_readdir_descriptor *desc)
> {
> struct file *file = desc->file;
> - struct nfs_inode *nfsi = NFS_I(file_inode(file));
> struct nfs_cache_array *array;
> unsigned int i = 0;
>
> @@ -991,7 +990,6 @@ static void nfs_do_filldir(struct
> nfs_readdir_descriptor *desc)
> desc->eof = true;
> break;
> }
> - memcpy(desc->verf, nfsi->cookieverf, sizeof(desc-
> >verf));
> if (i < (array->size-1))
> desc->dir_cookie = array->array[i+1].cookie;
> else
> @@ -1027,7 +1025,6 @@ static int uncached_readdir(struct
> nfs_readdir_descriptor *desc)
> {
> struct page **arrays;
> size_t i, sz = 512;
> - __be32 verf[NFS_DIR_VERIFIER_SIZE];
> int status = -ENOMEM;
>
> dfprintk(DIRCACHE, "NFS: uncached_readdir() searching for
> cookie %llu\n",
> @@ -1044,7 +1041,7 @@ static int uncached_readdir(struct
> nfs_readdir_descriptor *desc)
> desc->last_cookie = desc->dir_cookie;
> desc->duped = 0;
>
> - status = nfs_readdir_xdr_to_array(desc, desc->verf, verf,
> arrays, sz);
> + status = nfs_readdir_xdr_to_array(desc, arrays, sz);
>
> for (i = 0; !desc->eof && i < sz && arrays[i]; i++) {
> desc->page = arrays[i];
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index a7fb076a5f44..ab3e666ed097 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -522,7 +522,6 @@ nfs_fhget(struct super_block *sb, struct nfs_fh
> *fh, struct nfs_fattr *fattr, st
> inode->i_uid = make_kuid(&init_user_ns, -2);
> inode->i_gid = make_kgid(&init_user_ns, -2);
> inode->i_blocks = 0;
> - memset(nfsi->cookieverf, 0, sizeof(nfsi-
> >cookieverf));
> nfsi->write_io = 0;
> nfsi->read_io = 0;
>
> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
> index eadaabd18dc7..752de9e43e36 100644
> --- a/include/linux/nfs_fs.h
> +++ b/include/linux/nfs_fs.h
> @@ -158,12 +158,6 @@ struct nfs_inode {
> struct list_head access_cache_entry_lru;
> struct list_head access_cache_inode_lru;
>
> - /*
> - * This is the cookie verifier used for NFSv3 readdir
> - * operations
> - */
> - __be32 cookieverf[NFS_DIR_VERIFIER_SIZE];
> -
> atomic_long_t nrequests;
> struct nfs_mds_commit_info commit_info;
>
>
> -----Original Message-----
> From: trondmy@kernel.org <trondmy@kernel.org>
> Sent: 16 March 2021 19:06
> To: Nagendra Tomar <Nagendra.Tomar@microsoft.com>
> Cc: linux-nfs@vger.kernel.org
> Subject: [EXTERNAL] [PATCH] NFS: Fix handling of cookie verifier in
> uncached_readdir()
>
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>
> If we're doing uncached readdir(), then the readdir cookie could be
> different from the one cached in the nfs_inode. We should therefore
> ensure that we save that one in the struct nfs_open_dir_context.
>
> Fixes: 35df59d3ef69 ("NFS: Reduce number of RPC calls when doing
> uncached readdir")
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> fs/nfs/dir.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 08a1e2e31d0b..2cf2a7d92faf 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -976,10 +976,10 @@ static int readdir_search_pagecache(struct
> nfs_readdir_descriptor *desc)
> /*
> * Once we've found the start of the dirent within a page: fill 'er
> up...
> */
> -static void nfs_do_filldir(struct nfs_readdir_descriptor *desc)
> +static void nfs_do_filldir(struct nfs_readdir_descriptor *desc,
> + const __be32 *verf)
> {
> struct file *file = desc->file;
> - struct nfs_inode *nfsi = NFS_I(file_inode(file));
> struct nfs_cache_array *array;
> unsigned int i = 0;
>
> @@ -993,7 +993,7 @@ static void nfs_do_filldir(struct
> nfs_readdir_descriptor *desc)
> desc->eof = true;
> break;
> }
> - memcpy(desc->verf, nfsi->cookieverf, sizeof(desc-
> >verf));
> + memcpy(desc->verf, verf, sizeof(desc->verf));
> if (i < (array->size-1))
> desc->dir_cookie = array->array[i+1].cookie;
> else
> @@ -1050,7 +1050,7 @@ static int uncached_readdir(struct
> nfs_readdir_descriptor *desc)
>
> for (i = 0; !desc->eof && i < sz && arrays[i]; i++) {
> desc->page = arrays[i];
> - nfs_do_filldir(desc);
> + nfs_do_filldir(desc, verf);
> }
> desc->page = NULL;
>
> @@ -1071,6 +1071,7 @@ static int nfs_readdir(struct file *file,
> struct dir_context *ctx)
> {
> struct dentry *dentry = file_dentry(file);
> struct inode *inode = d_inode(dentry);
> + struct nfs_inode *nfsi = NFS_I(inode);
> struct nfs_open_dir_context *dir_ctx = file->private_data;
> struct nfs_readdir_descriptor *desc;
> int res;
> @@ -1124,7 +1125,7 @@ static int nfs_readdir(struct file *file,
> struct dir_context *ctx)
> break;
> }
> if (res == -ETOOSMALL && desc->plus) {
> - clear_bit(NFS_INO_ADVISE_RDPLUS,
> &NFS_I(inode)->flags);
> + clear_bit(NFS_INO_ADVISE_RDPLUS, &nfsi-
> >flags);
> nfs_zap_caches(inode);
> desc->page_index = 0;
> desc->plus = false;
> @@ -1134,7 +1135,7 @@ static int nfs_readdir(struct file *file,
> struct dir_context *ctx)
> if (res < 0)
> break;
>
> - nfs_do_filldir(desc);
> + nfs_do_filldir(desc, nfsi->cookieverf);
> nfs_readdir_page_unlock_and_put_cached(desc);
> } while (!desc->eof);
>
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
prev parent reply other threads:[~2021-03-17 12:40 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-16 13:36 [PATCH] NFS: Fix handling of cookie verifier in uncached_readdir() trondmy
2021-03-17 9:29 ` [EXTERNAL] " Nagendra Tomar
2021-03-17 10:39 ` Nagendra Tomar
2021-03-17 12:39 ` Trond Myklebust [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e9531fe55c7ccdc14ff7b9af3fe933d522c05e03.camel@hammerspace.com \
--to=trondmy@hammerspace.com \
--cc=Nagendra.Tomar@microsoft.com \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).