From: Jeff Layton <jlayton@kernel.org>
To: "Yan, Zheng" <ukernel@gmail.com>
Cc: ceph-devel <ceph-devel@vger.kernel.org>,
nfs-ganesha-devel@lists.sourceforge.net
Subject: Re: [nfs-ganesha]{PATCH] FSAL_CEPH: do an inode lookup vs. MDS when the Inode is not in cache
Date: Fri, 10 Nov 2017 06:24:01 -0500 [thread overview]
Message-ID: <1510313041.5377.5.camel@kernel.org> (raw)
In-Reply-To: <CAAM7YAkWd1xz=7y0GPBG-fN7P73KpMGNEMhRpADVkf93qBix8g@mail.gmail.com>
On Fri, 2017-11-10 at 10:31 +0800, Yan, Zheng wrote:
> On Fri, Nov 10, 2017 at 12:03 AM, Jeff Layton <jlayton@kernel.org> wrote:
> > From: Jeff Layton <jlayton@redhat.com>
> >
> > FSAL_CEPH just does a ceph_ll_get_inode when asked to create a
> > filehandle today, which only does a query of the local Ceph client
> > cache. This means, it can only find Inodes that were previously found
> > via name lookup.
> >
> > When ganesha is restarted, we have a blank-slate Client, and the Inode
> > will not be in cache. That ends up with the NFS client getting back an
> > ESTALE.
> >
> > We must be able to find Inodes that may not be in the cache. When
> > ceph_ll_get_inode can't find an Inode, try a lookup vs. the MDS for it.
> >
> > The only problem is that we currently have no way to look up a snapped
> > inode from scratch. I think we'll probably need to add that ability to
> > libcephfs, but for now the best we can do is just bail out with an
> > ESTALE in that case.
> >
> > While we're in there, remove the comment about ceph_ll_connectable_m.
> > There is no such function in libcephfs and never has been. The ds
> > code does seem to refer to such a call, but it's not clear to me
> > what it's supposed to do.
> >
> > Change-Id: I2d0e362ef8f28ba78575b60f3fb2890096d98fc6
> > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > ---
> >
> > I sent this up through git-review to ganesha's gerrit, but I'd
> > like some cephfs folks to have a look too. This patch is basically
> > a workaround for the FSAL_CEPH code for now.
> >
> > Do we need to worry about inodes that are not CEPH_NOSNAP? It
> > seems like we might need for libcephfs to grow an interface that
> > can look up snapped inodes as well?
> >
> > Either that, or teach ceph_ll_get_inode how to look up inodes
> > that are not in its cache. Would that be a better fix here?
> >
> > src/FSAL/FSAL_CEPH/export.c | 19 +++++++++++++++----
> > 1 file changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/FSAL/FSAL_CEPH/export.c b/src/FSAL/FSAL_CEPH/export.c
> > index 3de25fdfaf41..a31e3bb3ebef 100644
> > --- a/src/FSAL/FSAL_CEPH/export.c
> > +++ b/src/FSAL/FSAL_CEPH/export.c
> > @@ -234,12 +234,23 @@ static fsal_status_t create_handle(struct fsal_export *export_pub,
> > return status;
> > }
> >
> > + /* Check our local cache first */
> > i = ceph_ll_get_inode(export->cmount, *vi);
> > - if (!i)
> > - return ceph2fsal_error(-ESTALE);
> > + if (!i) {
> > + /*
> > + * Try the slow way, may not be in cache now.
> > + *
> > + * Currently, there is no interface for looking up a snapped
> > + * inode, so we just bail here in that case.
> > + */
> > + if (vi->snapid.val != CEPH_NOSNAP)
> > + return ceph2fsal_error(-ESTALE);
> > +
> > + rc = ceph_ll_lookup_inode(export->cmount, vi->ino, &i);
> > + if (rc)
> > + return ceph2fsal_error(rc);
> > + }
> >
> > - /* The ceph_ll_connectable_m should have populated libceph's
> > - cache with all this anyway */
> > rc = fsal_ceph_ll_getattr(export->cmount, i, &stx,
> > attrs_out ? CEPH_STATX_ATTR_MASK : CEPH_STATX_HANDLE_MASK,
> > op_ctx->creds);
> > --
> > 2.13.6
> >
>
> Looks good to me,
>
> Tested-by: "Yan, Zheng" <zyan@redhat.com>
>
Thanks. I still wonder whether this is a sufficient fix.
ceph_ll_lookup_inode does not take the snapid into account. Do we need a
version of ceph_ll_lookup_inode that can find snapshotted inodes after
ganesha has restarted?
It does seem like we could allow a nfs client do work in a snapshotted
directory, and then have ganesha restart. That client may be very
confused when the server comes back and all of the existing filehandles
go stale.
--
Jeff Layton <jlayton@kernel.org>
next prev parent reply other threads:[~2017-11-10 11:24 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-09 16:03 [nfs-ganesha]{PATCH] FSAL_CEPH: do an inode lookup vs. MDS when the Inode is not in cache Jeff Layton
2017-11-10 2:31 ` Yan, Zheng
2017-11-10 11:24 ` Jeff Layton [this message]
2017-11-10 13:39 ` Yan, Zheng
2017-11-10 13:56 ` Jeff Layton
2017-11-10 14:19 ` Sage Weil
2017-11-10 14:49 ` Jeff Layton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1510313041.5377.5.camel@kernel.org \
--to=jlayton@kernel.org \
--cc=ceph-devel@vger.kernel.org \
--cc=nfs-ganesha-devel@lists.sourceforge.net \
--cc=ukernel@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.