From: Jeff Layton <jlayton@redhat.com>
To: "Yan, Zheng" <ukernel@gmail.com>
Cc: ceph-devel <ceph-devel@vger.kernel.org>,
	Ilya Dryomov <idryomov@gmail.com>, Zheng Yan <zyan@redhat.com>,
	Sage Weil <sage@redhat.com>
Subject: Re: [RFC PATCH 07/10] ceph: update cap message struct version to 9
Date: Mon, 07 Nov 2016 06:21:34 -0500
Message-ID: <1478517694.2386.13.camel@redhat.com>
In-Reply-To: <CAAM7YAnzECeDocm5Mf+5DwDzyMgC0HXrh3jC9RLbi17T76XXOw@mail.gmail.com>

On Mon, 2016-11-07 at 16:43 +0800, Yan, Zheng wrote:
> On Fri, Nov 4, 2016 at 8:57 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > 
> > On Fri, 2016-11-04 at 07:34 -0400, Jeff Layton wrote:
> > > 
> > > > The userland ceph has MClientCaps at struct version 9. This brings the
> > > > kernel up to the same version.
> > > 
> > > With this change, we have to start tracking the btime and change_attr,
> > > so that the client can pass back sane values in cap messages. The
> > > client doesn't care about the btime at all, so this is just passed
> > > around, but the change_attr is used when ceph is exported via NFS.
> > > 
> > > For now, the new "sync" parm is left at 0, to preserve the existing
> > > behavior of the client.
> > > 
> > > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > > ---
> > >  fs/ceph/caps.c | 33 +++++++++++++++++++++++++--------
> > >  1 file changed, 25 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> > > index 6e99866b1946..452f5024589f 100644
> > > --- a/fs/ceph/caps.c
> > > +++ b/fs/ceph/caps.c
> > > @@ -991,9 +991,9 @@ struct cap_msg_args {
> > >       struct ceph_mds_session *session;
> > >       u64                     ino, cid, follows;
> > >       u64                     flush_tid, oldest_flush_tid, size, max_size;
> > > -     u64                     xattr_version;
> > > +     u64                     xattr_version, change_attr;
> > >       struct ceph_buffer      *xattr_buf;
> > > -     struct timespec         atime, mtime, ctime;
> > > +     struct timespec         atime, mtime, ctime, btime;
> > >       int                     op, caps, wanted, dirty;
> > >       u32                     seq, issue_seq, mseq, time_warp_seq;
> > >       kuid_t                  uid;
> > > @@ -1026,13 +1026,13 @@ static int send_cap_msg(struct cap_msg_args *arg)
> > > 
> > >       /* flock buffer size + inline version + inline data size +
> > >        * osd_epoch_barrier + oldest_flush_tid */
> > > -     extra_len = 4 + 8 + 4 + 4 + 8;
> > > +     extra_len = 4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 1;
> > >       msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, sizeof(*fc) + extra_len,
> > >                          GFP_NOFS, false);
> > >       if (!msg)
> > >               return -ENOMEM;
> > > 
> > > -     msg->hdr.version = cpu_to_le16(6);
> > > +     msg->hdr.version = cpu_to_le16(9);
> > >       msg->hdr.tid = cpu_to_le64(arg->flush_tid);
> > > 
> > >       fc = msg->front.iov_base;
> > > @@ -1068,17 +1068,30 @@ static int send_cap_msg(struct cap_msg_args *arg)
> > >       }
> > > 
> > >       p = fc + 1;
> > > -     /* flock buffer size */
> > > +     /* flock buffer size (version 2) */
> > >       ceph_encode_32(&p, 0);
> > > -     /* inline version */
> > > +     /* inline version (version 4) */
> > >       ceph_encode_64(&p, arg->inline_data ? 0 : CEPH_INLINE_NONE);
> > >       /* inline data size */
> > >       ceph_encode_32(&p, 0);
> > > -     /* osd_epoch_barrier */
> > > +     /* osd_epoch_barrier (version 5) */
> > >       ceph_encode_32(&p, 0);
> > > -     /* oldest_flush_tid */
> > > +     /* oldest_flush_tid (version 6) */
> > >       ceph_encode_64(&p, arg->oldest_flush_tid);
> > > 
> > > +     /* caller_uid/caller_gid (version 7) */
> > > +     ceph_encode_32(&p, (u32)-1);
> > > +     ceph_encode_32(&p, (u32)-1);
> > 
> > A bit of self-review...
> > 
> > Not sure if we want to set the above to something else -- maybe 0 or to
> > current's creds? That may not always make sense though (during e.g.
> > writeback).
> > 

Looking further, I'm not quite sure I understand why we send creds at
all in cap messages. Can you clarify where that matters?

The way I look at it, caps are something like a more granular NFS
delegation or SMB oplock.

In that light, a cap flush is just the client sending updated attrs for
the exclusive caps that it has already been granted. Is there a
situation where we would ever want to refuse that update?

Note that nothing ever checks the return code for _do_cap_update in the
userland code. If the permissions check fails, then we'll end up
silently dropping the updated attrs on the floor.

> > > 
> > > +
> > > +     /* pool namespace (version 8) */
> > > +     ceph_encode_32(&p, 0);
> > > +
> > 
> > I'm a little unclear on how the above should be set, but I'll look over
> > the userland code and ape what it does.
> 
> The pool namespace is not needed in client->mds cap messages, so setting
> its length to 0 should be OK.
> 

Thanks. I went ahead and added a comment to that effect in the updated
set I'm testing now.
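
For anyone following along, the field sequence after the fixed header can
be checked with a small userspace sketch. This is not the kernel code:
put_u8(), put_le32(), put_le64(), encode_cap_tail(), CAP_EXTRA_LEN and
INLINE_NONE are hypothetical names made up for illustration, and
INLINE_NONE is assumed to mirror CEPH_INLINE_NONE. It also assumes the
wire timespec is two 32-bit fields (8 bytes), which is what the extra_len
arithmetic in the patch implies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's ceph_encode_*() helpers:
 * store a little-endian value at *p and advance the cursor. */
static void put_u8(uint8_t **p, uint8_t v)
{
	*(*p)++ = v;
}

static void put_le32(uint8_t **p, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		*(*p)++ = (uint8_t)(v >> (8 * i));
}

static void put_le64(uint8_t **p, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		*(*p)++ = (uint8_t)(v >> (8 * i));
}

/* 4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 1, as in the patch */
#define CAP_EXTRA_LEN 57

/* Assumption: mirrors CEPH_INLINE_NONE ("no inline data") */
#define INLINE_NONE UINT64_MAX

/* Encode the fields that follow the fixed cap header.
 * Returns the number of bytes written, or 0 if buf is too small. */
static size_t encode_cap_tail(uint8_t *buf, size_t buflen,
			      uint64_t oldest_flush_tid,
			      uint32_t btime_sec, uint32_t btime_nsec,
			      uint64_t change_attr)
{
	uint8_t *p = buf;

	if (buflen < CAP_EXTRA_LEN)
		return 0;

	put_le32(&p, 0);			/* flock buffer size (v2) */
	put_le64(&p, INLINE_NONE);		/* inline version (v4) */
	put_le32(&p, 0);			/* inline data size */
	put_le32(&p, 0);			/* osd_epoch_barrier (v5) */
	put_le64(&p, oldest_flush_tid);		/* oldest_flush_tid (v6) */
	put_le32(&p, (uint32_t)-1);		/* caller_uid (v7) */
	put_le32(&p, (uint32_t)-1);		/* caller_gid (v7) */
	put_le32(&p, 0);			/* pool namespace length (v8),
						 * unused for client->mds */
	put_le32(&p, btime_sec);		/* btime seconds (v9) */
	put_le32(&p, btime_nsec);		/* btime nanoseconds (v9) */
	put_le64(&p, change_attr);		/* change_attr (v9) */
	put_u8(&p, 0);				/* sync (v9), left at 0 for now */

	/* The cursor must land exactly at the end of the reserved space. */
	assert((size_t)(p - buf) == CAP_EXTRA_LEN);
	return (size_t)(p - buf);
}
```

Calling encode_cap_tail() on a 57-byte buffer should return 57, with the
sync byte landing at offset 56. Keeping the length constant next to the
encoder makes it harder for the two to drift apart when a new struct
version adds fields.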

> > 
> > 
> > > 
> > > +     /* btime, change_attr, sync (version 9) */
> > > +     ceph_encode_timespec(p, &arg->btime);
> > > +     p += sizeof(struct ceph_timespec);
> > > +     ceph_encode_64(&p, arg->change_attr);
> > > +     ceph_encode_8(&p, 0);
> > > +
> > >       ceph_con_send(&arg->session->s_con, msg);
> > >       return 0;
> > >  }
> > > @@ -1189,9 +1202,11 @@ static int __send_cap(struct ceph_mds_client *mdsc, struct ceph_cap *cap,
> > >               arg.xattr_buf = NULL;
> > >       }
> > > 
> > > +     arg.change_attr = inode->i_version;
> > >       arg.mtime = inode->i_mtime;
> > >       arg.atime = inode->i_atime;
> > >       arg.ctime = inode->i_ctime;
> > > +     arg.btime = ci->i_btime;
> > > 
> > >       arg.op = op;
> > >       arg.caps = cap->implemented;
> > > @@ -1241,10 +1256,12 @@ static inline int __send_flush_snap(struct inode *inode,
> > >       arg.max_size = 0;
> > >       arg.xattr_version = capsnap->xattr_version;
> > >       arg.xattr_buf = capsnap->xattr_blob;
> > > +     arg.change_attr = capsnap->change_attr;
> > > 
> > >       arg.atime = capsnap->atime;
> > >       arg.mtime = capsnap->mtime;
> > >       arg.ctime = capsnap->ctime;
> > > +     arg.btime = capsnap->btime;
> > > 
> > >       arg.op = CEPH_CAP_OP_FLUSHSNAP;
> > >       arg.caps = capsnap->issued;
> > 
> > --
> > Jeff Layton <jlayton@redhat.com>

-- 
Jeff Layton <jlayton@redhat.com>
