From: Vivek Goyal <vgoyal@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: virtio-fs@redhat.com
Subject: Re: [Virtio-fs] [PATCH 4/4] virtiofsd: Drop CAP_FSETID if client asked for it
Date: Wed, 14 Aug 2019 08:43:39 -0400 [thread overview]
Message-ID: <20190814124339.GA32294@redhat.com> (raw)
In-Reply-To: <20190814095216.GD2920@work-vm>
On Wed, Aug 14, 2019 at 10:52:16AM +0100, Dr. David Alan Gilbert wrote:
> * Vivek Goyal (vgoyal@redhat.com) wrote:
> > If client requested killing setuid/setgid bits on file being written, drop
> > CAP_FSETID capability so that setuid/setgid bits are cleared upon write
> > automatically.
> >
> > pjdfstest chown/12.t needs this.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
>
> Some high level questions:
> a) This is going to be *really* expensive isn't it - that's a lot of
> syscalls for each write
>
> b) and isn't this the normal case, i.e. a non-root guest writing to a
> normal file?
Hi David,
FUSE_WRITE_KILL_PRIV flag will not be set frequently. It is set only
if a write happens to a file which has setuid/setgid bit set. Very
few files in filesystem typically have this bit set and for those
also it will be set only for first write and not subsequent writes.
So I will not be worried about performance impact of it.
>
> c) I guess an unpriv virtiofsd still works OK because the cap is
> already gone.
>
> d) How is this thread safe? My reading is cap_set_proc is process
> wide; you'd have other writesin other threads doing non-kill-priv
> writes; also you'd have multiple threads doing (clear) write (set) -
> that feels like they'd interact badly.
man page of capabilities says that its a per thread property.
**************************
Starting with kernel 2.2, Linux divides the privileges traditionally
associated with superuser into distinct units, known as capabilities,
which can be independently enabled and disabled. Capabilities are a
per-thread attribute.
**************************
So dropping capability per thread and enabling it back should be fine.
>
> e) I think you could share some stuff between the drop and the
> restore; for example you only ever need to do cap_from_name once
> in the entire run. You can probably also do teh cap_get_proc when
> you drop it, keep that 'caps' around and then set it again.
Given its not a very frequent operation, I am not too worried about
it. It will just make code structure little more complicated.
But if you have strong opinion about it, I am open to change it.
Thanks
Vivek
>
>
> Dave
>
> > ---
> > contrib/virtiofsd/Makefile.objs | 2 +
> > contrib/virtiofsd/passthrough_ll.c | 127 +++++++++++++++++++++++++++++
> > contrib/virtiofsd/seccomp.c | 2 +
> > 3 files changed, 131 insertions(+)
> >
> > diff --git a/contrib/virtiofsd/Makefile.objs b/contrib/virtiofsd/Makefile.objs
> > index 941b19f18e..25f1e8dd73 100644
> > --- a/contrib/virtiofsd/Makefile.objs
> > +++ b/contrib/virtiofsd/Makefile.objs
> > @@ -11,3 +11,5 @@ virtiofsd-obj-y = buffer.o \
> >
> > seccomp.o-cflags := $(SECCOMP_CFLAGS)
> > seccomp.o-libs := $(SECCOMP_LIBS)
> > +
> > +passthrough_ll.o-libs += -lcap
> > diff --git a/contrib/virtiofsd/passthrough_ll.c b/contrib/virtiofsd/passthrough_ll.c
> > index 321bbb20be..412663653a 100644
> > --- a/contrib/virtiofsd/passthrough_ll.c
> > +++ b/contrib/virtiofsd/passthrough_ll.c
> > @@ -56,6 +56,7 @@
> > #include <sys/mman.h>
> > #include <sys/socket.h>
> > #include <sys/un.h>
> > +#include <sys/capability.h>
> >
> > #include "ireg.h"
> > #include <sys/mount.h>
> > @@ -230,6 +231,115 @@ static struct lo_data *lo_data(fuse_req_t req)
> > return (struct lo_data *) fuse_req_userdata(req);
> > }
> >
> > +/* Helpers for dropping and regaining effective capabilities. Returns 0
> > + * on success, error otherwise */
> > +static int drop_effective_cap(const char *cap_name, bool *cap_dropped)
> > +{
> > + cap_t caps;
> > + cap_value_t cap;
> > + cap_flag_value_t cap_value;
> > + int ret = 0;
> > +
> > + ret = cap_from_name(cap_name, &cap);
> > + if (ret == -1) {
> > + ret = errno;
> > + fuse_err("cap_from_name(%s) failed:%s\n", cap_name,
> > + strerror(errno));
> > + goto out;
> > + }
> > +
> > + if (!CAP_IS_SUPPORTED(cap)) {
> > + fuse_err("capability(%s) is not supported\n", cap_name);
> > + return EINVAL;
> > + }
> > +
> > + caps = cap_get_proc();
> > + if (caps == NULL) {
> > + ret = errno;
> > + fuse_err("cap_get_proc() failed\n");
> > + goto out;
> > + }
> > +
> > + if (cap_get_flag(caps, cap, CAP_EFFECTIVE, &cap_value) == -1) {
> > + ret = errno;
> > + fuse_err("cap_get_flag() failed\n");
> > + goto out_cap_free;
> > + }
> > +
> > + /* We dont have this capability in effective set already. */
> > + if (cap_value == CAP_CLEAR) {
> > + ret = 0;
> > + goto out_cap_free;
> > + }
> > +
> > + if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap, CAP_CLEAR) == -1) {
> > + ret = errno;
> > + fuse_err("cap_set_flag() failed\n");
> > + goto out_cap_free;
> > + }
> > +
> > + if (cap_set_proc(caps) == -1) {
> > + ret = errno;
> > + fuse_err("cap_set_procs() failed\n");
> > + goto out_cap_free;
> > + }
> > +
> > + ret = 0;
> > + if (cap_dropped)
> > + *cap_dropped = true;
> > +
> > +out_cap_free:
> > + cap_free(caps);
> > +out:
> > + return ret;
> > +}
> > +
> > +static int gain_effective_cap(const char *cap_name)
> > +{
> > + cap_t caps;
> > + cap_value_t cap;
> > + int ret = 0;
> > +
> > + ret = cap_from_name(cap_name, &cap);
> > + if (ret == -1) {
> > + ret = errno;
> > + fuse_err("cap_from_name(%s) failed:%s\n", cap_name,
> > + strerror(errno));
> > + goto out;
> > + }
> > +
> > + if (!CAP_IS_SUPPORTED(cap)) {
> > + fuse_err("capability(%s) is not supported\n", cap_name);
> > + return EINVAL;
> > + }
> > +
> > + caps = cap_get_proc();
> > + if (caps == NULL) {
> > + ret = errno;
> > + fuse_err("cap_get_proc() failed\n");
> > + goto out;
> > + }
> > +
> > +
> > + if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap, CAP_SET) == -1) {
> > + ret = errno;
> > + fuse_err("cap_set_flag() failed\n");
> > + goto out_cap_free;
> > + }
> > +
> > + if (cap_set_proc(caps) == -1) {
> > + ret = errno;
> > + fuse_err("cap_set_procs() failed\n");
> > + goto out_cap_free;
> > + }
> > + ret = 0;
> > +
> > +out_cap_free:
> > + cap_free(caps);
> > +out:
> > + return ret;
> > +}
> > +
> > static void lo_map_init(struct lo_map *map)
> > {
> > map->elems = NULL;
> > @@ -2014,6 +2124,7 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
> > ssize_t res;
> > struct fuse_bufvec out_buf = FUSE_BUFVEC_INIT(fuse_buf_size(in_buf));
> > struct lo_data *lo = lo_data(req);
> > + bool cap_fsetid_dropped = false;
> >
> > out_buf.buf[0].flags = FUSE_BUF_IS_FD | FUSE_BUF_FD_SEEK;
> > out_buf.buf[0].fd = lo_fi_fd(req, fi);
> > @@ -2023,6 +2134,16 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
> > fuse_debug("lo_write_buf(ino=%" PRIu64 ", size=%zd, off=%lu)\n",
> > ino, out_buf.buf[0].size, (unsigned long) off);
> >
> > + /*
> > + * If kill_priv is set, drop CAP_FSETID which should lead to kernel
> > + * clearing setuid/setgid on file.
> > + */
> > + if (fi->kill_priv) {
> > + res = drop_effective_cap("cap_fsetid", &cap_fsetid_dropped);
> > + if (res != 0)
> > + fuse_reply_err(req, res);
> > + }
> > +
> > res = fuse_buf_copy(req, &out_buf, in_buf, 0);
> > if(res < 0) {
> > fuse_reply_err(req, -res);
> > @@ -2037,6 +2158,12 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
> >
> > fuse_reply_write(req, (size_t) res);
> > }
> > +
> > + if (cap_fsetid_dropped) {
> > + res = gain_effective_cap("cap_fsetid");
> > + if (res)
> > + fuse_err("Failed to gain CAP_FSETID\n");
> > + }
> > }
> >
> > static void lo_statfs(fuse_req_t req, fuse_ino_t ino)
> > diff --git a/contrib/virtiofsd/seccomp.c b/contrib/virtiofsd/seccomp.c
> > index 5f1c873b82..7384ebee0a 100644
> > --- a/contrib/virtiofsd/seccomp.c
> > +++ b/contrib/virtiofsd/seccomp.c
> > @@ -78,6 +78,8 @@ static const int syscall_whitelist[] = {
> > SCMP_SYS(unlinkat),
> > SCMP_SYS(utimensat),
> > SCMP_SYS(write),
> > + SCMP_SYS(capget),
> > + SCMP_SYS(capset),
> > };
> >
> > /* Syscalls used when --syslog is enabled */
> > --
> > 2.17.2
> >
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
next prev parent reply other threads:[~2019-08-14 12:43 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-13 19:29 [Virtio-fs] [PATCH 0/4] Drop CAP_FSETID if client needs to kill setuid/setgid bits Vivek Goyal
2019-08-13 19:29 ` [Virtio-fs] [PATCH 1/4] virtiofsd: Fix number of padding bits in fuse_file_info Vivek Goyal
2019-08-14 8:54 ` Dr. David Alan Gilbert
2019-08-21 11:25 ` Dr. David Alan Gilbert
2019-08-13 19:29 ` [Virtio-fs] [PATCH 2/4] virtiofsd: Use macros for write_flag parsing Vivek Goyal
2019-08-14 9:13 ` Dr. David Alan Gilbert
2019-08-13 19:29 ` [Virtio-fs] [PATCH 3/4] virtiofsd: Parse flag FUSE_WRITE_KILL_PRIV Vivek Goyal
2019-08-14 9:28 ` Dr. David Alan Gilbert
2019-08-13 19:29 ` [Virtio-fs] [PATCH 4/4] virtiofsd: Drop CAP_FSETID if client asked for it Vivek Goyal
2019-08-14 9:52 ` Dr. David Alan Gilbert
2019-08-14 12:43 ` Vivek Goyal [this message]
2019-08-14 13:17 ` Dr. David Alan Gilbert
2019-08-14 9:55 ` [Virtio-fs] [PATCH 0/4] Drop CAP_FSETID if client needs to kill setuid/setgid bits Dr. David Alan Gilbert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190814124339.GA32294@redhat.com \
--to=vgoyal@redhat.com \
--cc=dgilbert@redhat.com \
--cc=virtio-fs@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.