* [Virtio-fs] [PATCH 0/4] Drop CAP_FSETID if client needs to kill setuid/setgid bits
@ 2019-08-13 19:29 Vivek Goyal
2019-08-13 19:29 ` [Virtio-fs] [PATCH 1/4] virtiofsd: Fix number of padding bits in fuse_file_info Vivek Goyal
` (4 more replies)
0 siblings, 5 replies; 13+ messages in thread
From: Vivek Goyal @ 2019-08-13 19:29 UTC (permalink / raw)
To: virtio-fs
If a file has setuid/setuid bit set and a writer writes to file without
having CAP_FSETID capability, kernel clears setuid/setgid bit on file.
pjdfstest test chmod/12.t tests for this. With moving to 5.3 kernel and
cache=none this test fails.
Now Miklos has introducd a commit where if client thinks that
setuid/setgid bit should be cleared, it sets FUSE_KILL_PRIV flag
in fuse_write_in->write_flags. This is an indication to daemon to
clear setuid/setgid bit atomically.
So drop CAP_FSETID capability and then proceed with write and that
should automatically clear setuid bit.
Vivek Goyal (4):
virtiofsd: Fix number of padding bits in fuse_file_info
virtiofsd: Use macros for write_flag parsing
virtiofsd: Parse flag FUSE_WRITE_KILL_PRIV
virtiofsd: Drop CAP_FSETID if client asked for it
contrib/virtiofsd/Makefile.objs | 2 +
contrib/virtiofsd/fuse_common.h | 5 +-
contrib/virtiofsd/fuse_kernel.h | 1 +
contrib/virtiofsd/fuse_lowlevel.c | 6 +-
contrib/virtiofsd/passthrough_ll.c | 127 +++++++++++++++++++++++++++++
contrib/virtiofsd/seccomp.c | 2 +
6 files changed, 140 insertions(+), 3 deletions(-)
--
2.17.2
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Virtio-fs] [PATCH 1/4] virtiofsd: Fix number of padding bits in fuse_file_info
2019-08-13 19:29 [Virtio-fs] [PATCH 0/4] Drop CAP_FSETID if client needs to kill setuid/setgid bits Vivek Goyal
@ 2019-08-13 19:29 ` Vivek Goyal
2019-08-14 8:54 ` Dr. David Alan Gilbert
2019-08-13 19:29 ` [Virtio-fs] [PATCH 2/4] virtiofsd: Use macros for write_flag parsing Vivek Goyal
` (3 subsequent siblings)
4 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2019-08-13 19:29 UTC (permalink / raw)
To: virtio-fs
Currently we have 27 padding bits while there are 6 bit fields. I suspect
this is wrong. We probably are trying to aling to 32 bits and hence padding
bits should be 26 instead.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
contrib/virtiofsd/fuse_common.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/contrib/virtiofsd/fuse_common.h b/contrib/virtiofsd/fuse_common.h
index 6419f66470..4d95f6f28b 100644
--- a/contrib/virtiofsd/fuse_common.h
+++ b/contrib/virtiofsd/fuse_common.h
@@ -65,7 +65,7 @@ struct fuse_file_info {
unsigned int flock_release : 1;
/** Padding. Do not use*/
- unsigned int padding : 27;
+ unsigned int padding : 26;
/** File handle. May be filled in by filesystem in open().
Available in all other file operations */
--
2.17.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Virtio-fs] [PATCH 2/4] virtiofsd: Use macros for write_flag parsing
2019-08-13 19:29 [Virtio-fs] [PATCH 0/4] Drop CAP_FSETID if client needs to kill setuid/setgid bits Vivek Goyal
2019-08-13 19:29 ` [Virtio-fs] [PATCH 1/4] virtiofsd: Fix number of padding bits in fuse_file_info Vivek Goyal
@ 2019-08-13 19:29 ` Vivek Goyal
2019-08-14 9:13 ` Dr. David Alan Gilbert
2019-08-13 19:29 ` [Virtio-fs] [PATCH 3/4] virtiofsd: Parse flag FUSE_WRITE_KILL_PRIV Vivek Goyal
` (2 subsequent siblings)
4 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2019-08-13 19:29 UTC (permalink / raw)
To: virtio-fs
Use macros instead of hard coded bit positions.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
contrib/virtiofsd/fuse_lowlevel.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/contrib/virtiofsd/fuse_lowlevel.c b/contrib/virtiofsd/fuse_lowlevel.c
index 8d3f141d23..417f99e8dc 100644
--- a/contrib/virtiofsd/fuse_lowlevel.c
+++ b/contrib/virtiofsd/fuse_lowlevel.c
@@ -1161,7 +1161,7 @@ static void do_write(fuse_req_t req, fuse_ino_t nodeid,
memset(&fi, 0, sizeof(fi));
fi.fh = arg->fh;
- fi.writepage = (arg->write_flags & 1) != 0;
+ fi.writepage = (arg->write_flags & FUSE_WRITE_CACHE) != 0;
if (!compat) {
fi.lock_owner = arg->lock_owner;
@@ -1208,7 +1208,7 @@ static void do_write_buf(fuse_req_t req, fuse_ino_t nodeid,
}
fi.fh = arg->fh;
- fi.writepage = arg->write_flags & 1;
+ fi.writepage = !!(arg->write_flags & FUSE_WRITE_CACHE);
if (ibufv->count == 1) {
assert(!(tmpbufv.buf[0].flags & FUSE_BUF_IS_FD));
--
2.17.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Virtio-fs] [PATCH 3/4] virtiofsd: Parse flag FUSE_WRITE_KILL_PRIV
2019-08-13 19:29 [Virtio-fs] [PATCH 0/4] Drop CAP_FSETID if client needs to kill setuid/setgid bits Vivek Goyal
2019-08-13 19:29 ` [Virtio-fs] [PATCH 1/4] virtiofsd: Fix number of padding bits in fuse_file_info Vivek Goyal
2019-08-13 19:29 ` [Virtio-fs] [PATCH 2/4] virtiofsd: Use macros for write_flag parsing Vivek Goyal
@ 2019-08-13 19:29 ` Vivek Goyal
2019-08-14 9:28 ` Dr. David Alan Gilbert
2019-08-13 19:29 ` [Virtio-fs] [PATCH 4/4] virtiofsd: Drop CAP_FSETID if client asked for it Vivek Goyal
2019-08-14 9:55 ` [Virtio-fs] [PATCH 0/4] Drop CAP_FSETID if client needs to kill setuid/setgid bits Dr. David Alan Gilbert
4 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2019-08-13 19:29 UTC (permalink / raw)
To: virtio-fs
Caller can set FUSE_WRITE_KILL_PRIV in write_flags. Parse it and pass it
to the filesystem.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
contrib/virtiofsd/fuse_common.h | 5 ++++-
contrib/virtiofsd/fuse_kernel.h | 1 +
contrib/virtiofsd/fuse_lowlevel.c | 2 ++
3 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/contrib/virtiofsd/fuse_common.h b/contrib/virtiofsd/fuse_common.h
index 4d95f6f28b..5f4af98999 100644
--- a/contrib/virtiofsd/fuse_common.h
+++ b/contrib/virtiofsd/fuse_common.h
@@ -64,8 +64,11 @@ struct fuse_file_info {
May only be set in ->release(). */
unsigned int flock_release : 1;
+ /* Indicates that suid/sgid bits should be removed upon write */
+ unsigned int kill_priv : 1;
+
/** Padding. Do not use*/
- unsigned int padding : 26;
+ unsigned int padding : 25;
/** File handle. May be filled in by filesystem in open().
Available in all other file operations */
diff --git a/contrib/virtiofsd/fuse_kernel.h b/contrib/virtiofsd/fuse_kernel.h
index d477c70028..d2b7ccf96b 100644
--- a/contrib/virtiofsd/fuse_kernel.h
+++ b/contrib/virtiofsd/fuse_kernel.h
@@ -325,6 +325,7 @@ struct fuse_file_lock {
*/
#define FUSE_WRITE_CACHE (1 << 0)
#define FUSE_WRITE_LOCKOWNER (1 << 1)
+#define FUSE_WRITE_KILL_PRIV (1 << 2)
/**
* Read flags
diff --git a/contrib/virtiofsd/fuse_lowlevel.c b/contrib/virtiofsd/fuse_lowlevel.c
index 417f99e8dc..ce27a6dd16 100644
--- a/contrib/virtiofsd/fuse_lowlevel.c
+++ b/contrib/virtiofsd/fuse_lowlevel.c
@@ -1162,6 +1162,7 @@ static void do_write(fuse_req_t req, fuse_ino_t nodeid,
memset(&fi, 0, sizeof(fi));
fi.fh = arg->fh;
fi.writepage = (arg->write_flags & FUSE_WRITE_CACHE) != 0;
+ fi.kill_priv = !!(arg->write_flags & FUSE_WRITE_KILL_PRIV);
if (!compat) {
fi.lock_owner = arg->lock_owner;
@@ -1209,6 +1210,7 @@ static void do_write_buf(fuse_req_t req, fuse_ino_t nodeid,
fi.fh = arg->fh;
fi.writepage = !!(arg->write_flags & FUSE_WRITE_CACHE);
+ fi.kill_priv = !!(arg->write_flags & FUSE_WRITE_KILL_PRIV);
if (ibufv->count == 1) {
assert(!(tmpbufv.buf[0].flags & FUSE_BUF_IS_FD));
--
2.17.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Virtio-fs] [PATCH 4/4] virtiofsd: Drop CAP_FSETID if client asked for it
2019-08-13 19:29 [Virtio-fs] [PATCH 0/4] Drop CAP_FSETID if client needs to kill setuid/setgid bits Vivek Goyal
` (2 preceding siblings ...)
2019-08-13 19:29 ` [Virtio-fs] [PATCH 3/4] virtiofsd: Parse flag FUSE_WRITE_KILL_PRIV Vivek Goyal
@ 2019-08-13 19:29 ` Vivek Goyal
2019-08-14 9:52 ` Dr. David Alan Gilbert
2019-08-14 9:55 ` [Virtio-fs] [PATCH 0/4] Drop CAP_FSETID if client needs to kill setuid/setgid bits Dr. David Alan Gilbert
4 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2019-08-13 19:29 UTC (permalink / raw)
To: virtio-fs
If client requested killing setuid/setgid bits on file being written, drop
CAP_FSETID capability so that setuid/setgid bits are cleared upon write
automatically.
pjdfstest chown/12.t needs this.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
contrib/virtiofsd/Makefile.objs | 2 +
contrib/virtiofsd/passthrough_ll.c | 127 +++++++++++++++++++++++++++++
contrib/virtiofsd/seccomp.c | 2 +
3 files changed, 131 insertions(+)
diff --git a/contrib/virtiofsd/Makefile.objs b/contrib/virtiofsd/Makefile.objs
index 941b19f18e..25f1e8dd73 100644
--- a/contrib/virtiofsd/Makefile.objs
+++ b/contrib/virtiofsd/Makefile.objs
@@ -11,3 +11,5 @@ virtiofsd-obj-y = buffer.o \
seccomp.o-cflags := $(SECCOMP_CFLAGS)
seccomp.o-libs := $(SECCOMP_LIBS)
+
+passthrough_ll.o-libs += -lcap
diff --git a/contrib/virtiofsd/passthrough_ll.c b/contrib/virtiofsd/passthrough_ll.c
index 321bbb20be..412663653a 100644
--- a/contrib/virtiofsd/passthrough_ll.c
+++ b/contrib/virtiofsd/passthrough_ll.c
@@ -56,6 +56,7 @@
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/un.h>
+#include <sys/capability.h>
#include "ireg.h"
#include <sys/mount.h>
@@ -230,6 +231,115 @@ static struct lo_data *lo_data(fuse_req_t req)
return (struct lo_data *) fuse_req_userdata(req);
}
+/* Helpers for dropping and regaining effective capabilities. Returns 0
+ * on success, error otherwise */
+static int drop_effective_cap(const char *cap_name, bool *cap_dropped)
+{
+ cap_t caps;
+ cap_value_t cap;
+ cap_flag_value_t cap_value;
+ int ret = 0;
+
+ ret = cap_from_name(cap_name, &cap);
+ if (ret == -1) {
+ ret = errno;
+ fuse_err("cap_from_name(%s) failed:%s\n", cap_name,
+ strerror(errno));
+ goto out;
+ }
+
+ if (!CAP_IS_SUPPORTED(cap)) {
+ fuse_err("capability(%s) is not supported\n", cap_name);
+ return EINVAL;
+ }
+
+ caps = cap_get_proc();
+ if (caps == NULL) {
+ ret = errno;
+ fuse_err("cap_get_proc() failed\n");
+ goto out;
+ }
+
+ if (cap_get_flag(caps, cap, CAP_EFFECTIVE, &cap_value) == -1) {
+ ret = errno;
+ fuse_err("cap_get_flag() failed\n");
+ goto out_cap_free;
+ }
+
+ /* We dont have this capability in effective set already. */
+ if (cap_value == CAP_CLEAR) {
+ ret = 0;
+ goto out_cap_free;
+ }
+
+ if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap, CAP_CLEAR) == -1) {
+ ret = errno;
+ fuse_err("cap_set_flag() failed\n");
+ goto out_cap_free;
+ }
+
+ if (cap_set_proc(caps) == -1) {
+ ret = errno;
+ fuse_err("cap_set_procs() failed\n");
+ goto out_cap_free;
+ }
+
+ ret = 0;
+ if (cap_dropped)
+ *cap_dropped = true;
+
+out_cap_free:
+ cap_free(caps);
+out:
+ return ret;
+}
+
+static int gain_effective_cap(const char *cap_name)
+{
+ cap_t caps;
+ cap_value_t cap;
+ int ret = 0;
+
+ ret = cap_from_name(cap_name, &cap);
+ if (ret == -1) {
+ ret = errno;
+ fuse_err("cap_from_name(%s) failed:%s\n", cap_name,
+ strerror(errno));
+ goto out;
+ }
+
+ if (!CAP_IS_SUPPORTED(cap)) {
+ fuse_err("capability(%s) is not supported\n", cap_name);
+ return EINVAL;
+ }
+
+ caps = cap_get_proc();
+ if (caps == NULL) {
+ ret = errno;
+ fuse_err("cap_get_proc() failed\n");
+ goto out;
+ }
+
+
+ if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap, CAP_SET) == -1) {
+ ret = errno;
+ fuse_err("cap_set_flag() failed\n");
+ goto out_cap_free;
+ }
+
+ if (cap_set_proc(caps) == -1) {
+ ret = errno;
+ fuse_err("cap_set_procs() failed\n");
+ goto out_cap_free;
+ }
+ ret = 0;
+
+out_cap_free:
+ cap_free(caps);
+out:
+ return ret;
+}
+
static void lo_map_init(struct lo_map *map)
{
map->elems = NULL;
@@ -2014,6 +2124,7 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
ssize_t res;
struct fuse_bufvec out_buf = FUSE_BUFVEC_INIT(fuse_buf_size(in_buf));
struct lo_data *lo = lo_data(req);
+ bool cap_fsetid_dropped = false;
out_buf.buf[0].flags = FUSE_BUF_IS_FD | FUSE_BUF_FD_SEEK;
out_buf.buf[0].fd = lo_fi_fd(req, fi);
@@ -2023,6 +2134,16 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
fuse_debug("lo_write_buf(ino=%" PRIu64 ", size=%zd, off=%lu)\n",
ino, out_buf.buf[0].size, (unsigned long) off);
+ /*
+ * If kill_priv is set, drop CAP_FSETID which should lead to kernel
+ * clearing setuid/setgid on file.
+ */
+ if (fi->kill_priv) {
+ res = drop_effective_cap("cap_fsetid", &cap_fsetid_dropped);
+ if (res != 0)
+ fuse_reply_err(req, res);
+ }
+
res = fuse_buf_copy(req, &out_buf, in_buf, 0);
if(res < 0) {
fuse_reply_err(req, -res);
@@ -2037,6 +2158,12 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
fuse_reply_write(req, (size_t) res);
}
+
+ if (cap_fsetid_dropped) {
+ res = gain_effective_cap("cap_fsetid");
+ if (res)
+ fuse_err("Failed to gain CAP_FSETID\n");
+ }
}
static void lo_statfs(fuse_req_t req, fuse_ino_t ino)
diff --git a/contrib/virtiofsd/seccomp.c b/contrib/virtiofsd/seccomp.c
index 5f1c873b82..7384ebee0a 100644
--- a/contrib/virtiofsd/seccomp.c
+++ b/contrib/virtiofsd/seccomp.c
@@ -78,6 +78,8 @@ static const int syscall_whitelist[] = {
SCMP_SYS(unlinkat),
SCMP_SYS(utimensat),
SCMP_SYS(write),
+ SCMP_SYS(capget),
+ SCMP_SYS(capset),
};
/* Syscalls used when --syslog is enabled */
--
2.17.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Virtio-fs] [PATCH 1/4] virtiofsd: Fix number of padding bits in fuse_file_info
2019-08-13 19:29 ` [Virtio-fs] [PATCH 1/4] virtiofsd: Fix number of padding bits in fuse_file_info Vivek Goyal
@ 2019-08-14 8:54 ` Dr. David Alan Gilbert
2019-08-21 11:25 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2019-08-14 8:54 UTC (permalink / raw)
To: Vivek Goyal; +Cc: virtio-fs
* Vivek Goyal (vgoyal@redhat.com) wrote:
> Currently we have 27 padding bits while there are 6 bit fields. I suspect
> this is wrong. We probably are trying to aling to 32 bits and hence padding
> bits should be 26 instead.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Yeh, I reckon this was broken upstream by 19accdf ~6 years ago; whether
it's fixable upstream I don't know because they have API requierments
since it's a library.
Still, it's definitely wrong so:
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> contrib/virtiofsd/fuse_common.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/contrib/virtiofsd/fuse_common.h b/contrib/virtiofsd/fuse_common.h
> index 6419f66470..4d95f6f28b 100644
> --- a/contrib/virtiofsd/fuse_common.h
> +++ b/contrib/virtiofsd/fuse_common.h
> @@ -65,7 +65,7 @@ struct fuse_file_info {
> unsigned int flock_release : 1;
>
> /** Padding. Do not use*/
> - unsigned int padding : 27;
> + unsigned int padding : 26;
>
> /** File handle. May be filled in by filesystem in open().
> Available in all other file operations */
> --
> 2.17.2
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Virtio-fs] [PATCH 2/4] virtiofsd: Use macros for write_flag parsing
2019-08-13 19:29 ` [Virtio-fs] [PATCH 2/4] virtiofsd: Use macros for write_flag parsing Vivek Goyal
@ 2019-08-14 9:13 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2019-08-14 9:13 UTC (permalink / raw)
To: Vivek Goyal; +Cc: virtio-fs
* Vivek Goyal (vgoyal@redhat.com) wrote:
> Use macros instead of hard coded bit positions.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
OK, this is part of upstream 1b7d2b8
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> contrib/virtiofsd/fuse_lowlevel.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/contrib/virtiofsd/fuse_lowlevel.c b/contrib/virtiofsd/fuse_lowlevel.c
> index 8d3f141d23..417f99e8dc 100644
> --- a/contrib/virtiofsd/fuse_lowlevel.c
> +++ b/contrib/virtiofsd/fuse_lowlevel.c
> @@ -1161,7 +1161,7 @@ static void do_write(fuse_req_t req, fuse_ino_t nodeid,
>
> memset(&fi, 0, sizeof(fi));
> fi.fh = arg->fh;
> - fi.writepage = (arg->write_flags & 1) != 0;
> + fi.writepage = (arg->write_flags & FUSE_WRITE_CACHE) != 0;
>
> if (!compat) {
> fi.lock_owner = arg->lock_owner;
> @@ -1208,7 +1208,7 @@ static void do_write_buf(fuse_req_t req, fuse_ino_t nodeid,
> }
>
> fi.fh = arg->fh;
> - fi.writepage = arg->write_flags & 1;
> + fi.writepage = !!(arg->write_flags & FUSE_WRITE_CACHE);
>
> if (ibufv->count == 1) {
> assert(!(tmpbufv.buf[0].flags & FUSE_BUF_IS_FD));
> --
> 2.17.2
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Virtio-fs] [PATCH 3/4] virtiofsd: Parse flag FUSE_WRITE_KILL_PRIV
2019-08-13 19:29 ` [Virtio-fs] [PATCH 3/4] virtiofsd: Parse flag FUSE_WRITE_KILL_PRIV Vivek Goyal
@ 2019-08-14 9:28 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2019-08-14 9:28 UTC (permalink / raw)
To: Vivek Goyal; +Cc: virtio-fs
* Vivek Goyal (vgoyal@redhat.com) wrote:
> Caller can set FUSE_WRITE_KILL_PRIV in write_flags. Parse it and pass it
> to the filesystem.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> contrib/virtiofsd/fuse_common.h | 5 ++++-
> contrib/virtiofsd/fuse_kernel.h | 1 +
> contrib/virtiofsd/fuse_lowlevel.c | 2 ++
> 3 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/contrib/virtiofsd/fuse_common.h b/contrib/virtiofsd/fuse_common.h
> index 4d95f6f28b..5f4af98999 100644
> --- a/contrib/virtiofsd/fuse_common.h
> +++ b/contrib/virtiofsd/fuse_common.h
> @@ -64,8 +64,11 @@ struct fuse_file_info {
> May only be set in ->release(). */
> unsigned int flock_release : 1;
>
> + /* Indicates that suid/sgid bits should be removed upon write */
> + unsigned int kill_priv : 1;
> +
> /** Padding. Do not use*/
> - unsigned int padding : 26;
> + unsigned int padding : 25;
>
> /** File handle. May be filled in by filesystem in open().
> Available in all other file operations */
> diff --git a/contrib/virtiofsd/fuse_kernel.h b/contrib/virtiofsd/fuse_kernel.h
> index d477c70028..d2b7ccf96b 100644
> --- a/contrib/virtiofsd/fuse_kernel.h
> +++ b/contrib/virtiofsd/fuse_kernel.h
> @@ -325,6 +325,7 @@ struct fuse_file_lock {
> */
> #define FUSE_WRITE_CACHE (1 << 0)
> #define FUSE_WRITE_LOCKOWNER (1 << 1)
> +#define FUSE_WRITE_KILL_PRIV (1 << 2)
>
> /**
> * Read flags
> diff --git a/contrib/virtiofsd/fuse_lowlevel.c b/contrib/virtiofsd/fuse_lowlevel.c
> index 417f99e8dc..ce27a6dd16 100644
> --- a/contrib/virtiofsd/fuse_lowlevel.c
> +++ b/contrib/virtiofsd/fuse_lowlevel.c
> @@ -1162,6 +1162,7 @@ static void do_write(fuse_req_t req, fuse_ino_t nodeid,
> memset(&fi, 0, sizeof(fi));
> fi.fh = arg->fh;
> fi.writepage = (arg->write_flags & FUSE_WRITE_CACHE) != 0;
> + fi.kill_priv = !!(arg->write_flags & FUSE_WRITE_KILL_PRIV);
>
> if (!compat) {
> fi.lock_owner = arg->lock_owner;
> @@ -1209,6 +1210,7 @@ static void do_write_buf(fuse_req_t req, fuse_ino_t nodeid,
>
> fi.fh = arg->fh;
> fi.writepage = !!(arg->write_flags & FUSE_WRITE_CACHE);
> + fi.kill_priv = !!(arg->write_flags & FUSE_WRITE_KILL_PRIV);
>
> if (ibufv->count == 1) {
> assert(!(tmpbufv.buf[0].flags & FUSE_BUF_IS_FD));
> --
> 2.17.2
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Virtio-fs] [PATCH 4/4] virtiofsd: Drop CAP_FSETID if client asked for it
2019-08-13 19:29 ` [Virtio-fs] [PATCH 4/4] virtiofsd: Drop CAP_FSETID if client asked for it Vivek Goyal
@ 2019-08-14 9:52 ` Dr. David Alan Gilbert
2019-08-14 12:43 ` Vivek Goyal
0 siblings, 1 reply; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2019-08-14 9:52 UTC (permalink / raw)
To: Vivek Goyal; +Cc: virtio-fs
* Vivek Goyal (vgoyal@redhat.com) wrote:
> If client requested killing setuid/setgid bits on file being written, drop
> CAP_FSETID capability so that setuid/setgid bits are cleared upon write
> automatically.
>
> pjdfstest chown/12.t needs this.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Some high level questions:
a) This is going to be *really* expensive isn't it - that's a lot of
syscalls for each write
b) and isn't this the normal case, i.e. a non-root guest writing to a
normal file?
c) I guess an unpriv virtiofsd still works OK because the cap is
already gone.
d) How is this thread safe? My reading is cap_set_proc is process
wide; you'd have other writesin other threads doing non-kill-priv
writes; also you'd have multiple threads doing (clear) write (set) -
that feels like they'd interact badly.
e) I think you could share some stuff between the drop and the
restore; for example you only ever need to do cap_from_name once
in the entire run. You can probably also do teh cap_get_proc when
you drop it, keep that 'caps' around and then set it again.
Dave
> ---
> contrib/virtiofsd/Makefile.objs | 2 +
> contrib/virtiofsd/passthrough_ll.c | 127 +++++++++++++++++++++++++++++
> contrib/virtiofsd/seccomp.c | 2 +
> 3 files changed, 131 insertions(+)
>
> diff --git a/contrib/virtiofsd/Makefile.objs b/contrib/virtiofsd/Makefile.objs
> index 941b19f18e..25f1e8dd73 100644
> --- a/contrib/virtiofsd/Makefile.objs
> +++ b/contrib/virtiofsd/Makefile.objs
> @@ -11,3 +11,5 @@ virtiofsd-obj-y = buffer.o \
>
> seccomp.o-cflags := $(SECCOMP_CFLAGS)
> seccomp.o-libs := $(SECCOMP_LIBS)
> +
> +passthrough_ll.o-libs += -lcap
> diff --git a/contrib/virtiofsd/passthrough_ll.c b/contrib/virtiofsd/passthrough_ll.c
> index 321bbb20be..412663653a 100644
> --- a/contrib/virtiofsd/passthrough_ll.c
> +++ b/contrib/virtiofsd/passthrough_ll.c
> @@ -56,6 +56,7 @@
> #include <sys/mman.h>
> #include <sys/socket.h>
> #include <sys/un.h>
> +#include <sys/capability.h>
>
> #include "ireg.h"
> #include <sys/mount.h>
> @@ -230,6 +231,115 @@ static struct lo_data *lo_data(fuse_req_t req)
> return (struct lo_data *) fuse_req_userdata(req);
> }
>
> +/* Helpers for dropping and regaining effective capabilities. Returns 0
> + * on success, error otherwise */
> +static int drop_effective_cap(const char *cap_name, bool *cap_dropped)
> +{
> + cap_t caps;
> + cap_value_t cap;
> + cap_flag_value_t cap_value;
> + int ret = 0;
> +
> + ret = cap_from_name(cap_name, &cap);
> + if (ret == -1) {
> + ret = errno;
> + fuse_err("cap_from_name(%s) failed:%s\n", cap_name,
> + strerror(errno));
> + goto out;
> + }
> +
> + if (!CAP_IS_SUPPORTED(cap)) {
> + fuse_err("capability(%s) is not supported\n", cap_name);
> + return EINVAL;
> + }
> +
> + caps = cap_get_proc();
> + if (caps == NULL) {
> + ret = errno;
> + fuse_err("cap_get_proc() failed\n");
> + goto out;
> + }
> +
> + if (cap_get_flag(caps, cap, CAP_EFFECTIVE, &cap_value) == -1) {
> + ret = errno;
> + fuse_err("cap_get_flag() failed\n");
> + goto out_cap_free;
> + }
> +
> + /* We dont have this capability in effective set already. */
> + if (cap_value == CAP_CLEAR) {
> + ret = 0;
> + goto out_cap_free;
> + }
> +
> + if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap, CAP_CLEAR) == -1) {
> + ret = errno;
> + fuse_err("cap_set_flag() failed\n");
> + goto out_cap_free;
> + }
> +
> + if (cap_set_proc(caps) == -1) {
> + ret = errno;
> + fuse_err("cap_set_procs() failed\n");
> + goto out_cap_free;
> + }
> +
> + ret = 0;
> + if (cap_dropped)
> + *cap_dropped = true;
> +
> +out_cap_free:
> + cap_free(caps);
> +out:
> + return ret;
> +}
> +
> +static int gain_effective_cap(const char *cap_name)
> +{
> + cap_t caps;
> + cap_value_t cap;
> + int ret = 0;
> +
> + ret = cap_from_name(cap_name, &cap);
> + if (ret == -1) {
> + ret = errno;
> + fuse_err("cap_from_name(%s) failed:%s\n", cap_name,
> + strerror(errno));
> + goto out;
> + }
> +
> + if (!CAP_IS_SUPPORTED(cap)) {
> + fuse_err("capability(%s) is not supported\n", cap_name);
> + return EINVAL;
> + }
> +
> + caps = cap_get_proc();
> + if (caps == NULL) {
> + ret = errno;
> + fuse_err("cap_get_proc() failed\n");
> + goto out;
> + }
> +
> +
> + if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap, CAP_SET) == -1) {
> + ret = errno;
> + fuse_err("cap_set_flag() failed\n");
> + goto out_cap_free;
> + }
> +
> + if (cap_set_proc(caps) == -1) {
> + ret = errno;
> + fuse_err("cap_set_procs() failed\n");
> + goto out_cap_free;
> + }
> + ret = 0;
> +
> +out_cap_free:
> + cap_free(caps);
> +out:
> + return ret;
> +}
> +
> static void lo_map_init(struct lo_map *map)
> {
> map->elems = NULL;
> @@ -2014,6 +2124,7 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
> ssize_t res;
> struct fuse_bufvec out_buf = FUSE_BUFVEC_INIT(fuse_buf_size(in_buf));
> struct lo_data *lo = lo_data(req);
> + bool cap_fsetid_dropped = false;
>
> out_buf.buf[0].flags = FUSE_BUF_IS_FD | FUSE_BUF_FD_SEEK;
> out_buf.buf[0].fd = lo_fi_fd(req, fi);
> @@ -2023,6 +2134,16 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
> fuse_debug("lo_write_buf(ino=%" PRIu64 ", size=%zd, off=%lu)\n",
> ino, out_buf.buf[0].size, (unsigned long) off);
>
> + /*
> + * If kill_priv is set, drop CAP_FSETID which should lead to kernel
> + * clearing setuid/setgid on file.
> + */
> + if (fi->kill_priv) {
> + res = drop_effective_cap("cap_fsetid", &cap_fsetid_dropped);
> + if (res != 0)
> + fuse_reply_err(req, res);
> + }
> +
> res = fuse_buf_copy(req, &out_buf, in_buf, 0);
> if(res < 0) {
> fuse_reply_err(req, -res);
> @@ -2037,6 +2158,12 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
>
> fuse_reply_write(req, (size_t) res);
> }
> +
> + if (cap_fsetid_dropped) {
> + res = gain_effective_cap("cap_fsetid");
> + if (res)
> + fuse_err("Failed to gain CAP_FSETID\n");
> + }
> }
>
> static void lo_statfs(fuse_req_t req, fuse_ino_t ino)
> diff --git a/contrib/virtiofsd/seccomp.c b/contrib/virtiofsd/seccomp.c
> index 5f1c873b82..7384ebee0a 100644
> --- a/contrib/virtiofsd/seccomp.c
> +++ b/contrib/virtiofsd/seccomp.c
> @@ -78,6 +78,8 @@ static const int syscall_whitelist[] = {
> SCMP_SYS(unlinkat),
> SCMP_SYS(utimensat),
> SCMP_SYS(write),
> + SCMP_SYS(capget),
> + SCMP_SYS(capset),
> };
>
> /* Syscalls used when --syslog is enabled */
> --
> 2.17.2
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Virtio-fs] [PATCH 0/4] Drop CAP_FSETID if client needs to kill setuid/setgid bits
2019-08-13 19:29 [Virtio-fs] [PATCH 0/4] Drop CAP_FSETID if client needs to kill setuid/setgid bits Vivek Goyal
` (3 preceding siblings ...)
2019-08-13 19:29 ` [Virtio-fs] [PATCH 4/4] virtiofsd: Drop CAP_FSETID if client asked for it Vivek Goyal
@ 2019-08-14 9:55 ` Dr. David Alan Gilbert
4 siblings, 0 replies; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2019-08-14 9:55 UTC (permalink / raw)
To: Vivek Goyal; +Cc: virtio-fs
* Vivek Goyal (vgoyal@redhat.com) wrote:
> If a file has setuid/setuid bit set and a writer writes to file without
> having CAP_FSETID capability, kernel clears setuid/setgid bit on file.
>
> pjdfstest test chmod/12.t tests for this. With moving to 5.3 kernel and
> cache=none this test fails.
>
> Now Miklos has introducd a commit where if client thinks that
> setuid/setgid bit should be cleared, it sets FUSE_KILL_PRIV flag
> in fuse_write_in->write_flags. This is an indication to daemon to
> clear setuid/setgid bit atomically.
>
> So drop CAP_FSETID capability and then proceed with write and that
> should automatically clear setuid bit.
1,2,3 added to my world.
4 still to be discussed
> Vivek Goyal (4):
> virtiofsd: Fix number of padding bits in fuse_file_info
> virtiofsd: Use macros for write_flag parsing
> virtiofsd: Parse flag FUSE_WRITE_KILL_PRIV
> virtiofsd: Drop CAP_FSETID if client asked for it
>
> contrib/virtiofsd/Makefile.objs | 2 +
> contrib/virtiofsd/fuse_common.h | 5 +-
> contrib/virtiofsd/fuse_kernel.h | 1 +
> contrib/virtiofsd/fuse_lowlevel.c | 6 +-
> contrib/virtiofsd/passthrough_ll.c | 127 +++++++++++++++++++++++++++++
> contrib/virtiofsd/seccomp.c | 2 +
> 6 files changed, 140 insertions(+), 3 deletions(-)
>
> --
> 2.17.2
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Virtio-fs] [PATCH 4/4] virtiofsd: Drop CAP_FSETID if client asked for it
2019-08-14 9:52 ` Dr. David Alan Gilbert
@ 2019-08-14 12:43 ` Vivek Goyal
2019-08-14 13:17 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2019-08-14 12:43 UTC (permalink / raw)
To: Dr. David Alan Gilbert; +Cc: virtio-fs
On Wed, Aug 14, 2019 at 10:52:16AM +0100, Dr. David Alan Gilbert wrote:
> * Vivek Goyal (vgoyal@redhat.com) wrote:
> > If client requested killing setuid/setgid bits on file being written, drop
> > CAP_FSETID capability so that setuid/setgid bits are cleared upon write
> > automatically.
> >
> > pjdfstest chown/12.t needs this.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
>
> Some high level questions:
> a) This is going to be *really* expensive isn't it - that's a lot of
> syscalls for each write
>
> b) and isn't this the normal case, i.e. a non-root guest writing to a
> normal file?
Hi David,
FUSE_WRITE_KILL_PRIV flag will not be set frequently. It is set only
if a write happens to a file which has setuid/setgid bit set. Very
few files in filesystem typically have this bit set and for those
also it will be set only for first write and not subsequent writes.
So I will not be worried about performance impact of it.
>
> c) I guess an unpriv virtiofsd still works OK because the cap is
> already gone.
>
> d) How is this thread safe? My reading is cap_set_proc is process
> wide; you'd have other writesin other threads doing non-kill-priv
> writes; also you'd have multiple threads doing (clear) write (set) -
> that feels like they'd interact badly.
man page of capabilities says that its a per thread property.
**************************
Starting with kernel 2.2, Linux divides the privileges traditionally
associated with superuser into distinct units, known as capabilities,
which can be independently enabled and disabled. Capabilities are a
per-thread attribute.
**************************
So dropping capability per thread and enabling it back should be fine.
>
> e) I think you could share some stuff between the drop and the
> restore; for example you only ever need to do cap_from_name once
> in the entire run. You can probably also do teh cap_get_proc when
> you drop it, keep that 'caps' around and then set it again.
Given its not a very frequent operation, I am not too worried about
it. It will just make code structure little more complicated.
But if you have strong opinion about it, I am open to change it.
Thanks
Vivek
>
>
> Dave
>
> > ---
> > contrib/virtiofsd/Makefile.objs | 2 +
> > contrib/virtiofsd/passthrough_ll.c | 127 +++++++++++++++++++++++++++++
> > contrib/virtiofsd/seccomp.c | 2 +
> > 3 files changed, 131 insertions(+)
> >
> > diff --git a/contrib/virtiofsd/Makefile.objs b/contrib/virtiofsd/Makefile.objs
> > index 941b19f18e..25f1e8dd73 100644
> > --- a/contrib/virtiofsd/Makefile.objs
> > +++ b/contrib/virtiofsd/Makefile.objs
> > @@ -11,3 +11,5 @@ virtiofsd-obj-y = buffer.o \
> >
> > seccomp.o-cflags := $(SECCOMP_CFLAGS)
> > seccomp.o-libs := $(SECCOMP_LIBS)
> > +
> > +passthrough_ll.o-libs += -lcap
> > diff --git a/contrib/virtiofsd/passthrough_ll.c b/contrib/virtiofsd/passthrough_ll.c
> > index 321bbb20be..412663653a 100644
> > --- a/contrib/virtiofsd/passthrough_ll.c
> > +++ b/contrib/virtiofsd/passthrough_ll.c
> > @@ -56,6 +56,7 @@
> > #include <sys/mman.h>
> > #include <sys/socket.h>
> > #include <sys/un.h>
> > +#include <sys/capability.h>
> >
> > #include "ireg.h"
> > #include <sys/mount.h>
> > @@ -230,6 +231,115 @@ static struct lo_data *lo_data(fuse_req_t req)
> > return (struct lo_data *) fuse_req_userdata(req);
> > }
> >
> > +/* Helpers for dropping and regaining effective capabilities. Returns 0
> > + * on success, error otherwise */
> > +static int drop_effective_cap(const char *cap_name, bool *cap_dropped)
> > +{
> > + cap_t caps;
> > + cap_value_t cap;
> > + cap_flag_value_t cap_value;
> > + int ret = 0;
> > +
> > + ret = cap_from_name(cap_name, &cap);
> > + if (ret == -1) {
> > + ret = errno;
> > + fuse_err("cap_from_name(%s) failed:%s\n", cap_name,
> > + strerror(errno));
> > + goto out;
> > + }
> > +
> > + if (!CAP_IS_SUPPORTED(cap)) {
> > + fuse_err("capability(%s) is not supported\n", cap_name);
> > + return EINVAL;
> > + }
> > +
> > + caps = cap_get_proc();
> > + if (caps == NULL) {
> > + ret = errno;
> > + fuse_err("cap_get_proc() failed\n");
> > + goto out;
> > + }
> > +
> > + if (cap_get_flag(caps, cap, CAP_EFFECTIVE, &cap_value) == -1) {
> > + ret = errno;
> > + fuse_err("cap_get_flag() failed\n");
> > + goto out_cap_free;
> > + }
> > +
> > + /* We dont have this capability in effective set already. */
> > + if (cap_value == CAP_CLEAR) {
> > + ret = 0;
> > + goto out_cap_free;
> > + }
> > +
> > + if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap, CAP_CLEAR) == -1) {
> > + ret = errno;
> > + fuse_err("cap_set_flag() failed\n");
> > + goto out_cap_free;
> > + }
> > +
> > + if (cap_set_proc(caps) == -1) {
> > + ret = errno;
> > + fuse_err("cap_set_procs() failed\n");
> > + goto out_cap_free;
> > + }
> > +
> > + ret = 0;
> > + if (cap_dropped)
> > + *cap_dropped = true;
> > +
> > +out_cap_free:
> > + cap_free(caps);
> > +out:
> > + return ret;
> > +}
> > +
> > +static int gain_effective_cap(const char *cap_name)
> > +{
> > + cap_t caps;
> > + cap_value_t cap;
> > + int ret = 0;
> > +
> > + ret = cap_from_name(cap_name, &cap);
> > + if (ret == -1) {
> > + ret = errno;
> > + fuse_err("cap_from_name(%s) failed:%s\n", cap_name,
> > + strerror(errno));
> > + goto out;
> > + }
> > +
> > + if (!CAP_IS_SUPPORTED(cap)) {
> > + fuse_err("capability(%s) is not supported\n", cap_name);
> > + return EINVAL;
> > + }
> > +
> > + caps = cap_get_proc();
> > + if (caps == NULL) {
> > + ret = errno;
> > + fuse_err("cap_get_proc() failed\n");
> > + goto out;
> > + }
> > +
> > +
> > + if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap, CAP_SET) == -1) {
> > + ret = errno;
> > + fuse_err("cap_set_flag() failed\n");
> > + goto out_cap_free;
> > + }
> > +
> > + if (cap_set_proc(caps) == -1) {
> > + ret = errno;
> > + fuse_err("cap_set_procs() failed\n");
> > + goto out_cap_free;
> > + }
> > + ret = 0;
> > +
> > +out_cap_free:
> > + cap_free(caps);
> > +out:
> > + return ret;
> > +}
> > +
> > static void lo_map_init(struct lo_map *map)
> > {
> > map->elems = NULL;
> > @@ -2014,6 +2124,7 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
> > ssize_t res;
> > struct fuse_bufvec out_buf = FUSE_BUFVEC_INIT(fuse_buf_size(in_buf));
> > struct lo_data *lo = lo_data(req);
> > + bool cap_fsetid_dropped = false;
> >
> > out_buf.buf[0].flags = FUSE_BUF_IS_FD | FUSE_BUF_FD_SEEK;
> > out_buf.buf[0].fd = lo_fi_fd(req, fi);
> > @@ -2023,6 +2134,16 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
> > fuse_debug("lo_write_buf(ino=%" PRIu64 ", size=%zd, off=%lu)\n",
> > ino, out_buf.buf[0].size, (unsigned long) off);
> >
> > + /*
> > + * If kill_priv is set, drop CAP_FSETID which should lead to kernel
> > + * clearing setuid/setgid on file.
> > + */
> > + if (fi->kill_priv) {
> > + res = drop_effective_cap("cap_fsetid", &cap_fsetid_dropped);
> > + if (res != 0)
> > + fuse_reply_err(req, res);
> > + }
> > +
> > res = fuse_buf_copy(req, &out_buf, in_buf, 0);
> > if(res < 0) {
> > fuse_reply_err(req, -res);
> > @@ -2037,6 +2158,12 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
> >
> > fuse_reply_write(req, (size_t) res);
> > }
> > +
> > + if (cap_fsetid_dropped) {
> > + res = gain_effective_cap("cap_fsetid");
> > + if (res)
> > + fuse_err("Failed to gain CAP_FSETID\n");
> > + }
> > }
> >
> > static void lo_statfs(fuse_req_t req, fuse_ino_t ino)
> > diff --git a/contrib/virtiofsd/seccomp.c b/contrib/virtiofsd/seccomp.c
> > index 5f1c873b82..7384ebee0a 100644
> > --- a/contrib/virtiofsd/seccomp.c
> > +++ b/contrib/virtiofsd/seccomp.c
> > @@ -78,6 +78,8 @@ static const int syscall_whitelist[] = {
> > SCMP_SYS(unlinkat),
> > SCMP_SYS(utimensat),
> > SCMP_SYS(write),
> > + SCMP_SYS(capget),
> > + SCMP_SYS(capset),
> > };
> >
> > /* Syscalls used when --syslog is enabled */
> > --
> > 2.17.2
> >
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Virtio-fs] [PATCH 4/4] virtiofsd: Drop CAP_FSETID if client asked for it
2019-08-14 12:43 ` Vivek Goyal
@ 2019-08-14 13:17 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2019-08-14 13:17 UTC (permalink / raw)
To: Vivek Goyal; +Cc: virtio-fs
* Vivek Goyal (vgoyal@redhat.com) wrote:
> On Wed, Aug 14, 2019 at 10:52:16AM +0100, Dr. David Alan Gilbert wrote:
> > * Vivek Goyal (vgoyal@redhat.com) wrote:
> > > If client requested killing setuid/setgid bits on file being written, drop
> > > CAP_FSETID capability so that setuid/setgid bits are cleared upon write
> > > automatically.
> > >
> > > pjdfstest chown/12.t needs this.
> > >
> > > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> >
> > Some high level questions:
> > a) This is going to be *really* expensive isn't it - that's a lot of
> > syscalls for each write
> >
> > b) and isn't this the normal case, i.e. a non-root guest writing to a
> > normal file?
>
> Hi David,
>
> FUSE_WRITE_KILL_PRIV flag will not be set frequently. It is set only
> if a write happens to a file which has setuid/setgid bit set. Very
> few files in filesystem typically have this bit set and for those
> also it will be set only for first write and not subsequent writes.
>
> So I will not be worried about performance impact of it.
OK, that makes me a lot less worried.
>
> >
> > c) I guess an unpriv virtiofsd still works OK because the cap is
> > already gone.
> >
> > d) How is this thread safe? My reading is cap_set_proc is process
> > wide; you'd have other writesin other threads doing non-kill-priv
> > writes; also you'd have multiple threads doing (clear) write (set) -
> > that feels like they'd interact badly.
>
> man page of capabilities says that its a per thread property.
>
> **************************
> Starting with kernel 2.2, Linux divides the privileges traditionally
> associated with superuser into distinct units, known as capabilities,
> which can be independently enabled and disabled. Capabilities are a
> per-thread attribute.
> **************************
>
> So dropping capability per thread and enabling it back should be fine.
Huh OK; the man page of cap_get_proc/cap_set_proc say process
everywhere.
>
> >
> > e) I think you could share some stuff between the drop and the
> > restore; for example you only ever need to do cap_from_name once
> > in the entire run. You can probably also do teh cap_get_proc when
> > you drop it, keep that 'caps' around and then set it again.
>
> Given its not a very frequent operation, I am not too worried about
> it. It will just make code structure little more complicated.
>
> But if you have strong opinion about it, I am open to change it.
No that's fine given your explanation above.
Dave
> Thanks
> Vivek
>
> >
> >
> > Dave
> >
> > > ---
> > > contrib/virtiofsd/Makefile.objs | 2 +
> > > contrib/virtiofsd/passthrough_ll.c | 127 +++++++++++++++++++++++++++++
> > > contrib/virtiofsd/seccomp.c | 2 +
> > > 3 files changed, 131 insertions(+)
> > >
> > > diff --git a/contrib/virtiofsd/Makefile.objs b/contrib/virtiofsd/Makefile.objs
> > > index 941b19f18e..25f1e8dd73 100644
> > > --- a/contrib/virtiofsd/Makefile.objs
> > > +++ b/contrib/virtiofsd/Makefile.objs
> > > @@ -11,3 +11,5 @@ virtiofsd-obj-y = buffer.o \
> > >
> > > seccomp.o-cflags := $(SECCOMP_CFLAGS)
> > > seccomp.o-libs := $(SECCOMP_LIBS)
> > > +
> > > +passthrough_ll.o-libs += -lcap
> > > diff --git a/contrib/virtiofsd/passthrough_ll.c b/contrib/virtiofsd/passthrough_ll.c
> > > index 321bbb20be..412663653a 100644
> > > --- a/contrib/virtiofsd/passthrough_ll.c
> > > +++ b/contrib/virtiofsd/passthrough_ll.c
> > > @@ -56,6 +56,7 @@
> > > #include <sys/mman.h>
> > > #include <sys/socket.h>
> > > #include <sys/un.h>
> > > +#include <sys/capability.h>
> > >
> > > #include "ireg.h"
> > > #include <sys/mount.h>
> > > @@ -230,6 +231,115 @@ static struct lo_data *lo_data(fuse_req_t req)
> > > return (struct lo_data *) fuse_req_userdata(req);
> > > }
> > >
> > > +/* Helpers for dropping and regaining effective capabilities. Returns 0
> > > + * on success, error otherwise */
> > > +static int drop_effective_cap(const char *cap_name, bool *cap_dropped)
> > > +{
> > > + cap_t caps;
> > > + cap_value_t cap;
> > > + cap_flag_value_t cap_value;
> > > + int ret = 0;
> > > +
> > > + ret = cap_from_name(cap_name, &cap);
> > > + if (ret == -1) {
> > > + ret = errno;
> > > + fuse_err("cap_from_name(%s) failed:%s\n", cap_name,
> > > + strerror(errno));
> > > + goto out;
> > > + }
> > > +
> > > + if (!CAP_IS_SUPPORTED(cap)) {
> > > + fuse_err("capability(%s) is not supported\n", cap_name);
> > > + return EINVAL;
> > > + }
> > > +
> > > + caps = cap_get_proc();
> > > + if (caps == NULL) {
> > > + ret = errno;
> > > + fuse_err("cap_get_proc() failed\n");
> > > + goto out;
> > > + }
> > > +
> > > + if (cap_get_flag(caps, cap, CAP_EFFECTIVE, &cap_value) == -1) {
> > > + ret = errno;
> > > + fuse_err("cap_get_flag() failed\n");
> > > + goto out_cap_free;
> > > + }
> > > +
> > > + /* We dont have this capability in effective set already. */
> > > + if (cap_value == CAP_CLEAR) {
> > > + ret = 0;
> > > + goto out_cap_free;
> > > + }
> > > +
> > > + if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap, CAP_CLEAR) == -1) {
> > > + ret = errno;
> > > + fuse_err("cap_set_flag() failed\n");
> > > + goto out_cap_free;
> > > + }
> > > +
> > > + if (cap_set_proc(caps) == -1) {
> > > + ret = errno;
> > > + fuse_err("cap_set_procs() failed\n");
> > > + goto out_cap_free;
> > > + }
> > > +
> > > + ret = 0;
> > > + if (cap_dropped)
> > > + *cap_dropped = true;
> > > +
> > > +out_cap_free:
> > > + cap_free(caps);
> > > +out:
> > > + return ret;
> > > +}
> > > +
> > > +static int gain_effective_cap(const char *cap_name)
> > > +{
> > > + cap_t caps;
> > > + cap_value_t cap;
> > > + int ret = 0;
> > > +
> > > + ret = cap_from_name(cap_name, &cap);
> > > + if (ret == -1) {
> > > + ret = errno;
> > > + fuse_err("cap_from_name(%s) failed:%s\n", cap_name,
> > > + strerror(errno));
> > > + goto out;
> > > + }
> > > +
> > > + if (!CAP_IS_SUPPORTED(cap)) {
> > > + fuse_err("capability(%s) is not supported\n", cap_name);
> > > + return EINVAL;
> > > + }
> > > +
> > > + caps = cap_get_proc();
> > > + if (caps == NULL) {
> > > + ret = errno;
> > > + fuse_err("cap_get_proc() failed\n");
> > > + goto out;
> > > + }
> > > +
> > > +
> > > + if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap, CAP_SET) == -1) {
> > > + ret = errno;
> > > + fuse_err("cap_set_flag() failed\n");
> > > + goto out_cap_free;
> > > + }
> > > +
> > > + if (cap_set_proc(caps) == -1) {
> > > + ret = errno;
> > > + fuse_err("cap_set_procs() failed\n");
> > > + goto out_cap_free;
> > > + }
> > > + ret = 0;
> > > +
> > > +out_cap_free:
> > > + cap_free(caps);
> > > +out:
> > > + return ret;
> > > +}
> > > +
> > > static void lo_map_init(struct lo_map *map)
> > > {
> > > map->elems = NULL;
> > > @@ -2014,6 +2124,7 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
> > > ssize_t res;
> > > struct fuse_bufvec out_buf = FUSE_BUFVEC_INIT(fuse_buf_size(in_buf));
> > > struct lo_data *lo = lo_data(req);
> > > + bool cap_fsetid_dropped = false;
> > >
> > > out_buf.buf[0].flags = FUSE_BUF_IS_FD | FUSE_BUF_FD_SEEK;
> > > out_buf.buf[0].fd = lo_fi_fd(req, fi);
> > > @@ -2023,6 +2134,16 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
> > > fuse_debug("lo_write_buf(ino=%" PRIu64 ", size=%zd, off=%lu)\n",
> > > ino, out_buf.buf[0].size, (unsigned long) off);
> > >
> > > + /*
> > > + * If kill_priv is set, drop CAP_FSETID which should lead to kernel
> > > + * clearing setuid/setgid on file.
> > > + */
> > > + if (fi->kill_priv) {
> > > + res = drop_effective_cap("cap_fsetid", &cap_fsetid_dropped);
> > > + if (res != 0)
> > > + fuse_reply_err(req, res);
> > > + }
> > > +
> > > res = fuse_buf_copy(req, &out_buf, in_buf, 0);
> > > if(res < 0) {
> > > fuse_reply_err(req, -res);
> > > @@ -2037,6 +2158,12 @@ static void lo_write_buf(fuse_req_t req, fuse_ino_t ino,
> > >
> > > fuse_reply_write(req, (size_t) res);
> > > }
> > > +
> > > + if (cap_fsetid_dropped) {
> > > + res = gain_effective_cap("cap_fsetid");
> > > + if (res)
> > > + fuse_err("Failed to gain CAP_FSETID\n");
> > > + }
> > > }
> > >
> > > static void lo_statfs(fuse_req_t req, fuse_ino_t ino)
> > > diff --git a/contrib/virtiofsd/seccomp.c b/contrib/virtiofsd/seccomp.c
> > > index 5f1c873b82..7384ebee0a 100644
> > > --- a/contrib/virtiofsd/seccomp.c
> > > +++ b/contrib/virtiofsd/seccomp.c
> > > @@ -78,6 +78,8 @@ static const int syscall_whitelist[] = {
> > > SCMP_SYS(unlinkat),
> > > SCMP_SYS(utimensat),
> > > SCMP_SYS(write),
> > > + SCMP_SYS(capget),
> > > + SCMP_SYS(capset),
> > > };
> > >
> > > /* Syscalls used when --syslog is enabled */
> > > --
> > > 2.17.2
> > >
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Virtio-fs] [PATCH 1/4] virtiofsd: Fix number of padding bits in fuse_file_info
2019-08-14 8:54 ` Dr. David Alan Gilbert
@ 2019-08-21 11:25 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2019-08-21 11:25 UTC (permalink / raw)
To: Vivek Goyal; +Cc: virtio-fs
* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> * Vivek Goyal (vgoyal@redhat.com) wrote:
> > Currently we have 27 padding bits while there are 6 bit fields. I suspect
> > this is wrong. We probably are trying to aling to 32 bits and hence padding
> > bits should be 26 instead.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
>
> Yeh, I reckon this was broken upstream by 19accdf ~6 years ago; whether
> it's fixable upstream I don't know because they have API requierments
> since it's a library.
>
> Still, it's definitely wrong so:
>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Actually I found this upstream bug:
https://github.com/libfuse/libfuse/issues/396
which is a 'wontfix'; I think the padding only really exists because
of the ABI requirements; so there's not much point in us changing it.
We could just drop padding.
Dave
> > ---
> > contrib/virtiofsd/fuse_common.h | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/contrib/virtiofsd/fuse_common.h b/contrib/virtiofsd/fuse_common.h
> > index 6419f66470..4d95f6f28b 100644
> > --- a/contrib/virtiofsd/fuse_common.h
> > +++ b/contrib/virtiofsd/fuse_common.h
> > @@ -65,7 +65,7 @@ struct fuse_file_info {
> > unsigned int flock_release : 1;
> >
> > /** Padding. Do not use*/
> > - unsigned int padding : 27;
> > + unsigned int padding : 26;
> >
> > /** File handle. May be filled in by filesystem in open().
> > Available in all other file operations */
> > --
> > 2.17.2
> >
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2019-08-21 11:25 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-08-13 19:29 [Virtio-fs] [PATCH 0/4] Drop CAP_FSETID if client needs to kill setuid/setgid bits Vivek Goyal
2019-08-13 19:29 ` [Virtio-fs] [PATCH 1/4] virtiofsd: Fix number of padding bits in fuse_file_info Vivek Goyal
2019-08-14 8:54 ` Dr. David Alan Gilbert
2019-08-21 11:25 ` Dr. David Alan Gilbert
2019-08-13 19:29 ` [Virtio-fs] [PATCH 2/4] virtiofsd: Use macros for write_flag parsing Vivek Goyal
2019-08-14 9:13 ` Dr. David Alan Gilbert
2019-08-13 19:29 ` [Virtio-fs] [PATCH 3/4] virtiofsd: Parse flag FUSE_WRITE_KILL_PRIV Vivek Goyal
2019-08-14 9:28 ` Dr. David Alan Gilbert
2019-08-13 19:29 ` [Virtio-fs] [PATCH 4/4] virtiofsd: Drop CAP_FSETID if client asked for it Vivek Goyal
2019-08-14 9:52 ` Dr. David Alan Gilbert
2019-08-14 12:43 ` Vivek Goyal
2019-08-14 13:17 ` Dr. David Alan Gilbert
2019-08-14 9:55 ` [Virtio-fs] [PATCH 0/4] Drop CAP_FSETID if client needs to kill setuid/setgid bits Dr. David Alan Gilbert
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.