* [Qemu-devel] [PATCH v3 0/6] seccomp: feature refactoring @ 2017-07-28 12:10 Eduardo Otubo 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 1/6] seccomp: changing from whitelist to blacklist Eduardo Otubo ` (5 more replies) 0 siblings, 6 replies; 20+ messages in thread From: Eduardo Otubo @ 2017-07-28 12:10 UTC (permalink / raw) To: qemu-devel; +Cc: thuth, berrange, pbonzini obs: Same version as before, this time CCing everyone involved in the first discussion since it didn't get any comments or reviews. v3: * Style problems fixed v2: * The semantics of the options "allow/deny" instead of booleans "on/off" remains. * Added option 'children' to elevateprivileges * Added documentation to docs/ v1: * First version based on the discussion https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg03348.html Eduardo Otubo (6): seccomp: changing from whitelist to blacklist seccomp: add obsolete argument to command line seccomp: add elevateprivileges argument to command line seccomp: add spawn argument to command line seccomp: add resourcecontrol argument to command line seccomp: adding documentation to new seccomp model docs/seccomp.txt | 31 ++++ include/sysemu/seccomp.h | 7 +- qemu-options.hx | 19 ++- qemu-seccomp.c | 364 ++++++++++++++++++----------------------------- vl.c | 61 +++++++- 5 files changed, 248 insertions(+), 234 deletions(-) create mode 100644 docs/seccomp.txt -- 2.13.3 ^ permalink raw reply [flat|nested] 20+ messages in thread
* [Qemu-devel] [PATCH v3 1/6] seccomp: changing from whitelist to blacklist 2017-07-28 12:10 [Qemu-devel] [PATCH v3 0/6] seccomp: feature refactoring Eduardo Otubo @ 2017-07-28 12:10 ` Eduardo Otubo 2017-08-02 12:25 ` Daniel P. Berrange 2017-08-03 16:54 ` Thomas Huth 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 2/6] seccomp: add obsolete argument to command line Eduardo Otubo ` (4 subsequent siblings) 5 siblings, 2 replies; 20+ messages in thread From: Eduardo Otubo @ 2017-07-28 12:10 UTC (permalink / raw) To: qemu-devel; +Cc: thuth, berrange, pbonzini This patch changes the default behavior of the seccomp filter from whitelist to blacklist. By default now all system calls are allowed and a small black list of definitely forbidden ones was created. Signed-off-by: Eduardo Otubo <otubo@redhat.com> --- qemu-seccomp.c | 256 +++++++-------------------------------------------------- vl.c | 5 +- 2 files changed, 32 insertions(+), 229 deletions(-) diff --git a/qemu-seccomp.c b/qemu-seccomp.c index df75d9c471..f8877b07b5 100644 --- a/qemu-seccomp.c +++ b/qemu-seccomp.c @@ -31,229 +31,29 @@ struct QemuSeccompSyscall { uint8_t priority; }; -static const struct QemuSeccompSyscall seccomp_whitelist[] = { - { SCMP_SYS(timer_settime), 255 }, - { SCMP_SYS(timer_gettime), 254 }, - { SCMP_SYS(futex), 253 }, - { SCMP_SYS(select), 252 }, - { SCMP_SYS(recvfrom), 251 }, - { SCMP_SYS(sendto), 250 }, - { SCMP_SYS(socketcall), 250 }, - { SCMP_SYS(read), 249 }, - { SCMP_SYS(io_submit), 249 }, - { SCMP_SYS(brk), 248 }, - { SCMP_SYS(clone), 247 }, - { SCMP_SYS(mmap), 247 }, - { SCMP_SYS(mprotect), 246 }, - { SCMP_SYS(execve), 245 }, - { SCMP_SYS(open), 245 }, - { SCMP_SYS(ioctl), 245 }, - { SCMP_SYS(socket), 245 }, - { SCMP_SYS(setsockopt), 245 }, - { SCMP_SYS(recvmsg), 245 }, - { SCMP_SYS(sendmsg), 245 }, - { SCMP_SYS(accept), 245 }, - { SCMP_SYS(connect), 245 }, - { SCMP_SYS(socketpair), 245 }, - { SCMP_SYS(bind), 245 }, - { SCMP_SYS(listen), 245 }, - { SCMP_SYS(semget), 245 }, - { SCMP_SYS(ipc), 245 }, - { SCMP_SYS(gettimeofday), 245 }, - { SCMP_SYS(readlink), 245 }, - { SCMP_SYS(access), 245 }, - { SCMP_SYS(prctl), 245 }, - { SCMP_SYS(signalfd), 245 }, - { SCMP_SYS(getrlimit), 245 }, - { SCMP_SYS(getrusage), 245 }, - { SCMP_SYS(set_tid_address), 245 }, - { SCMP_SYS(statfs), 245 }, - { SCMP_SYS(unlink), 245 }, - { SCMP_SYS(wait4), 245 }, - { SCMP_SYS(fcntl64), 245 }, - { SCMP_SYS(fstat64), 245 }, - { SCMP_SYS(stat64), 245 }, - { SCMP_SYS(getgid32), 245 }, - { SCMP_SYS(getegid32), 245 }, - { SCMP_SYS(getuid32), 245 }, - { SCMP_SYS(geteuid32), 245 }, - { SCMP_SYS(sigreturn), 245 }, - { SCMP_SYS(_newselect), 245 }, - { SCMP_SYS(_llseek), 245 }, - { SCMP_SYS(mmap2), 245 }, - { SCMP_SYS(sigprocmask), 245 }, - { SCMP_SYS(sched_getparam), 245 }, - { SCMP_SYS(sched_getscheduler), 245 }, - { SCMP_SYS(fstat), 245 }, - { SCMP_SYS(clock_getres), 245 }, - { SCMP_SYS(sched_get_priority_min), 245 }, - { SCMP_SYS(sched_get_priority_max), 245 }, - { SCMP_SYS(stat), 245 }, - { SCMP_SYS(uname), 245 }, - { SCMP_SYS(eventfd2), 245 }, - { SCMP_SYS(io_getevents), 245 }, - { SCMP_SYS(dup), 245 }, - { SCMP_SYS(dup2), 245 }, - { SCMP_SYS(dup3), 245 }, - { SCMP_SYS(gettid), 245 }, - { SCMP_SYS(getgid), 245 }, - { SCMP_SYS(getegid), 245 }, - { SCMP_SYS(getuid), 245 }, - { SCMP_SYS(geteuid), 245 }, - { SCMP_SYS(timer_create), 245 }, - { SCMP_SYS(times), 245 }, - { SCMP_SYS(exit), 245 }, - { SCMP_SYS(clock_gettime), 245 }, - { SCMP_SYS(time), 245 }, - { SCMP_SYS(restart_syscall), 245 }, - { SCMP_SYS(pwrite64), 245 }, - { SCMP_SYS(nanosleep), 245 }, - { SCMP_SYS(chown), 245 }, - { SCMP_SYS(openat), 245 }, - { SCMP_SYS(getdents), 245 }, - { SCMP_SYS(timer_delete), 245 }, - { SCMP_SYS(exit_group), 245 }, - { SCMP_SYS(rt_sigreturn), 245 }, - { SCMP_SYS(sync), 245 }, - { SCMP_SYS(pread64), 245 }, - { SCMP_SYS(madvise), 245 }, - { SCMP_SYS(set_robust_list), 245 }, - { SCMP_SYS(lseek), 245 }, - { SCMP_SYS(pselect6), 245 }, - { SCMP_SYS(fork), 245 }, - { SCMP_SYS(rt_sigprocmask), 245 }, - { SCMP_SYS(write), 244 }, - { SCMP_SYS(fcntl), 243 }, - { SCMP_SYS(tgkill), 242 }, - { SCMP_SYS(kill), 242 }, - { SCMP_SYS(rt_sigaction), 242 }, - { SCMP_SYS(pipe2), 242 }, - { SCMP_SYS(munmap), 242 }, - { SCMP_SYS(mremap), 242 }, - { SCMP_SYS(fdatasync), 242 }, - { SCMP_SYS(close), 242 }, - { SCMP_SYS(rt_sigpending), 242 }, - { SCMP_SYS(rt_sigtimedwait), 242 }, - { SCMP_SYS(readv), 242 }, - { SCMP_SYS(writev), 242 }, - { SCMP_SYS(preadv), 242 }, - { SCMP_SYS(pwritev), 242 }, - { SCMP_SYS(setrlimit), 242 }, - { SCMP_SYS(ftruncate), 242 }, - { SCMP_SYS(lstat), 242 }, - { SCMP_SYS(pipe), 242 }, - { SCMP_SYS(umask), 242 }, - { SCMP_SYS(chdir), 242 }, - { SCMP_SYS(setitimer), 242 }, - { SCMP_SYS(setsid), 242 }, - { SCMP_SYS(poll), 242 }, - { SCMP_SYS(epoll_create), 242 }, - { SCMP_SYS(epoll_ctl), 242 }, - { SCMP_SYS(epoll_wait), 242 }, - { SCMP_SYS(waitpid), 242 }, - { SCMP_SYS(getsockname), 242 }, - { SCMP_SYS(getpeername), 242 }, - { SCMP_SYS(accept4), 242 }, - { SCMP_SYS(timerfd_settime), 242 }, - { SCMP_SYS(newfstatat), 241 }, - { SCMP_SYS(shutdown), 241 }, - { SCMP_SYS(getsockopt), 241 }, - { SCMP_SYS(semop), 241 }, - { SCMP_SYS(semtimedop), 241 }, - { SCMP_SYS(epoll_ctl_old), 241 }, - { SCMP_SYS(epoll_wait_old), 241 }, - { SCMP_SYS(epoll_pwait), 241 }, - { SCMP_SYS(epoll_create1), 241 }, - { SCMP_SYS(ppoll), 241 }, - { SCMP_SYS(creat), 241 }, - { SCMP_SYS(link), 241 }, - { SCMP_SYS(getpid), 241 }, - { SCMP_SYS(getppid), 241 }, - { SCMP_SYS(getpgrp), 241 }, - { SCMP_SYS(getpgid), 241 }, - { SCMP_SYS(getsid), 241 }, - { SCMP_SYS(getdents64), 241 }, - { SCMP_SYS(getresuid), 241 }, - { SCMP_SYS(getresgid), 241 }, - { SCMP_SYS(getgroups), 241 }, - { SCMP_SYS(getresuid32), 241 }, - { SCMP_SYS(getresgid32), 241 }, - { SCMP_SYS(getgroups32), 241 }, - { SCMP_SYS(signal), 241 }, - { SCMP_SYS(sigaction), 241 }, - { SCMP_SYS(sigsuspend), 241 }, - { SCMP_SYS(sigpending), 241 }, - { SCMP_SYS(truncate64), 241 }, - { SCMP_SYS(ftruncate64), 241 }, - { SCMP_SYS(fchown32), 241 }, - { SCMP_SYS(chown32), 241 }, - { SCMP_SYS(lchown32), 241 }, - { SCMP_SYS(statfs64), 241 }, - { SCMP_SYS(fstatfs64), 241 }, - { SCMP_SYS(fstatat64), 241 }, - { SCMP_SYS(lstat64), 241 }, - { SCMP_SYS(sendfile64), 241 }, - { SCMP_SYS(ugetrlimit), 241 }, - { SCMP_SYS(alarm), 241 }, - { SCMP_SYS(rt_sigsuspend), 241 }, - { SCMP_SYS(rt_sigqueueinfo), 241 }, - { SCMP_SYS(rt_tgsigqueueinfo), 241 }, - { SCMP_SYS(sigaltstack), 241 }, - { SCMP_SYS(signalfd4), 241 }, - { SCMP_SYS(truncate), 241 }, - { SCMP_SYS(fchown), 241 }, - { SCMP_SYS(lchown), 241 }, - { SCMP_SYS(fchownat), 241 }, - { SCMP_SYS(fstatfs), 241 }, - { SCMP_SYS(getitimer), 241 }, - { SCMP_SYS(syncfs), 241 }, - { SCMP_SYS(fsync), 241 }, - { SCMP_SYS(fchdir), 241 }, - { SCMP_SYS(msync), 241 }, - { SCMP_SYS(sched_setparam), 241 }, - { SCMP_SYS(sched_setscheduler), 241 }, - { SCMP_SYS(sched_yield), 241 }, - { SCMP_SYS(sched_rr_get_interval), 241 }, - { SCMP_SYS(sched_setaffinity), 241 }, - { SCMP_SYS(sched_getaffinity), 241 }, - { SCMP_SYS(readahead), 241 }, - { SCMP_SYS(timer_getoverrun), 241 }, - { SCMP_SYS(unlinkat), 241 }, - { SCMP_SYS(readlinkat), 241 }, - { SCMP_SYS(faccessat), 241 }, - { SCMP_SYS(get_robust_list), 241 }, - { SCMP_SYS(splice), 241 }, - { SCMP_SYS(vmsplice), 241 }, - { SCMP_SYS(getcpu), 241 }, - { SCMP_SYS(sendmmsg), 241 }, - { SCMP_SYS(recvmmsg), 241 }, - { SCMP_SYS(prlimit64), 241 }, - { SCMP_SYS(waitid), 241 }, - { SCMP_SYS(io_cancel), 241 }, - { SCMP_SYS(io_setup), 241 }, - { SCMP_SYS(io_destroy), 241 }, - { SCMP_SYS(arch_prctl), 240 }, - { SCMP_SYS(mkdir), 240 }, - { SCMP_SYS(fchmod), 240 }, - { SCMP_SYS(shmget), 240 }, - { SCMP_SYS(shmat), 240 }, - { SCMP_SYS(shmdt), 240 }, - { SCMP_SYS(timerfd_create), 240 }, - { SCMP_SYS(shmctl), 240 }, - { SCMP_SYS(mlockall), 240 }, - { SCMP_SYS(mlock), 240 }, - { SCMP_SYS(munlock), 240 }, - { SCMP_SYS(semctl), 240 }, - { SCMP_SYS(fallocate), 240 }, - { SCMP_SYS(fadvise64), 240 }, - { SCMP_SYS(inotify_init1), 240 }, - { SCMP_SYS(inotify_add_watch), 240 }, - { SCMP_SYS(mbind), 240 }, - { SCMP_SYS(memfd_create), 240 }, -#ifdef HAVE_CACHEFLUSH - { SCMP_SYS(cacheflush), 240 }, -#endif - { SCMP_SYS(sysinfo), 240 }, +static const struct QemuSeccompSyscall blacklist[] = { + { SCMP_SYS(reboot), 255 }, + { SCMP_SYS(swapon), 255 }, + { SCMP_SYS(swapoff), 255 }, + { SCMP_SYS(syslog), 255 }, + { SCMP_SYS(mount), 255 }, + { SCMP_SYS(umount), 255 }, + { SCMP_SYS(kexec_load), 255 }, + { SCMP_SYS(afs_syscall), 255 }, + { SCMP_SYS(break), 255 }, + { SCMP_SYS(ftime), 255 }, + { SCMP_SYS(getpmsg), 255 }, + { SCMP_SYS(gtty), 255 }, + { SCMP_SYS(lock), 255 }, + { SCMP_SYS(mpx), 255 }, + { SCMP_SYS(prof), 255 }, + { SCMP_SYS(profil), 255 }, + { SCMP_SYS(putpmsg), 255 }, + { SCMP_SYS(security), 255 }, + { SCMP_SYS(stty), 255 }, + { SCMP_SYS(tuxcall), 255 }, + { SCMP_SYS(ulimit), 255 }, + { SCMP_SYS(vserver), 255 }, }; int seccomp_start(void) @@ -262,19 +62,19 @@ int seccomp_start(void) unsigned int i = 0; scmp_filter_ctx ctx; - ctx = seccomp_init(SCMP_ACT_KILL); + ctx = seccomp_init(SCMP_ACT_ALLOW); if (ctx == NULL) { rc = -1; goto seccomp_return; } - for (i = 0; i < ARRAY_SIZE(seccomp_whitelist); i++) { - rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, seccomp_whitelist[i].num, 0); + for (i = 0; i < ARRAY_SIZE(blacklist); i++) { + rc = seccomp_rule_add(ctx, SCMP_ACT_KILL, blacklist[i].num, 0); if (rc < 0) { goto seccomp_return; } - rc = seccomp_syscall_priority(ctx, seccomp_whitelist[i].num, - seccomp_whitelist[i].priority); + rc = seccomp_syscall_priority(ctx, blacklist[i].num, + blacklist[i].priority); if (rc < 0) { goto seccomp_return; } diff --git a/vl.c b/vl.c index fb6b2efafa..15b98800e9 100644 --- a/vl.c +++ b/vl.c @@ -1030,13 +1030,16 @@ static int bt_parse(const char *opt) static int parse_sandbox(void *opaque, QemuOpts *opts, Error **errp) { - /* FIXME: change this to true for 1.3 */ if (qemu_opt_get_bool(opts, "enable", false)) { #ifdef CONFIG_SECCOMP if (seccomp_start() < 0) { error_report("failed to install seccomp syscall filter " "in the kernel"); return -1; + } else { + error_report("warning: -sandbox on has been converted to blacklist " + "approach. Refer to the manual for new sandbox " + "options in order to increase security."); } #else error_report("seccomp support is disabled"); -- 2.13.3 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v3 1/6] seccomp: changing from whitelist to blacklist 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 1/6] seccomp: changing from whitelist to blacklist Eduardo Otubo @ 2017-08-02 12:25 ` Daniel P. Berrange 2017-08-03 16:54 ` Thomas Huth 1 sibling, 0 replies; 20+ messages in thread From: Daniel P. Berrange @ 2017-08-02 12:25 UTC (permalink / raw) To: Eduardo Otubo; +Cc: qemu-devel, thuth, pbonzini On Fri, Jul 28, 2017 at 02:10:35PM +0200, Eduardo Otubo wrote: > This patch changes the default behavior of the seccomp filter from > whitelist to blacklist. By default now all system calls are allowed and > a small black list of definitely forbidden ones was created. > > Signed-off-by: Eduardo Otubo <otubo@redhat.com> > --- > qemu-seccomp.c | 256 +++++++-------------------------------------------------- > vl.c | 5 +- > 2 files changed, 32 insertions(+), 229 deletions(-) > > diff --git a/qemu-seccomp.c b/qemu-seccomp.c > index df75d9c471..f8877b07b5 100644 > --- a/qemu-seccomp.c > +++ b/qemu-seccomp.c > @@ -31,229 +31,29 @@ struct QemuSeccompSyscall { > uint8_t priority; > }; > > -static const struct QemuSeccompSyscall seccomp_whitelist[] = { > - { SCMP_SYS(timer_settime), 255 }, > - { SCMP_SYS(timer_gettime), 254 }, > - { SCMP_SYS(futex), 253 }, > - { SCMP_SYS(select), 252 }, > - { SCMP_SYS(recvfrom), 251 }, > - { SCMP_SYS(sendto), 250 }, > - { SCMP_SYS(socketcall), 250 }, > - { SCMP_SYS(read), 249 }, > - { SCMP_SYS(io_submit), 249 }, > - { SCMP_SYS(brk), 248 }, > - { SCMP_SYS(clone), 247 }, > - { SCMP_SYS(mmap), 247 }, > - { SCMP_SYS(mprotect), 246 }, > - { SCMP_SYS(execve), 245 }, > - { SCMP_SYS(open), 245 }, > - { SCMP_SYS(ioctl), 245 }, > - { SCMP_SYS(socket), 245 }, > - { SCMP_SYS(setsockopt), 245 }, > - { SCMP_SYS(recvmsg), 245 }, > - { SCMP_SYS(sendmsg), 245 }, > - { SCMP_SYS(accept), 245 }, > - { SCMP_SYS(connect), 245 }, > - { SCMP_SYS(socketpair), 245 }, > - { SCMP_SYS(bind), 245 }, > - { SCMP_SYS(listen), 245 }, > - { SCMP_SYS(semget), 245 }, > - { SCMP_SYS(ipc), 245 }, > - { SCMP_SYS(gettimeofday), 245 }, > - { SCMP_SYS(readlink), 245 }, > - { SCMP_SYS(access), 245 }, > - { SCMP_SYS(prctl), 245 }, > - { SCMP_SYS(signalfd), 245 }, > - { SCMP_SYS(getrlimit), 245 }, > - { SCMP_SYS(getrusage), 245 }, > - { SCMP_SYS(set_tid_address), 245 }, > - { SCMP_SYS(statfs), 245 }, > - { SCMP_SYS(unlink), 245 }, > - { SCMP_SYS(wait4), 245 }, > - { SCMP_SYS(fcntl64), 245 }, > - { SCMP_SYS(fstat64), 245 }, > - { SCMP_SYS(stat64), 245 }, > - { SCMP_SYS(getgid32), 245 }, > - { SCMP_SYS(getegid32), 245 }, > - { SCMP_SYS(getuid32), 245 }, > - { SCMP_SYS(geteuid32), 245 }, > - { SCMP_SYS(sigreturn), 245 }, > - { SCMP_SYS(_newselect), 245 }, > - { SCMP_SYS(_llseek), 245 }, > - { SCMP_SYS(mmap2), 245 }, > - { SCMP_SYS(sigprocmask), 245 }, > - { SCMP_SYS(sched_getparam), 245 }, > - { SCMP_SYS(sched_getscheduler), 245 }, > - { SCMP_SYS(fstat), 245 }, > - { SCMP_SYS(clock_getres), 245 }, > - { SCMP_SYS(sched_get_priority_min), 245 }, > - { SCMP_SYS(sched_get_priority_max), 245 }, > - { SCMP_SYS(stat), 245 }, > - { SCMP_SYS(uname), 245 }, > - { SCMP_SYS(eventfd2), 245 }, > - { SCMP_SYS(io_getevents), 245 }, > - { SCMP_SYS(dup), 245 }, > - { SCMP_SYS(dup2), 245 }, > - { SCMP_SYS(dup3), 245 }, > - { SCMP_SYS(gettid), 245 }, > - { SCMP_SYS(getgid), 245 }, > - { SCMP_SYS(getegid), 245 }, > - { SCMP_SYS(getuid), 245 }, > - { SCMP_SYS(geteuid), 245 }, > - { SCMP_SYS(timer_create), 245 }, > - { SCMP_SYS(times), 245 }, > - { SCMP_SYS(exit), 245 }, > - { SCMP_SYS(clock_gettime), 245 }, > - { SCMP_SYS(time), 245 }, > - { SCMP_SYS(restart_syscall), 245 }, > - { SCMP_SYS(pwrite64), 245 }, > - { SCMP_SYS(nanosleep), 245 }, > - { SCMP_SYS(chown), 245 }, > - { SCMP_SYS(openat), 245 }, > - { SCMP_SYS(getdents), 245 }, > - { SCMP_SYS(timer_delete), 245 }, > - { SCMP_SYS(exit_group), 245 }, > - { SCMP_SYS(rt_sigreturn), 245 }, > - { SCMP_SYS(sync), 245 }, > - { SCMP_SYS(pread64), 245 }, > - { SCMP_SYS(madvise), 245 }, > - { SCMP_SYS(set_robust_list), 245 }, > - { SCMP_SYS(lseek), 245 }, > - { SCMP_SYS(pselect6), 245 }, > - { SCMP_SYS(fork), 245 }, > - { SCMP_SYS(rt_sigprocmask), 245 }, > - { SCMP_SYS(write), 244 }, > - { SCMP_SYS(fcntl), 243 }, > - { SCMP_SYS(tgkill), 242 }, > - { SCMP_SYS(kill), 242 }, > - { SCMP_SYS(rt_sigaction), 242 }, > - { SCMP_SYS(pipe2), 242 }, > - { SCMP_SYS(munmap), 242 }, > - { SCMP_SYS(mremap), 242 }, > - { SCMP_SYS(fdatasync), 242 }, > - { SCMP_SYS(close), 242 }, > - { SCMP_SYS(rt_sigpending), 242 }, > - { SCMP_SYS(rt_sigtimedwait), 242 }, > - { SCMP_SYS(readv), 242 }, > - { SCMP_SYS(writev), 242 }, > - { SCMP_SYS(preadv), 242 }, > - { SCMP_SYS(pwritev), 242 }, > - { SCMP_SYS(setrlimit), 242 }, > - { SCMP_SYS(ftruncate), 242 }, > - { SCMP_SYS(lstat), 242 }, > - { SCMP_SYS(pipe), 242 }, > - { SCMP_SYS(umask), 242 }, > - { SCMP_SYS(chdir), 242 }, > - { SCMP_SYS(setitimer), 242 }, > - { SCMP_SYS(setsid), 242 }, > - { SCMP_SYS(poll), 242 }, > - { SCMP_SYS(epoll_create), 242 }, > - { SCMP_SYS(epoll_ctl), 242 }, > - { SCMP_SYS(epoll_wait), 242 }, > - { SCMP_SYS(waitpid), 242 }, > - { SCMP_SYS(getsockname), 242 }, > - { SCMP_SYS(getpeername), 242 }, > - { SCMP_SYS(accept4), 242 }, > - { SCMP_SYS(timerfd_settime), 242 }, > - { SCMP_SYS(newfstatat), 241 }, > - { SCMP_SYS(shutdown), 241 }, > - { SCMP_SYS(getsockopt), 241 }, > - { SCMP_SYS(semop), 241 }, > - { SCMP_SYS(semtimedop), 241 }, > - { SCMP_SYS(epoll_ctl_old), 241 }, > - { SCMP_SYS(epoll_wait_old), 241 }, > - { SCMP_SYS(epoll_pwait), 241 }, > - { SCMP_SYS(epoll_create1), 241 }, > - { SCMP_SYS(ppoll), 241 }, > - { SCMP_SYS(creat), 241 }, > - { SCMP_SYS(link), 241 }, > - { SCMP_SYS(getpid), 241 }, > - { SCMP_SYS(getppid), 241 }, > - { SCMP_SYS(getpgrp), 241 }, > - { SCMP_SYS(getpgid), 241 }, > - { SCMP_SYS(getsid), 241 }, > - { SCMP_SYS(getdents64), 241 }, > - { SCMP_SYS(getresuid), 241 }, > - { SCMP_SYS(getresgid), 241 }, > - { SCMP_SYS(getgroups), 241 }, > - { SCMP_SYS(getresuid32), 241 }, > - { SCMP_SYS(getresgid32), 241 }, > - { SCMP_SYS(getgroups32), 241 }, > - { SCMP_SYS(signal), 241 }, > - { SCMP_SYS(sigaction), 241 }, > - { SCMP_SYS(sigsuspend), 241 }, > - { SCMP_SYS(sigpending), 241 }, > - { SCMP_SYS(truncate64), 241 }, > - { SCMP_SYS(ftruncate64), 241 }, > - { SCMP_SYS(fchown32), 241 }, > - { SCMP_SYS(chown32), 241 }, > - { SCMP_SYS(lchown32), 241 }, > - { SCMP_SYS(statfs64), 241 }, > - { SCMP_SYS(fstatfs64), 241 }, > - { SCMP_SYS(fstatat64), 241 }, > - { SCMP_SYS(lstat64), 241 }, > - { SCMP_SYS(sendfile64), 241 }, > - { SCMP_SYS(ugetrlimit), 241 }, > - { SCMP_SYS(alarm), 241 }, > - { SCMP_SYS(rt_sigsuspend), 241 }, > - { SCMP_SYS(rt_sigqueueinfo), 241 }, > - { SCMP_SYS(rt_tgsigqueueinfo), 241 }, > - { SCMP_SYS(sigaltstack), 241 }, > - { SCMP_SYS(signalfd4), 241 }, > - { SCMP_SYS(truncate), 241 }, > - { SCMP_SYS(fchown), 241 }, > - { SCMP_SYS(lchown), 241 }, > - { SCMP_SYS(fchownat), 241 }, > - { SCMP_SYS(fstatfs), 241 }, > - { SCMP_SYS(getitimer), 241 }, > - { SCMP_SYS(syncfs), 241 }, > - { SCMP_SYS(fsync), 241 }, > - { SCMP_SYS(fchdir), 241 }, > - { SCMP_SYS(msync), 241 }, > - { SCMP_SYS(sched_setparam), 241 }, > - { SCMP_SYS(sched_setscheduler), 241 }, > - { SCMP_SYS(sched_yield), 241 }, > - { SCMP_SYS(sched_rr_get_interval), 241 }, > - { SCMP_SYS(sched_setaffinity), 241 }, > - { SCMP_SYS(sched_getaffinity), 241 }, > - { SCMP_SYS(readahead), 241 }, > - { SCMP_SYS(timer_getoverrun), 241 }, > - { SCMP_SYS(unlinkat), 241 }, > - { SCMP_SYS(readlinkat), 241 }, > - { SCMP_SYS(faccessat), 241 }, > - { SCMP_SYS(get_robust_list), 241 }, > - { SCMP_SYS(splice), 241 }, > - { SCMP_SYS(vmsplice), 241 }, > - { SCMP_SYS(getcpu), 241 }, > - { SCMP_SYS(sendmmsg), 241 }, > - { SCMP_SYS(recvmmsg), 241 }, > - { SCMP_SYS(prlimit64), 241 }, > - { SCMP_SYS(waitid), 241 }, > - { SCMP_SYS(io_cancel), 241 }, > - { SCMP_SYS(io_setup), 241 }, > - { SCMP_SYS(io_destroy), 241 }, > - { SCMP_SYS(arch_prctl), 240 }, > - { SCMP_SYS(mkdir), 240 }, > - { SCMP_SYS(fchmod), 240 }, > - { SCMP_SYS(shmget), 240 }, > - { SCMP_SYS(shmat), 240 }, > - { SCMP_SYS(shmdt), 240 }, > - { SCMP_SYS(timerfd_create), 240 }, > - { SCMP_SYS(shmctl), 240 }, > - { SCMP_SYS(mlockall), 240 }, > - { SCMP_SYS(mlock), 240 }, > - { SCMP_SYS(munlock), 240 }, > - { SCMP_SYS(semctl), 240 }, > - { SCMP_SYS(fallocate), 240 }, > - { SCMP_SYS(fadvise64), 240 }, > - { SCMP_SYS(inotify_init1), 240 }, > - { SCMP_SYS(inotify_add_watch), 240 }, > - { SCMP_SYS(mbind), 240 }, > - { SCMP_SYS(memfd_create), 240 }, > -#ifdef HAVE_CACHEFLUSH > - { SCMP_SYS(cacheflush), 240 }, > -#endif > - { SCMP_SYS(sysinfo), 240 }, > +static const struct QemuSeccompSyscall blacklist[] = { > + { SCMP_SYS(reboot), 255 }, > + { SCMP_SYS(swapon), 255 }, > + { SCMP_SYS(swapoff), 255 }, > + { SCMP_SYS(syslog), 255 }, > + { SCMP_SYS(mount), 255 }, > + { SCMP_SYS(umount), 255 }, > + { SCMP_SYS(kexec_load), 255 }, > + { SCMP_SYS(afs_syscall), 255 }, > + { SCMP_SYS(break), 255 }, > + { SCMP_SYS(ftime), 255 }, > + { SCMP_SYS(getpmsg), 255 }, > + { SCMP_SYS(gtty), 255 }, > + { SCMP_SYS(lock), 255 }, > + { SCMP_SYS(mpx), 255 }, > + { SCMP_SYS(prof), 255 }, > + { SCMP_SYS(profil), 255 }, > + { SCMP_SYS(putpmsg), 255 }, > + { SCMP_SYS(security), 255 }, > + { SCMP_SYS(stty), 255 }, > + { SCMP_SYS(tuxcall), 255 }, > + { SCMP_SYS(ulimit), 255 }, > + { SCMP_SYS(vserver), 255 }, > }; > > int seccomp_start(void) > @@ -262,19 +62,19 @@ int seccomp_start(void) > unsigned int i = 0; > scmp_filter_ctx ctx; > > - ctx = seccomp_init(SCMP_ACT_KILL); > + ctx = seccomp_init(SCMP_ACT_ALLOW); > if (ctx == NULL) { > rc = -1; > goto seccomp_return; > } > > - for (i = 0; i < ARRAY_SIZE(seccomp_whitelist); i++) { > - rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, seccomp_whitelist[i].num, 0); > + for (i = 0; i < ARRAY_SIZE(blacklist); i++) { > + rc = seccomp_rule_add(ctx, SCMP_ACT_KILL, blacklist[i].num, 0); > if (rc < 0) { > goto seccomp_return; > } > - rc = seccomp_syscall_priority(ctx, seccomp_whitelist[i].num, > - seccomp_whitelist[i].priority); > + rc = seccomp_syscall_priority(ctx, blacklist[i].num, > + blacklist[i].priority); > if (rc < 0) { > goto seccomp_return; > } > diff --git a/vl.c b/vl.c > index fb6b2efafa..15b98800e9 100644 > --- a/vl.c > +++ b/vl.c > @@ -1030,13 +1030,16 @@ static int bt_parse(const char *opt) > > static int parse_sandbox(void *opaque, QemuOpts *opts, Error **errp) > { > - /* FIXME: change this to true for 1.3 */ > if (qemu_opt_get_bool(opts, "enable", false)) { > #ifdef CONFIG_SECCOMP > if (seccomp_start() < 0) { > error_report("failed to install seccomp syscall filter " > "in the kernel"); > return -1; > + } else { > + error_report("warning: -sandbox on has been converted to blacklist " > + "approach. Refer to the manual for new sandbox " > + "options in order to increase security."); This is just noise that will pollute everyone's logs. This kind of thing can just be put in the QEMU release notes. If you remove this, then Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v3 1/6] seccomp: changing from whitelist to blacklist 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 1/6] seccomp: changing from whitelist to blacklist Eduardo Otubo 2017-08-02 12:25 ` Daniel P. Berrange @ 2017-08-03 16:54 ` Thomas Huth 2017-08-11 9:51 ` Eduardo Otubo 1 sibling, 1 reply; 20+ messages in thread From: Thomas Huth @ 2017-08-03 16:54 UTC (permalink / raw) To: Eduardo Otubo, qemu-devel; +Cc: berrange, pbonzini On 28.07.2017 14:10, Eduardo Otubo wrote: > This patch changes the default behavior of the seccomp filter from > whitelist to blacklist. By default now all system calls are allowed and > a small black list of definitely forbidden ones was created. > > Signed-off-by: Eduardo Otubo <otubo@redhat.com> > --- > qemu-seccomp.c | 256 +++++++-------------------------------------------------- > vl.c | 5 +- > 2 files changed, 32 insertions(+), 229 deletions(-) > > diff --git a/qemu-seccomp.c b/qemu-seccomp.c > index df75d9c471..f8877b07b5 100644 > --- a/qemu-seccomp.c > +++ b/qemu-seccomp.c > @@ -31,229 +31,29 @@ struct QemuSeccompSyscall { > uint8_t priority; > }; [...] > +static const struct QemuSeccompSyscall blacklist[] = { > + { SCMP_SYS(reboot), 255 }, > + { SCMP_SYS(swapon), 255 }, > + { SCMP_SYS(swapoff), 255 }, > + { SCMP_SYS(syslog), 255 }, > + { SCMP_SYS(mount), 255 }, > + { SCMP_SYS(umount), 255 }, > + { SCMP_SYS(kexec_load), 255 }, > + { SCMP_SYS(afs_syscall), 255 }, > + { SCMP_SYS(break), 255 }, > + { SCMP_SYS(ftime), 255 }, > + { SCMP_SYS(getpmsg), 255 }, > + { SCMP_SYS(gtty), 255 }, > + { SCMP_SYS(lock), 255 }, > + { SCMP_SYS(mpx), 255 }, > + { SCMP_SYS(prof), 255 }, > + { SCMP_SYS(profil), 255 }, > + { SCMP_SYS(putpmsg), 255 }, > + { SCMP_SYS(security), 255 }, > + { SCMP_SYS(stty), 255 }, > + { SCMP_SYS(tuxcall), 255 }, > + { SCMP_SYS(ulimit), 255 }, > + { SCMP_SYS(vserver), 255 }, > }; Does it makes sense to still keep the priority field? Everything is now marked with the value 255 and I currently fail to see the point of priorities when using blacklisting ... so maybe just get rid of it? Thomas ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v3 1/6] seccomp: changing from whitelist to blacklist 2017-08-03 16:54 ` Thomas Huth @ 2017-08-11 9:51 ` Eduardo Otubo 2017-08-11 10:10 ` Daniel P. Berrange 0 siblings, 1 reply; 20+ messages in thread From: Eduardo Otubo @ 2017-08-11 9:51 UTC (permalink / raw) To: Thomas Huth; +Cc: qemu-devel, berrange, pbonzini On Thu, Aug 03, 2017 at 06:54:15PM +0200, Thomas Huth wrote: > On 28.07.2017 14:10, Eduardo Otubo wrote: > > This patch changes the default behavior of the seccomp filter from > > whitelist to blacklist. By default now all system calls are allowed and > > a small black list of definitely forbidden ones was created. > > > > Signed-off-by: Eduardo Otubo <otubo@redhat.com> > > --- > > qemu-seccomp.c | 256 +++++++-------------------------------------------------- > > vl.c | 5 +- > > 2 files changed, 32 insertions(+), 229 deletions(-) > > > > diff --git a/qemu-seccomp.c b/qemu-seccomp.c > > index df75d9c471..f8877b07b5 100644 > > --- a/qemu-seccomp.c > > +++ b/qemu-seccomp.c > > @@ -31,229 +31,29 @@ struct QemuSeccompSyscall { > > uint8_t priority; > > }; > [...] > > +static const struct QemuSeccompSyscall blacklist[] = { > > + { SCMP_SYS(reboot), 255 }, > > + { SCMP_SYS(swapon), 255 }, > > + { SCMP_SYS(swapoff), 255 }, > > + { SCMP_SYS(syslog), 255 }, > > + { SCMP_SYS(mount), 255 }, > > + { SCMP_SYS(umount), 255 }, > > + { SCMP_SYS(kexec_load), 255 }, > > + { SCMP_SYS(afs_syscall), 255 }, > > + { SCMP_SYS(break), 255 }, > > + { SCMP_SYS(ftime), 255 }, > > + { SCMP_SYS(getpmsg), 255 }, > > + { SCMP_SYS(gtty), 255 }, > > + { SCMP_SYS(lock), 255 }, > > + { SCMP_SYS(mpx), 255 }, > > + { SCMP_SYS(prof), 255 }, > > + { SCMP_SYS(profil), 255 }, > > + { SCMP_SYS(putpmsg), 255 }, > > + { SCMP_SYS(security), 255 }, > > + { SCMP_SYS(stty), 255 }, > > + { SCMP_SYS(tuxcall), 255 }, > > + { SCMP_SYS(ulimit), 255 }, > > + { SCMP_SYS(vserver), 255 }, > > }; > > Does it makes sense to still keep the priority field? Everything is now > marked with the value 255 and I currently fail to see the point of > priorities when using blacklisting ... so maybe just get rid of it? I think that's a fair point here. Don't see much of a point on such a small number of syscalls. I just need to double check the libseccomp docs if I can build the list without any priority information, but I'm pretty sure I've seen this before. -- Eduardo Otubo Senior Software Engineer @ RedHat ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v3 1/6] seccomp: changing from whitelist to blacklist 2017-08-11 9:51 ` Eduardo Otubo @ 2017-08-11 10:10 ` Daniel P. Berrange 0 siblings, 0 replies; 20+ messages in thread From: Daniel P. Berrange @ 2017-08-11 10:10 UTC (permalink / raw) To: Thomas Huth, qemu-devel, pbonzini On Fri, Aug 11, 2017 at 11:51:12AM +0200, Eduardo Otubo wrote: > On Thu, Aug 03, 2017 at 06:54:15PM +0200, Thomas Huth wrote: > > On 28.07.2017 14:10, Eduardo Otubo wrote: > > > This patch changes the default behavior of the seccomp filter from > > > whitelist to blacklist. By default now all system calls are allowed and > > > a small black list of definitely forbidden ones was created. > > > > > > Signed-off-by: Eduardo Otubo <otubo@redhat.com> > > > --- > > > qemu-seccomp.c | 256 +++++++-------------------------------------------------- > > > vl.c | 5 +- > > > 2 files changed, 32 insertions(+), 229 deletions(-) > > > > > > diff --git a/qemu-seccomp.c b/qemu-seccomp.c > > > index df75d9c471..f8877b07b5 100644 > > > --- a/qemu-seccomp.c > > > +++ b/qemu-seccomp.c > > > @@ -31,229 +31,29 @@ struct QemuSeccompSyscall { > > > uint8_t priority; > > > }; > > [...] > > > +static const struct QemuSeccompSyscall blacklist[] = { > > > + { SCMP_SYS(reboot), 255 }, > > > + { SCMP_SYS(swapon), 255 }, > > > + { SCMP_SYS(swapoff), 255 }, > > > + { SCMP_SYS(syslog), 255 }, > > > + { SCMP_SYS(mount), 255 }, > > > + { SCMP_SYS(umount), 255 }, > > > + { SCMP_SYS(kexec_load), 255 }, > > > + { SCMP_SYS(afs_syscall), 255 }, > > > + { SCMP_SYS(break), 255 }, > > > + { SCMP_SYS(ftime), 255 }, > > > + { SCMP_SYS(getpmsg), 255 }, > > > + { SCMP_SYS(gtty), 255 }, > > > + { SCMP_SYS(lock), 255 }, > > > + { SCMP_SYS(mpx), 255 }, > > > + { SCMP_SYS(prof), 255 }, > > > + { SCMP_SYS(profil), 255 }, > > > + { SCMP_SYS(putpmsg), 255 }, > > > + { SCMP_SYS(security), 255 }, > > > + { SCMP_SYS(stty), 255 }, > > > + { SCMP_SYS(tuxcall), 255 }, > > > + { SCMP_SYS(ulimit), 255 }, > > > + { SCMP_SYS(vserver), 255 }, > > > }; > > > > Does it makes sense to still keep the priority field? Everything is now > > marked with the value 255 and I currently fail to see the point of > > priorities when using blacklisting ... so maybe just get rid of it? > > I think that's a fair point here. Don't see much of a point on such a > small number of syscalls. I just need to double check the libseccomp > docs if I can build the list without any priority information, but I'm > pretty sure I've seen this before. Just always pass 255 to libseccomp apis directly. Its merely redundant to store the value 255 in this QEMU specific struct. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 20+ messages in thread
* [Qemu-devel] [PATCH v3 2/6] seccomp: add obsolete argument to command line 2017-07-28 12:10 [Qemu-devel] [PATCH v3 0/6] seccomp: feature refactoring Eduardo Otubo 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 1/6] seccomp: changing from whitelist to blacklist Eduardo Otubo @ 2017-07-28 12:10 ` Eduardo Otubo 2017-08-02 12:33 ` Daniel P. Berrange 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 3/6] seccomp: add elevateprivileges " Eduardo Otubo ` (3 subsequent siblings) 5 siblings, 1 reply; 20+ messages in thread From: Eduardo Otubo @ 2017-07-28 12:10 UTC (permalink / raw) To: qemu-devel; +Cc: thuth, berrange, pbonzini This patch introduces the argument [,obsolete=allow] to the `-sandbox on' option. It allows Qemu to run safely on old system that still relies on old system calls. Signed-off-by: Eduardo Otubo <otubo@redhat.com> --- include/sysemu/seccomp.h | 4 +++- qemu-options.hx | 9 +++++++-- qemu-seccomp.c | 32 +++++++++++++++++++++++++++++++- vl.c | 16 +++++++++++++++- 4 files changed, 56 insertions(+), 5 deletions(-) diff --git a/include/sysemu/seccomp.h b/include/sysemu/seccomp.h index cfc06008cb..7a7bde246b 100644 --- a/include/sysemu/seccomp.h +++ b/include/sysemu/seccomp.h @@ -15,7 +15,9 @@ #ifndef QEMU_SECCOMP_H #define QEMU_SECCOMP_H +#define OBSOLETE 0x0001 + #include <seccomp.h> -int seccomp_start(void); +int seccomp_start(uint8_t seccomp_opts); #endif diff --git a/qemu-options.hx b/qemu-options.hx index 746b5fa75d..54e492f36a 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -4004,13 +4004,18 @@ Old param mode (ARM only). ETEXI DEF("sandbox", HAS_ARG, QEMU_OPTION_sandbox, \ - "-sandbox <arg> Enable seccomp mode 2 system call filter (default 'off').\n", + "-sandbox on[,obsolete=allow] Enable seccomp mode 2 system call filter (default 'off').\n" \ + " obsolete: Allow obsolete system calls\n", QEMU_ARCH_ALL) STEXI -@item -sandbox @var{arg} +@item -sandbox @var{arg}[,obsolete=@var{string}] @findex -sandbox Enable Seccomp mode 2 system call filter. 'on' will enable syscall filtering and 'off' will disable it. The default is 'off'. +@table @option +@item obsolete=@var{string} +Enable Obsolete system calls +@end table ETEXI DEF("readconfig", HAS_ARG, QEMU_OPTION_readconfig, diff --git a/qemu-seccomp.c b/qemu-seccomp.c index f8877b07b5..c6a8b28260 100644 --- a/qemu-seccomp.c +++ b/qemu-seccomp.c @@ -31,6 +31,20 @@ struct QemuSeccompSyscall { uint8_t priority; }; +static const struct QemuSeccompSyscall obsolete[] = { + { SCMP_SYS(readdir), 255 }, + { SCMP_SYS(_sysctl), 255 }, + { SCMP_SYS(bdflush), 255 }, + { SCMP_SYS(create_module), 255 }, + { SCMP_SYS(get_kernel_syms), 255 }, + { SCMP_SYS(query_module), 255 }, + { SCMP_SYS(sgetmask), 255 }, + { SCMP_SYS(ssetmask), 255 }, + { SCMP_SYS(sysfs), 255 }, + { SCMP_SYS(uselib), 255 }, + { SCMP_SYS(ustat), 255 }, +}; + static const struct QemuSeccompSyscall blacklist[] = { { SCMP_SYS(reboot), 255 }, { SCMP_SYS(swapon), 255 }, @@ -56,7 +70,20 @@ static const struct QemuSeccompSyscall blacklist[] = { { SCMP_SYS(vserver), 255 }, }; -int seccomp_start(void) +static int is_obsolete(int syscall) +{ + unsigned int i = 0; + + for (i = 0; i < ARRAY_SIZE(obsolete); i++) { + if (syscall == obsolete[i].num) { + return 1; + } + } + + return 0; +} + +int seccomp_start(uint8_t seccomp_opts) { int rc = 0; unsigned int i = 0; @@ -69,6 +96,9 @@ int seccomp_start(void) } for (i = 0; i < ARRAY_SIZE(blacklist); i++) { + if ((seccomp_opts & OBSOLETE) && is_obsolete(blacklist[i].num)) { + continue; + } rc = seccomp_rule_add(ctx, SCMP_ACT_KILL, blacklist[i].num, 0); if (rc < 0) { goto seccomp_return; diff --git a/vl.c b/vl.c index 15b98800e9..cbe09c94af 100644 --- a/vl.c +++ b/vl.c @@ -271,6 +271,10 @@ static QemuOptsList qemu_sandbox_opts = { .name = "enable", .type = QEMU_OPT_BOOL, }, + { + .name = "obsolete", + .type = QEMU_OPT_STRING, + }, { /* end of list */ } }, }; @@ -1032,7 +1036,17 @@ static int parse_sandbox(void *opaque, QemuOpts *opts, Error **errp) { if (qemu_opt_get_bool(opts, "enable", false)) { #ifdef CONFIG_SECCOMP - if (seccomp_start() < 0) { + uint8_t seccomp_opts = 0x0000; + const char *value = NULL; + + value = qemu_opt_get(opts, "obsolete"); + if (value) { + if (strcmp(value, "allow") == 0) { + seccomp_opts |= OBSOLETE; + } + } + + if (seccomp_start(seccomp_opts) < 0) { error_report("failed to install seccomp syscall filter " "in the kernel"); return -1; -- 2.13.3 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v3 2/6] seccomp: add obsolete argument to command line 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 2/6] seccomp: add obsolete argument to command line Eduardo Otubo @ 2017-08-02 12:33 ` Daniel P. Berrange 2017-08-02 12:38 ` Daniel P. Berrange 2017-08-11 9:12 ` Eduardo Otubo 0 siblings, 2 replies; 20+ messages in thread From: Daniel P. Berrange @ 2017-08-02 12:33 UTC (permalink / raw) To: Eduardo Otubo; +Cc: qemu-devel, thuth, pbonzini On Fri, Jul 28, 2017 at 02:10:36PM +0200, Eduardo Otubo wrote: > This patch introduces the argument [,obsolete=allow] to the `-sandbox on' > option. It allows Qemu to run safely on old system that still relies on > old system calls. > > Signed-off-by: Eduardo Otubo <otubo@redhat.com> > --- > include/sysemu/seccomp.h | 4 +++- > qemu-options.hx | 9 +++++++-- > qemu-seccomp.c | 32 +++++++++++++++++++++++++++++++- > vl.c | 16 +++++++++++++++- > 4 files changed, 56 insertions(+), 5 deletions(-) > > diff --git a/include/sysemu/seccomp.h b/include/sysemu/seccomp.h > index cfc06008cb..7a7bde246b 100644 > --- a/include/sysemu/seccomp.h > +++ b/include/sysemu/seccomp.h > @@ -15,7 +15,9 @@ > #ifndef QEMU_SECCOMP_H > #define QEMU_SECCOMP_H > > +#define OBSOLETE 0x0001 Please namespace this - its far too generic a term to expose to other source files. I'd suggest QEMU_SECCOMP_SET_OBSOLETE > -int seccomp_start(void); > +int seccomp_start(uint8_t seccomp_opts); This only allows for 8 sets. Perhaps its enough, but I'd suggest just using a uint32_t straight away. > diff --git a/qemu-options.hx b/qemu-options.hx > index 746b5fa75d..54e492f36a 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -4004,13 +4004,18 @@ Old param mode (ARM only). > ETEXI > > DEF("sandbox", HAS_ARG, QEMU_OPTION_sandbox, \ > - "-sandbox <arg> Enable seccomp mode 2 system call filter (default 'off').\n", > + "-sandbox on[,obsolete=allow] Enable seccomp mode 2 system call filter (default 'off').\n" \ > + " obsolete: Allow obsolete system calls\n", > QEMU_ARCH_ALL) > STEXI > -@item -sandbox @var{arg} > +@item -sandbox @var{arg}[,obsolete=@var{string}] > @findex -sandbox > Enable Seccomp mode 2 system call filter. 'on' will enable syscall filtering and 'off' will > disable it. The default is 'off'. > +@table @option > +@item obsolete=@var{string} > +Enable Obsolete system calls Lets explain this a bit more. E obsolete system calls that are provided by the kernel, but typically no longer used by modern C library implementations. > +@end table > ETEXI > > DEF("readconfig", HAS_ARG, QEMU_OPTION_readconfig, > diff --git a/qemu-seccomp.c b/qemu-seccomp.c > index f8877b07b5..c6a8b28260 100644 > --- a/qemu-seccomp.c > +++ b/qemu-seccomp.c > @@ -31,6 +31,20 @@ struct QemuSeccompSyscall { > uint8_t priority; > }; > > +static const struct QemuSeccompSyscall obsolete[] = { > + { SCMP_SYS(readdir), 255 }, > + { SCMP_SYS(_sysctl), 255 }, > + { SCMP_SYS(bdflush), 255 }, > + { SCMP_SYS(create_module), 255 }, > + { SCMP_SYS(get_kernel_syms), 255 }, > + { SCMP_SYS(query_module), 255 }, > + { SCMP_SYS(sgetmask), 255 }, > + { SCMP_SYS(ssetmask), 255 }, > + { SCMP_SYS(sysfs), 255 }, > + { SCMP_SYS(uselib), 255 }, > + { SCMP_SYS(ustat), 255 }, > +}; > + > static const struct QemuSeccompSyscall blacklist[] = { > { SCMP_SYS(reboot), 255 }, > { SCMP_SYS(swapon), 255 }, > @@ -56,7 +70,20 @@ static const struct QemuSeccompSyscall blacklist[] = { > { SCMP_SYS(vserver), 255 }, > }; > > -int seccomp_start(void) > +static int is_obsolete(int syscall) > +{ > + unsigned int i = 0; > + > + for (i = 0; i < ARRAY_SIZE(obsolete); i++) { > + if (syscall == obsolete[i].num) { > + return 1; > + } > + } > + > + return 0; > +} > + > +int seccomp_start(uint8_t seccomp_opts) > { > int rc = 0; > unsigned int i = 0; > @@ -69,6 +96,9 @@ int seccomp_start(void) > } > > for (i = 0; i < ARRAY_SIZE(blacklist); i++) { > + if ((seccomp_opts & OBSOLETE) && is_obsolete(blacklist[i].num)) { > + continue; > + } IMHO this is leading to a rather inefficient approach. Why not extend QemuSeccompSyscall struct so that it has another field to list which set it belongs to. Then you can do static const struct QemuSeccompSyscall blacklist[] = { { SCMP_SYS(reboot), 255, QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(swapon), 255, QEMU_SECCOMP_SET_DEFAULT }, .... { SCMP_SYS(readdir), 255, QEMU_SECCOMP_SET_OBSOLETE }, { SCMP_SYS(_sysctl), 255, QEMU_SECCOMP_SET_OBSOLETE }, ... And then to process this you can do for (i = 0; i < ARRAY_SIZE(blacklist); i++) { if (blacklist[i].set != QEMU_SECCOMP_SET_OBSOLETE && blacklist[i].set & seccomp_opts) { continue; } > rc = seccomp_rule_add(ctx, SCMP_ACT_KILL, blacklist[i].num, 0); > if (rc < 0) { > goto seccomp_return; > diff --git a/vl.c b/vl.c > index 15b98800e9..cbe09c94af 100644 > --- a/vl.c > +++ b/vl.c > @@ -271,6 +271,10 @@ static QemuOptsList qemu_sandbox_opts = { > .name = "enable", > .type = QEMU_OPT_BOOL, > }, > + { > + .name = "obsolete", > + .type = QEMU_OPT_STRING, > + }, > { /* end of list */ } > }, > }; > @@ -1032,7 +1036,17 @@ static int parse_sandbox(void *opaque, QemuOpts *opts, Error **errp) > { > if (qemu_opt_get_bool(opts, "enable", false)) { > #ifdef CONFIG_SECCOMP > - if (seccomp_start() < 0) { > + uint8_t seccomp_opts = 0x0000; > + const char *value = NULL; > + > + value = qemu_opt_get(opts, "obsolete"); > + if (value) { > + if (strcmp(value, "allow") == 0) { > + seccomp_opts |= OBSOLETE; > + } > + } IIUC, the values will all be booleans, so we should just use if (qemu_opt_get_bool(opts, "obsolete", false)) seccomp_opts |= OBSOLETE; > + > + if (seccomp_start(seccomp_opts) < 0) { > error_report("failed to install seccomp syscall filter " > "in the kernel"); > return -1; Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v3 2/6] seccomp: add obsolete argument to command line 2017-08-02 12:33 ` Daniel P. Berrange @ 2017-08-02 12:38 ` Daniel P. Berrange 2017-08-11 9:12 ` Eduardo Otubo 1 sibling, 0 replies; 20+ messages in thread From: Daniel P. Berrange @ 2017-08-02 12:38 UTC (permalink / raw) To: Eduardo Otubo; +Cc: pbonzini, thuth, qemu-devel On Wed, Aug 02, 2017 at 01:33:56PM +0100, Daniel P. Berrange wrote: > On Fri, Jul 28, 2017 at 02:10:36PM +0200, Eduardo Otubo wrote: > > This patch introduces the argument [,obsolete=allow] to the `-sandbox on' > > option. It allows Qemu to run safely on old system that still relies on > > old system calls. > > > > Signed-off-by: Eduardo Otubo <otubo@redhat.com> > > --- > > include/sysemu/seccomp.h | 4 +++- > > qemu-options.hx | 9 +++++++-- > > qemu-seccomp.c | 32 +++++++++++++++++++++++++++++++- > > vl.c | 16 +++++++++++++++- > > 4 files changed, 56 insertions(+), 5 deletions(-) > > @@ -1032,7 +1036,17 @@ static int parse_sandbox(void *opaque, QemuOpts *opts, Error **errp) > > { > > if (qemu_opt_get_bool(opts, "enable", false)) { > > #ifdef CONFIG_SECCOMP > > - if (seccomp_start() < 0) { > > + uint8_t seccomp_opts = 0x0000; > > + const char *value = NULL; > > + > > + value = qemu_opt_get(opts, "obsolete"); > > + if (value) { > > + if (strcmp(value, "allow") == 0) { > > + seccomp_opts |= OBSOLETE; > > + } > > + } > > IIUC, the values will all be booleans, so we should just use > > if (qemu_opt_get_bool(opts, "obsolete", false)) > seccomp_opts |= OBSOLETE; Oh ignore this. I see from the next patch, we can't treat it as a boolean. We should however explicitly look for 'value == deny', and then reject all other values with an error message > > > + > > + if (seccomp_start(seccomp_opts) < 0) { > > error_report("failed to install seccomp syscall filter " > > "in the kernel"); > > return -1; Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v3 2/6] seccomp: add obsolete argument to command line 2017-08-02 12:33 ` Daniel P. Berrange 2017-08-02 12:38 ` Daniel P. Berrange @ 2017-08-11 9:12 ` Eduardo Otubo 2017-08-11 9:25 ` Daniel P. Berrange 2017-08-11 9:49 ` Eduardo Otubo 1 sibling, 2 replies; 20+ messages in thread From: Eduardo Otubo @ 2017-08-11 9:12 UTC (permalink / raw) To: Daniel P. Berrange; +Cc: qemu-devel, thuth, pbonzini On Wed, Aug 02, 2017 at 01:33:56PM +0100, Daniel P. Berrange wrote: > On Fri, Jul 28, 2017 at 02:10:36PM +0200, Eduardo Otubo wrote: > > This patch introduces the argument [,obsolete=allow] to the `-sandbox on' > > option. It allows Qemu to run safely on old system that still relies on > > old system calls. > > > > Signed-off-by: Eduardo Otubo <otubo@redhat.com> > > --- > > include/sysemu/seccomp.h | 4 +++- > > qemu-options.hx | 9 +++++++-- > > qemu-seccomp.c | 32 +++++++++++++++++++++++++++++++- > > vl.c | 16 +++++++++++++++- > > 4 files changed, 56 insertions(+), 5 deletions(-) > > > > diff --git a/include/sysemu/seccomp.h b/include/sysemu/seccomp.h > > index cfc06008cb..7a7bde246b 100644 > > --- a/include/sysemu/seccomp.h > > +++ b/include/sysemu/seccomp.h > > @@ -15,7 +15,9 @@ > > #ifndef QEMU_SECCOMP_H > > #define QEMU_SECCOMP_H > > > > +#define OBSOLETE 0x0001 > > Please namespace this - its far too generic a term to expose to other > source files. I'd suggest > > QEMU_SECCOMP_SET_OBSOLETE > > > -int seccomp_start(void); > > +int seccomp_start(uint8_t seccomp_opts); > > This only allows for 8 sets. Perhaps its enough, but I'd suggest > just using a uint32_t straight away. > > > diff --git a/qemu-options.hx b/qemu-options.hx > > index 746b5fa75d..54e492f36a 100644 > > --- a/qemu-options.hx > > +++ b/qemu-options.hx > > @@ -4004,13 +4004,18 @@ Old param mode (ARM only). > > ETEXI > > > > DEF("sandbox", HAS_ARG, QEMU_OPTION_sandbox, \ > > - "-sandbox <arg> Enable seccomp mode 2 system call filter (default 'off').\n", > > + "-sandbox on[,obsolete=allow] Enable seccomp mode 2 system call filter (default 'off').\n" \ > > + " obsolete: Allow obsolete system calls\n", > > QEMU_ARCH_ALL) > > STEXI > > -@item -sandbox @var{arg} > > +@item -sandbox @var{arg}[,obsolete=@var{string}] > > @findex -sandbox > > Enable Seccomp mode 2 system call filter. 'on' will enable syscall filtering and 'off' will > > disable it. The default is 'off'. > > +@table @option > > +@item obsolete=@var{string} > > +Enable Obsolete system calls > > Lets explain this a bit more. > > E obsolete system calls that are provided by the kernel, but typically no > longer used by modern C library implementations. > > > +@end table > > ETEXI > > > > DEF("readconfig", HAS_ARG, QEMU_OPTION_readconfig, > > diff --git a/qemu-seccomp.c b/qemu-seccomp.c > > index f8877b07b5..c6a8b28260 100644 > > --- a/qemu-seccomp.c > > +++ b/qemu-seccomp.c > > @@ -31,6 +31,20 @@ struct QemuSeccompSyscall { > > uint8_t priority; > > }; > > > > +static const struct QemuSeccompSyscall obsolete[] = { > > + { SCMP_SYS(readdir), 255 }, > > + { SCMP_SYS(_sysctl), 255 }, > > + { SCMP_SYS(bdflush), 255 }, > > + { SCMP_SYS(create_module), 255 }, > > + { SCMP_SYS(get_kernel_syms), 255 }, > > + { SCMP_SYS(query_module), 255 }, > > + { SCMP_SYS(sgetmask), 255 }, > > + { SCMP_SYS(ssetmask), 255 }, > > + { SCMP_SYS(sysfs), 255 }, > > + { SCMP_SYS(uselib), 255 }, > > + { SCMP_SYS(ustat), 255 }, > > +}; > > + > > static const struct QemuSeccompSyscall blacklist[] = { > > { SCMP_SYS(reboot), 255 }, > > { SCMP_SYS(swapon), 255 }, > > @@ -56,7 +70,20 @@ static const struct QemuSeccompSyscall blacklist[] = { > > { SCMP_SYS(vserver), 255 }, > > }; > > > > -int seccomp_start(void) > > +static int is_obsolete(int syscall) > > +{ > > + unsigned int i = 0; > > + > > + for (i = 0; i < ARRAY_SIZE(obsolete); i++) { > > + if (syscall == obsolete[i].num) { > > + return 1; > > + } > > + } > > + > > + return 0; > > +} > > + > > +int seccomp_start(uint8_t seccomp_opts) > > { > > int rc = 0; > > unsigned int i = 0; > > @@ -69,6 +96,9 @@ int seccomp_start(void) > > } > > > > for (i = 0; i < ARRAY_SIZE(blacklist); i++) { > > + if ((seccomp_opts & OBSOLETE) && is_obsolete(blacklist[i].num)) { > > + continue; > > + } > > IMHO this is leading to a rather inefficient approach. Why not extend > QemuSeccompSyscall struct so that it has another field to list which > set it belongs to. Then you can do > > > static const struct QemuSeccompSyscall blacklist[] = { > { SCMP_SYS(reboot), 255, QEMU_SECCOMP_SET_DEFAULT }, > { SCMP_SYS(swapon), 255, QEMU_SECCOMP_SET_DEFAULT }, > .... > { SCMP_SYS(readdir), 255, QEMU_SECCOMP_SET_OBSOLETE }, > { SCMP_SYS(_sysctl), 255, QEMU_SECCOMP_SET_OBSOLETE }, > ... > > And then to process this you can do > > for (i = 0; i < ARRAY_SIZE(blacklist); i++) { > if (blacklist[i].set != QEMU_SECCOMP_SET_OBSOLETE && > blacklist[i].set & seccomp_opts) { > continue; I agree with all the rest except with this one. This would require a change on libseccomp itself. Not sure a change on the library would be suited for now. I'm working and reviewing all the comments today, so I'll try to post a new version later today. > } > > > > rc = seccomp_rule_add(ctx, SCMP_ACT_KILL, blacklist[i].num, 0); > > if (rc < 0) { > > goto seccomp_return; > > diff --git a/vl.c b/vl.c > > index 15b98800e9..cbe09c94af 100644 > > --- a/vl.c > > +++ b/vl.c > > @@ -271,6 +271,10 @@ static QemuOptsList qemu_sandbox_opts = { > > .name = "enable", > > .type = QEMU_OPT_BOOL, > > }, > > + { > > + .name = "obsolete", > > + .type = QEMU_OPT_STRING, > > + }, > > { /* end of list */ } > > }, > > }; > > @@ -1032,7 +1036,17 @@ static int parse_sandbox(void *opaque, QemuOpts *opts, Error **errp) > > { > > if (qemu_opt_get_bool(opts, "enable", false)) { > > #ifdef CONFIG_SECCOMP > > - if (seccomp_start() < 0) { > > + uint8_t seccomp_opts = 0x0000; > > + const char *value = NULL; > > + > > + value = qemu_opt_get(opts, "obsolete"); > > + if (value) { > > + if (strcmp(value, "allow") == 0) { > > + seccomp_opts |= OBSOLETE; > > + } > > + } > > IIUC, the values will all be booleans, so we should just use > > if (qemu_opt_get_bool(opts, "obsolete", false)) > seccomp_opts |= OBSOLETE; > > > + > > + if (seccomp_start(seccomp_opts) < 0) { > > error_report("failed to install seccomp syscall filter " > > "in the kernel"); > > return -1; > > Regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- Eduardo Otubo Senior Software Engineer @ RedHat ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v3 2/6] seccomp: add obsolete argument to command line 2017-08-11 9:12 ` Eduardo Otubo @ 2017-08-11 9:25 ` Daniel P. Berrange 2017-08-11 9:49 ` Eduardo Otubo 1 sibling, 0 replies; 20+ messages in thread From: Daniel P. Berrange @ 2017-08-11 9:25 UTC (permalink / raw) To: qemu-devel, thuth, pbonzini On Fri, Aug 11, 2017 at 11:12:48AM +0200, Eduardo Otubo wrote: > On Wed, Aug 02, 2017 at 01:33:56PM +0100, Daniel P. Berrange wrote: > > On Fri, Jul 28, 2017 at 02:10:36PM +0200, Eduardo Otubo wrote: > > > This patch introduces the argument [,obsolete=allow] to the `-sandbox on' > > > option. It allows Qemu to run safely on old system that still relies on > > > old system calls. > > > > > > Signed-off-by: Eduardo Otubo <otubo@redhat.com> > > > --- > > > include/sysemu/seccomp.h | 4 +++- > > > qemu-options.hx | 9 +++++++-- > > > qemu-seccomp.c | 32 +++++++++++++++++++++++++++++++- > > > vl.c | 16 +++++++++++++++- > > > 4 files changed, 56 insertions(+), 5 deletions(-) > > > > > > diff --git a/include/sysemu/seccomp.h b/include/sysemu/seccomp.h > > > index cfc06008cb..7a7bde246b 100644 > > > --- a/include/sysemu/seccomp.h > > > +++ b/include/sysemu/seccomp.h > > > @@ -15,7 +15,9 @@ > > > #ifndef QEMU_SECCOMP_H > > > #define QEMU_SECCOMP_H > > > > > > +#define OBSOLETE 0x0001 > > > > Please namespace this - its far too generic a term to expose to other > > source files. I'd suggest > > > > QEMU_SECCOMP_SET_OBSOLETE > > > > > -int seccomp_start(void); > > > +int seccomp_start(uint8_t seccomp_opts); > > > > This only allows for 8 sets. Perhaps its enough, but I'd suggest > > just using a uint32_t straight away. > > > > > diff --git a/qemu-options.hx b/qemu-options.hx > > > index 746b5fa75d..54e492f36a 100644 > > > --- a/qemu-options.hx > > > +++ b/qemu-options.hx > > > @@ -4004,13 +4004,18 @@ Old param mode (ARM only). > > > ETEXI > > > > > > DEF("sandbox", HAS_ARG, QEMU_OPTION_sandbox, \ > > > - "-sandbox <arg> Enable seccomp mode 2 system call filter (default 'off').\n", > > > + "-sandbox on[,obsolete=allow] Enable seccomp mode 2 system call filter (default 'off').\n" \ > > > + " obsolete: Allow obsolete system calls\n", > > > QEMU_ARCH_ALL) > > > STEXI > > > -@item -sandbox @var{arg} > > > +@item -sandbox @var{arg}[,obsolete=@var{string}] > > > @findex -sandbox > > > Enable Seccomp mode 2 system call filter. 'on' will enable syscall filtering and 'off' will > > > disable it. The default is 'off'. > > > +@table @option > > > +@item obsolete=@var{string} > > > +Enable Obsolete system calls > > > > Lets explain this a bit more. > > > > E obsolete system calls that are provided by the kernel, but typically no > > longer used by modern C library implementations. > > > > > +@end table > > > ETEXI > > > > > > DEF("readconfig", HAS_ARG, QEMU_OPTION_readconfig, > > > diff --git a/qemu-seccomp.c b/qemu-seccomp.c > > > index f8877b07b5..c6a8b28260 100644 > > > --- a/qemu-seccomp.c > > > +++ b/qemu-seccomp.c > > > @@ -31,6 +31,20 @@ struct QemuSeccompSyscall { > > > uint8_t priority; > > > }; > > > > > > +static const struct QemuSeccompSyscall obsolete[] = { > > > + { SCMP_SYS(readdir), 255 }, > > > + { SCMP_SYS(_sysctl), 255 }, > > > + { SCMP_SYS(bdflush), 255 }, > > > + { SCMP_SYS(create_module), 255 }, > > > + { SCMP_SYS(get_kernel_syms), 255 }, > > > + { SCMP_SYS(query_module), 255 }, > > > + { SCMP_SYS(sgetmask), 255 }, > > > + { SCMP_SYS(ssetmask), 255 }, > > > + { SCMP_SYS(sysfs), 255 }, > > > + { SCMP_SYS(uselib), 255 }, > > > + { SCMP_SYS(ustat), 255 }, > > > +}; > > > + > > > static const struct QemuSeccompSyscall blacklist[] = { > > > { SCMP_SYS(reboot), 255 }, > > > { SCMP_SYS(swapon), 255 }, > > > @@ -56,7 +70,20 @@ static const struct QemuSeccompSyscall blacklist[] = { > > > { SCMP_SYS(vserver), 255 }, > > > }; > > > > > > -int seccomp_start(void) > > > +static int is_obsolete(int syscall) > > > +{ > > > + unsigned int i = 0; > > > + > > > + for (i = 0; i < ARRAY_SIZE(obsolete); i++) { > > > + if (syscall == obsolete[i].num) { > > > + return 1; > > > + } > > > + } > > > + > > > + return 0; > > > +} > > > + > > > +int seccomp_start(uint8_t seccomp_opts) > > > { > > > int rc = 0; > > > unsigned int i = 0; > > > @@ -69,6 +96,9 @@ int seccomp_start(void) > > > } > > > > > > for (i = 0; i < ARRAY_SIZE(blacklist); i++) { > > > + if ((seccomp_opts & OBSOLETE) && is_obsolete(blacklist[i].num)) { > > > + continue; > > > + } > > > > IMHO this is leading to a rather inefficient approach. Why not extend > > QemuSeccompSyscall struct so that it has another field to list which > > set it belongs to. Then you can do > > > > > > static const struct QemuSeccompSyscall blacklist[] = { > > { SCMP_SYS(reboot), 255, QEMU_SECCOMP_SET_DEFAULT }, > > { SCMP_SYS(swapon), 255, QEMU_SECCOMP_SET_DEFAULT }, > > .... > > { SCMP_SYS(readdir), 255, QEMU_SECCOMP_SET_OBSOLETE }, > > { SCMP_SYS(_sysctl), 255, QEMU_SECCOMP_SET_OBSOLETE }, > > ... > > > > And then to process this you can do > > > > for (i = 0; i < ARRAY_SIZE(blacklist); i++) { > > if (blacklist[i].set != QEMU_SECCOMP_SET_OBSOLETE && > > blacklist[i].set & seccomp_opts) { > > continue; > > I agree with all the rest except with this one. This would require a > change on libseccomp itself. Not sure a change on the library would be > suited for now. Huh ? QemuSeccompSyscall is a QEMU defined struct, and this is QEMU code. The change I describe here takes place before we even call libseccomp and doesn't affect any APIs we call in the future Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v3 2/6] seccomp: add obsolete argument to command line 2017-08-11 9:12 ` Eduardo Otubo 2017-08-11 9:25 ` Daniel P. Berrange @ 2017-08-11 9:49 ` Eduardo Otubo 1 sibling, 0 replies; 20+ messages in thread From: Eduardo Otubo @ 2017-08-11 9:49 UTC (permalink / raw) To: Daniel P. Berrange, qemu-devel, thuth, pbonzini On Fri, Aug 11, 2017 at 11:12:48AM +0200, Eduardo Otubo wrote: > On Wed, Aug 02, 2017 at 01:33:56PM +0100, Daniel P. Berrange wrote: > > On Fri, Jul 28, 2017 at 02:10:36PM +0200, Eduardo Otubo wrote: > > > This patch introduces the argument [,obsolete=allow] to the `-sandbox on' > > > option. It allows Qemu to run safely on old system that still relies on > > > old system calls. > > > > > > Signed-off-by: Eduardo Otubo <otubo@redhat.com> > > > --- > > > include/sysemu/seccomp.h | 4 +++- > > > qemu-options.hx | 9 +++++++-- > > > qemu-seccomp.c | 32 +++++++++++++++++++++++++++++++- > > > vl.c | 16 +++++++++++++++- > > > 4 files changed, 56 insertions(+), 5 deletions(-) > > > > > > diff --git a/include/sysemu/seccomp.h b/include/sysemu/seccomp.h > > > index cfc06008cb..7a7bde246b 100644 > > > --- a/include/sysemu/seccomp.h > > > +++ b/include/sysemu/seccomp.h > > > @@ -15,7 +15,9 @@ > > > #ifndef QEMU_SECCOMP_H > > > #define QEMU_SECCOMP_H > > > > > > +#define OBSOLETE 0x0001 > > > > Please namespace this - its far too generic a term to expose to other > > source files. I'd suggest > > > > QEMU_SECCOMP_SET_OBSOLETE > > > > > -int seccomp_start(void); > > > +int seccomp_start(uint8_t seccomp_opts); > > > > This only allows for 8 sets. Perhaps its enough, but I'd suggest > > just using a uint32_t straight away. > > > > > diff --git a/qemu-options.hx b/qemu-options.hx > > > index 746b5fa75d..54e492f36a 100644 > > > --- a/qemu-options.hx > > > +++ b/qemu-options.hx > > > @@ -4004,13 +4004,18 @@ Old param mode (ARM only). > > > ETEXI > > > > > > DEF("sandbox", HAS_ARG, QEMU_OPTION_sandbox, \ > > > - "-sandbox <arg> Enable seccomp mode 2 system call filter (default 'off').\n", > > > + "-sandbox on[,obsolete=allow] Enable seccomp mode 2 system call filter (default 'off').\n" \ > > > + " obsolete: Allow obsolete system calls\n", > > > QEMU_ARCH_ALL) > > > STEXI > > > -@item -sandbox @var{arg} > > > +@item -sandbox @var{arg}[,obsolete=@var{string}] > > > @findex -sandbox > > > Enable Seccomp mode 2 system call filter. 'on' will enable syscall filtering and 'off' will > > > disable it. The default is 'off'. > > > +@table @option > > > +@item obsolete=@var{string} > > > +Enable Obsolete system calls > > > > Lets explain this a bit more. > > > > E obsolete system calls that are provided by the kernel, but typically no > > longer used by modern C library implementations. > > > > > +@end table > > > ETEXI > > > > > > DEF("readconfig", HAS_ARG, QEMU_OPTION_readconfig, > > > diff --git a/qemu-seccomp.c b/qemu-seccomp.c > > > index f8877b07b5..c6a8b28260 100644 > > > --- a/qemu-seccomp.c > > > +++ b/qemu-seccomp.c > > > @@ -31,6 +31,20 @@ struct QemuSeccompSyscall { > > > uint8_t priority; > > > }; > > > > > > +static const struct QemuSeccompSyscall obsolete[] = { > > > + { SCMP_SYS(readdir), 255 }, > > > + { SCMP_SYS(_sysctl), 255 }, > > > + { SCMP_SYS(bdflush), 255 }, > > > + { SCMP_SYS(create_module), 255 }, > > > + { SCMP_SYS(get_kernel_syms), 255 }, > > > + { SCMP_SYS(query_module), 255 }, > > > + { SCMP_SYS(sgetmask), 255 }, > > > + { SCMP_SYS(ssetmask), 255 }, > > > + { SCMP_SYS(sysfs), 255 }, > > > + { SCMP_SYS(uselib), 255 }, > > > + { SCMP_SYS(ustat), 255 }, > > > +}; > > > + > > > static const struct QemuSeccompSyscall blacklist[] = { > > > { SCMP_SYS(reboot), 255 }, > > > { SCMP_SYS(swapon), 255 }, > > > @@ -56,7 +70,20 @@ static const struct QemuSeccompSyscall blacklist[] = { > > > { SCMP_SYS(vserver), 255 }, > > > }; > > > > > > -int seccomp_start(void) > > > +static int is_obsolete(int syscall) > > > +{ > > > + unsigned int i = 0; > > > + > > > + for (i = 0; i < ARRAY_SIZE(obsolete); i++) { > > > + if (syscall == obsolete[i].num) { > > > + return 1; > > > + } > > > + } > > > + > > > + return 0; > > > +} > > > + > > > +int seccomp_start(uint8_t seccomp_opts) > > > { > > > int rc = 0; > > > unsigned int i = 0; > > > @@ -69,6 +96,9 @@ int seccomp_start(void) > > > } > > > > > > for (i = 0; i < ARRAY_SIZE(blacklist); i++) { > > > + if ((seccomp_opts & OBSOLETE) && is_obsolete(blacklist[i].num)) { > > > + continue; > > > + } > > > > IMHO this is leading to a rather inefficient approach. Why not extend > > QemuSeccompSyscall struct so that it has another field to list which > > set it belongs to. Then you can do > > > > > > static const struct QemuSeccompSyscall blacklist[] = { > > { SCMP_SYS(reboot), 255, QEMU_SECCOMP_SET_DEFAULT }, > > { SCMP_SYS(swapon), 255, QEMU_SECCOMP_SET_DEFAULT }, > > .... > > { SCMP_SYS(readdir), 255, QEMU_SECCOMP_SET_OBSOLETE }, > > { SCMP_SYS(_sysctl), 255, QEMU_SECCOMP_SET_OBSOLETE }, > > ... > > > > And then to process this you can do > > > > for (i = 0; i < ARRAY_SIZE(blacklist); i++) { > > if (blacklist[i].set != QEMU_SECCOMP_SET_OBSOLETE && > > blacklist[i].set & seccomp_opts) { > > continue; > > I agree with all the rest except with this one. This would require a > change on libseccomp itself. Not sure a change on the library would be > suited for now. > > I'm working and reviewing all the comments today, so I'll try to post > a new version later today. Let me reply this before you do: My mistake, this structure model does not belong to libseccomp. My bad. So yeah, I can change it to optimize things here. Thanks for the idea :) > > > } > > > > > > > rc = seccomp_rule_add(ctx, SCMP_ACT_KILL, blacklist[i].num, 0); > > > if (rc < 0) { > > > goto seccomp_return; > > > diff --git a/vl.c b/vl.c > > > index 15b98800e9..cbe09c94af 100644 > > > --- a/vl.c > > > +++ b/vl.c > > > @@ -271,6 +271,10 @@ static QemuOptsList qemu_sandbox_opts = { > > > .name = "enable", > > > .type = QEMU_OPT_BOOL, > > > }, > > > + { > > > + .name = "obsolete", > > > + .type = QEMU_OPT_STRING, > > > + }, > > > { /* end of list */ } > > > }, > > > }; > > > @@ -1032,7 +1036,17 @@ static int parse_sandbox(void *opaque, QemuOpts *opts, Error **errp) > > > { > > > if (qemu_opt_get_bool(opts, "enable", false)) { > > > #ifdef CONFIG_SECCOMP > > > - if (seccomp_start() < 0) { > > > + uint8_t seccomp_opts = 0x0000; > > > + const char *value = NULL; > > > + > > > + value = qemu_opt_get(opts, "obsolete"); > > > + if (value) { > > > + if (strcmp(value, "allow") == 0) { > > > + seccomp_opts |= OBSOLETE; > > > + } > > > + } > > > > IIUC, the values will all be booleans, so we should just use > > > > if (qemu_opt_get_bool(opts, "obsolete", false)) > > seccomp_opts |= OBSOLETE; > > > > > + > > > + if (seccomp_start(seccomp_opts) < 0) { > > > error_report("failed to install seccomp syscall filter " > > > "in the kernel"); > > > return -1; > > > > Regards, > > Daniel > > -- > > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > > |: https://libvirt.org -o- https://fstop138.berrange.com :| > > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| > > -- > Eduardo Otubo > Senior Software Engineer @ RedHat -- Eduardo Otubo Senior Software Engineer @ RedHat ^ permalink raw reply [flat|nested] 20+ messages in thread
* [Qemu-devel] [PATCH v3 3/6] seccomp: add elevateprivileges argument to command line 2017-07-28 12:10 [Qemu-devel] [PATCH v3 0/6] seccomp: feature refactoring Eduardo Otubo 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 1/6] seccomp: changing from whitelist to blacklist Eduardo Otubo 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 2/6] seccomp: add obsolete argument to command line Eduardo Otubo @ 2017-07-28 12:10 ` Eduardo Otubo 2017-08-02 12:37 ` Daniel P. Berrange 2017-08-03 16:59 ` Thomas Huth 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 4/6] seccomp: add spawn " Eduardo Otubo ` (2 subsequent siblings) 5 siblings, 2 replies; 20+ messages in thread From: Eduardo Otubo @ 2017-07-28 12:10 UTC (permalink / raw) To: qemu-devel; +Cc: thuth, berrange, pbonzini This patch introduces the new argument [,elevateprivileges=allow|deny|children] to the `-sandbox on'. It allows or denies Qemu process to elevate its privileges by blacklisting all set*uid|gid system calls. The 'children' option will let forks and execves run unprivileged. Signed-off-by: Eduardo Otubo <otubo@redhat.com> --- include/sysemu/seccomp.h | 1 + qemu-options.hx | 9 ++++++--- qemu-seccomp.c | 29 +++++++++++++++++++++++++++++ vl.c | 22 ++++++++++++++++++++++ 4 files changed, 58 insertions(+), 3 deletions(-) diff --git a/include/sysemu/seccomp.h b/include/sysemu/seccomp.h index 7a7bde246b..e6e78d85ce 100644 --- a/include/sysemu/seccomp.h +++ b/include/sysemu/seccomp.h @@ -16,6 +16,7 @@ #define QEMU_SECCOMP_H #define OBSOLETE 0x0001 +#define PRIVILEGED 0x0010 #include <seccomp.h> diff --git a/qemu-options.hx b/qemu-options.hx index 54e492f36a..34d33a812e 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -4004,17 +4004,20 @@ Old param mode (ARM only). ETEXI DEF("sandbox", HAS_ARG, QEMU_OPTION_sandbox, \ - "-sandbox on[,obsolete=allow] Enable seccomp mode 2 system call filter (default 'off').\n" \ - " obsolete: Allow obsolete system calls\n", + "-sandbox on[,obsolete=allow][,elevateprivileges=allow|deny|children] Enable seccomp mode 2 system call filter (default 'off').\n" \ + " obsolete: Allow obsolete system calls\n" + " elevateprivileges: allows or denies Qemu process to elevate its privileges by blacklisting all set*uid|gid system calls. 'children' will deny set*uid|gid system calls for main Qemu process but will allow forks and execves to run unprivileged\n", QEMU_ARCH_ALL) STEXI -@item -sandbox @var{arg}[,obsolete=@var{string}] +@item -sandbox @var{arg}[,obsolete=@var{string}][,elevateprivileges=@var{string}] @findex -sandbox Enable Seccomp mode 2 system call filter. 'on' will enable syscall filtering and 'off' will disable it. The default is 'off'. @table @option @item obsolete=@var{string} Enable Obsolete system calls +@item elevateprivileges=@var{string} +Disable set*uid|gid systema calls @end table ETEXI diff --git a/qemu-seccomp.c b/qemu-seccomp.c index c6a8b28260..6caa513edd 100644 --- a/qemu-seccomp.c +++ b/qemu-seccomp.c @@ -31,6 +31,19 @@ struct QemuSeccompSyscall { uint8_t priority; }; +static const struct QemuSeccompSyscall privileged_syscalls[] = { + { SCMP_SYS(setuid), 255 }, + { SCMP_SYS(setgid), 255 }, + { SCMP_SYS(setpgid), 255 }, + { SCMP_SYS(setsid), 255 }, + { SCMP_SYS(setreuid), 255 }, + { SCMP_SYS(setregid), 255 }, + { SCMP_SYS(setresuid), 255 }, + { SCMP_SYS(setresgid), 255 }, + { SCMP_SYS(setfsuid), 255 }, + { SCMP_SYS(setfsgid), 255 }, +}; + static const struct QemuSeccompSyscall obsolete[] = { { SCMP_SYS(readdir), 255 }, { SCMP_SYS(_sysctl), 255 }, @@ -110,6 +123,22 @@ int seccomp_start(uint8_t seccomp_opts) } } + if (seccomp_opts & PRIVILEGED) { + for (i = 0; i < ARRAY_SIZE(privileged_syscalls); i++) { + rc = seccomp_rule_add(ctx, SCMP_ACT_KILL, + privileged_syscalls[i].num, 0); + if (rc < 0) { + goto seccomp_return; + } + rc = seccomp_syscall_priority(ctx, privileged_syscalls[i].num, + privileged_syscalls[i].priority); + if (rc < 0) { + goto seccomp_return; + } + } + } + + rc = seccomp_load(ctx); seccomp_return: diff --git a/vl.c b/vl.c index cbe09c94af..800e2b573d 100644 --- a/vl.c +++ b/vl.c @@ -29,6 +29,7 @@ #ifdef CONFIG_SECCOMP #include "sysemu/seccomp.h" +#include "sys/prctl.h" #endif #if defined(CONFIG_VDE) @@ -275,6 +276,10 @@ static QemuOptsList qemu_sandbox_opts = { .name = "obsolete", .type = QEMU_OPT_STRING, }, + { + .name = "elevateprivileges", + .type = QEMU_OPT_STRING, + }, { /* end of list */ } }, }; @@ -1046,6 +1051,23 @@ static int parse_sandbox(void *opaque, QemuOpts *opts, Error **errp) } } + value = qemu_opt_get(opts, "elevateprivileges"); + if (value) { + if (strcmp(value, "deny") == 0) { + seccomp_opts |= PRIVILEGED; + } + if (strcmp(value, "children") == 0) { + seccomp_opts |= PRIVILEGED; + + /* calling prctl directly because we're + * not sure if host has CAP_SYS_ADMIN set*/ + if (prctl(PR_SET_NO_NEW_PRIVS, 1)) { + error_report("failed to set no_new_privs " + "aborting"); + } + } + } + if (seccomp_start(seccomp_opts) < 0) { error_report("failed to install seccomp syscall filter " "in the kernel"); -- 2.13.3 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v3 3/6] seccomp: add elevateprivileges argument to command line 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 3/6] seccomp: add elevateprivileges " Eduardo Otubo @ 2017-08-02 12:37 ` Daniel P. Berrange 2017-08-03 16:59 ` Thomas Huth 1 sibling, 0 replies; 20+ messages in thread From: Daniel P. Berrange @ 2017-08-02 12:37 UTC (permalink / raw) To: Eduardo Otubo; +Cc: qemu-devel, thuth, pbonzini On Fri, Jul 28, 2017 at 02:10:37PM +0200, Eduardo Otubo wrote: > This patch introduces the new argument > [,elevateprivileges=allow|deny|children] to the `-sandbox on'. It allows > or denies Qemu process to elevate its privileges by blacklisting all > set*uid|gid system calls. The 'children' option will let forks and > execves run unprivileged. > > Signed-off-by: Eduardo Otubo <otubo@redhat.com> > --- > include/sysemu/seccomp.h | 1 + > qemu-options.hx | 9 ++++++--- > qemu-seccomp.c | 29 +++++++++++++++++++++++++++++ > vl.c | 22 ++++++++++++++++++++++ > 4 files changed, 58 insertions(+), 3 deletions(-) > > diff --git a/include/sysemu/seccomp.h b/include/sysemu/seccomp.h > index 7a7bde246b..e6e78d85ce 100644 > --- a/include/sysemu/seccomp.h > +++ b/include/sysemu/seccomp.h > @@ -16,6 +16,7 @@ > #define QEMU_SECCOMP_H > > #define OBSOLETE 0x0001 > +#define PRIVILEGED 0x0010 Err, this is hex, but you seem to be treating it as a binary string. It would be better expressed as #define OBSOLETE (1 << 0) #define PRIVILEGED (1 << 1) #define .... (1 << 2) #define .... (1 << 3) #define .... (1 << 4) > > + value = qemu_opt_get(opts, "elevateprivileges"); > + if (value) { > + if (strcmp(value, "deny") == 0) { > + seccomp_opts |= PRIVILEGED; > + } > + if (strcmp(value, "children") == 0) { > + seccomp_opts |= PRIVILEGED; > + > + /* calling prctl directly because we're > + * not sure if host has CAP_SYS_ADMIN set*/ > + if (prctl(PR_SET_NO_NEW_PRIVS, 1)) { > + error_report("failed to set no_new_privs " > + "aborting"); > + } The prctl() really ought to be done in seccomp_start IMHO. > + } Also it should report an error for invalid 'value' strings. > + } > + > if (seccomp_start(seccomp_opts) < 0) { > error_report("failed to install seccomp syscall filter " > "in the kernel"); > -- > 2.13.3 > Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v3 3/6] seccomp: add elevateprivileges argument to command line 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 3/6] seccomp: add elevateprivileges " Eduardo Otubo 2017-08-02 12:37 ` Daniel P. Berrange @ 2017-08-03 16:59 ` Thomas Huth 1 sibling, 0 replies; 20+ messages in thread From: Thomas Huth @ 2017-08-03 16:59 UTC (permalink / raw) To: Eduardo Otubo, qemu-devel; +Cc: berrange, pbonzini On 28.07.2017 14:10, Eduardo Otubo wrote: > This patch introduces the new argument > [,elevateprivileges=allow|deny|children] to the `-sandbox on'. It allows > or denies Qemu process to elevate its privileges by blacklisting all > set*uid|gid system calls. The 'children' option will let forks and > execves run unprivileged. > > Signed-off-by: Eduardo Otubo <otubo@redhat.com> > --- > include/sysemu/seccomp.h | 1 + > qemu-options.hx | 9 ++++++--- > qemu-seccomp.c | 29 +++++++++++++++++++++++++++++ > vl.c | 22 ++++++++++++++++++++++ > 4 files changed, 58 insertions(+), 3 deletions(-) > > diff --git a/include/sysemu/seccomp.h b/include/sysemu/seccomp.h > index 7a7bde246b..e6e78d85ce 100644 > --- a/include/sysemu/seccomp.h > +++ b/include/sysemu/seccomp.h > @@ -16,6 +16,7 @@ > #define QEMU_SECCOMP_H > > #define OBSOLETE 0x0001 > +#define PRIVILEGED 0x0010 > > #include <seccomp.h> > > diff --git a/qemu-options.hx b/qemu-options.hx > index 54e492f36a..34d33a812e 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -4004,17 +4004,20 @@ Old param mode (ARM only). > ETEXI > > DEF("sandbox", HAS_ARG, QEMU_OPTION_sandbox, \ > - "-sandbox on[,obsolete=allow] Enable seccomp mode 2 system call filter (default 'off').\n" \ > - " obsolete: Allow obsolete system calls\n", > + "-sandbox on[,obsolete=allow][,elevateprivileges=allow|deny|children] Enable seccomp mode 2 system call filter (default 'off').\n" \ Most other boolean-like options use "on|off" as possible values ... maybe it would be nicer to use "on|off" instead of "allow|deny" here, too? > + " obsolete: Allow obsolete system calls\n" > + " elevateprivileges: allows or denies Qemu process to elevate its privileges by blacklisting all set*uid|gid system calls. 'children' will deny set*uid|gid system calls for main Qemu process but will allow forks and execves to run unprivileged\n", Correct spelling is "QEMU" with all capital letters, not "Qemu" > QEMU_ARCH_ALL) > STEXI > -@item -sandbox @var{arg}[,obsolete=@var{string}] > +@item -sandbox @var{arg}[,obsolete=@var{string}][,elevateprivileges=@var{string}] > @findex -sandbox > Enable Seccomp mode 2 system call filter. 'on' will enable syscall filtering and 'off' will > disable it. The default is 'off'. > @table @option > @item obsolete=@var{string} > Enable Obsolete system calls > +@item elevateprivileges=@var{string} > +Disable set*uid|gid systema calls s/systema/system/ Thomas ^ permalink raw reply [flat|nested] 20+ messages in thread
* [Qemu-devel] [PATCH v3 4/6] seccomp: add spawn argument to command line 2017-07-28 12:10 [Qemu-devel] [PATCH v3 0/6] seccomp: feature refactoring Eduardo Otubo ` (2 preceding siblings ...) 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 3/6] seccomp: add elevateprivileges " Eduardo Otubo @ 2017-07-28 12:10 ` Eduardo Otubo 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 5/6] seccomp: add resourcecontrol " Eduardo Otubo 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 6/6] seccomp: adding documentation to new seccomp model Eduardo Otubo 5 siblings, 0 replies; 20+ messages in thread From: Eduardo Otubo @ 2017-07-28 12:10 UTC (permalink / raw) To: qemu-devel; +Cc: thuth, berrange, pbonzini This patch adds [,spawn=deny] argument to `-sandbox on' option. It blacklists fork and execve system calls, avoiding Qemu to spawn new threads or processes. Signed-off-by: Eduardo Otubo <otubo@redhat.com> --- include/sysemu/seccomp.h | 1 + qemu-options.hx | 9 ++++++--- qemu-seccomp.c | 19 +++++++++++++++++++ vl.c | 11 +++++++++++ 4 files changed, 37 insertions(+), 3 deletions(-) diff --git a/include/sysemu/seccomp.h b/include/sysemu/seccomp.h index e6e78d85ce..f1614d6514 100644 --- a/include/sysemu/seccomp.h +++ b/include/sysemu/seccomp.h @@ -17,6 +17,7 @@ #define OBSOLETE 0x0001 #define PRIVILEGED 0x0010 +#define SPAWN 0x0100 #include <seccomp.h> diff --git a/qemu-options.hx b/qemu-options.hx index 34d33a812e..3d612f0fd1 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -4004,12 +4004,13 @@ Old param mode (ARM only). ETEXI DEF("sandbox", HAS_ARG, QEMU_OPTION_sandbox, \ - "-sandbox on[,obsolete=allow][,elevateprivileges=allow|deny|children] Enable seccomp mode 2 system call filter (default 'off').\n" \ + "-sandbox on[,obsolete=allow][,elevateprivileges=allow|deny|children][,spawn=deny] Enable seccomp mode 2 system call filter (default 'off').\n" \ " obsolete: Allow obsolete system calls\n" - " elevateprivileges: allows or denies Qemu process to elevate its privileges by blacklisting all set*uid|gid system calls. 'children' will deny set*uid|gid system calls for main Qemu process but will allow forks and execves to run unprivileged\n", + " elevateprivileges: allows or denies Qemu process to elevate its privileges by blacklisting all set*uid|gid system calls. 'children' will deny set*uid|gid system calls for main Qemu process but will allow forks and execves to run unprivileged\n" + " spawn: avoids Qemu to spawn new threads or processes by blacklisting *fork and execve\n", QEMU_ARCH_ALL) STEXI -@item -sandbox @var{arg}[,obsolete=@var{string}][,elevateprivileges=@var{string}] +@item -sandbox @var{arg}[,obsolete=@var{string}][,elevateprivileges=@var{string}][,spawn=@var{string}] @findex -sandbox Enable Seccomp mode 2 system call filter. 'on' will enable syscall filtering and 'off' will disable it. The default is 'off'. @@ -4018,6 +4019,8 @@ disable it. The default is 'off'. Enable Obsolete system calls @item elevateprivileges=@var{string} Disable set*uid|gid systema calls +@item spawn=@var{string} +Disable *fork and execve @end table ETEXI diff --git a/qemu-seccomp.c b/qemu-seccomp.c index 6caa513edd..22a093ca1b 100644 --- a/qemu-seccomp.c +++ b/qemu-seccomp.c @@ -31,6 +31,12 @@ struct QemuSeccompSyscall { uint8_t priority; }; +static const struct QemuSeccompSyscall spawn_syscalls[] = { + { SCMP_SYS(fork), 255 }, + { SCMP_SYS(vfork), 255 }, + { SCMP_SYS(execve), 255 }, +}; + static const struct QemuSeccompSyscall privileged_syscalls[] = { { SCMP_SYS(setuid), 255 }, { SCMP_SYS(setgid), 255 }, @@ -138,6 +144,19 @@ int seccomp_start(uint8_t seccomp_opts) } } + if (seccomp_opts & SPAWN) { + for (i = 0; i < ARRAY_SIZE(spawn_syscalls); i++) { + rc = seccomp_rule_add(ctx, SCMP_ACT_KILL, spawn_syscalls[i].num, 0); + if (rc < 0) { + goto seccomp_return; + } + rc = seccomp_syscall_priority(ctx, spawn_syscalls[i].num, + spawn_syscalls[i].priority); + if (rc < 0) { + goto seccomp_return; + } + } + } rc = seccomp_load(ctx); diff --git a/vl.c b/vl.c index 800e2b573d..e3a59ef1b5 100644 --- a/vl.c +++ b/vl.c @@ -280,6 +280,10 @@ static QemuOptsList qemu_sandbox_opts = { .name = "elevateprivileges", .type = QEMU_OPT_STRING, }, + { + .name = "spawn", + .type = QEMU_OPT_STRING, + }, { /* end of list */ } }, }; @@ -1068,6 +1072,13 @@ static int parse_sandbox(void *opaque, QemuOpts *opts, Error **errp) } } + value = qemu_opt_get(opts, "spawn"); + if (value) { + if (strcmp(value, "deny") == 0) { + seccomp_opts |= SPAWN; + } + } + if (seccomp_start(seccomp_opts) < 0) { error_report("failed to install seccomp syscall filter " "in the kernel"); -- 2.13.3 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [Qemu-devel] [PATCH v3 5/6] seccomp: add resourcecontrol argument to command line 2017-07-28 12:10 [Qemu-devel] [PATCH v3 0/6] seccomp: feature refactoring Eduardo Otubo ` (3 preceding siblings ...) 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 4/6] seccomp: add spawn " Eduardo Otubo @ 2017-07-28 12:10 ` Eduardo Otubo 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 6/6] seccomp: adding documentation to new seccomp model Eduardo Otubo 5 siblings, 0 replies; 20+ messages in thread From: Eduardo Otubo @ 2017-07-28 12:10 UTC (permalink / raw) To: qemu-devel; +Cc: thuth, berrange, pbonzini This patch adds [,resourcecontrol=deny] to `-sandbox on' option. It blacklists all process affinity and scheduler priority system calls to avoid any bigger of the process. Signed-off-by: Eduardo Otubo <otubo@redhat.com> --- include/sysemu/seccomp.h | 1 + qemu-options.hx | 9 ++++++--- qemu-seccomp.c | 28 ++++++++++++++++++++++++++++ vl.c | 11 +++++++++++ 4 files changed, 46 insertions(+), 3 deletions(-) diff --git a/include/sysemu/seccomp.h b/include/sysemu/seccomp.h index f1614d6514..c7003dd197 100644 --- a/include/sysemu/seccomp.h +++ b/include/sysemu/seccomp.h @@ -18,6 +18,7 @@ #define OBSOLETE 0x0001 #define PRIVILEGED 0x0010 #define SPAWN 0x0100 +#define RESOURCECTL 0x1000 #include <seccomp.h> diff --git a/qemu-options.hx b/qemu-options.hx index 3d612f0fd1..7192236dee 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -4004,13 +4004,14 @@ Old param mode (ARM only). ETEXI DEF("sandbox", HAS_ARG, QEMU_OPTION_sandbox, \ - "-sandbox on[,obsolete=allow][,elevateprivileges=allow|deny|children][,spawn=deny] Enable seccomp mode 2 system call filter (default 'off').\n" \ + "-sandbox on[,obsolete=allow][,elevateprivileges=allow|deny|children][,spawn=deny][,resourcecontrol=deny] Enable seccomp mode 2 system call filter (default 'off').\n" \ " obsolete: Allow obsolete system calls\n" " elevateprivileges: allows or denies Qemu process to elevate its privileges by blacklisting all set*uid|gid system calls. 'children' will deny set*uid|gid system calls for main Qemu process but will allow forks and execves to run unprivileged\n" - " spawn: avoids Qemu to spawn new threads or processes by blacklisting *fork and execve\n", + " spawn: avoids Qemu to spawn new threads or processes by blacklisting *fork and execve\n" + " resourcecontrol: disable process affinity and schedular priority\n", QEMU_ARCH_ALL) STEXI -@item -sandbox @var{arg}[,obsolete=@var{string}][,elevateprivileges=@var{string}][,spawn=@var{string}] +@item -sandbox @var{arg}[,obsolete=@var{string}][,elevateprivileges=@var{string}][,spawn=@var{string}][,resourcecontrol=@var{string}] @findex -sandbox Enable Seccomp mode 2 system call filter. 'on' will enable syscall filtering and 'off' will disable it. The default is 'off'. @@ -4021,6 +4022,8 @@ Enable Obsolete system calls Disable set*uid|gid systema calls @item spawn=@var{string} Disable *fork and execve +@item resourcecontrol=@var{string} +Disable process affinity and schedular priority @end table ETEXI diff --git a/qemu-seccomp.c b/qemu-seccomp.c index 22a093ca1b..95c8e31d1a 100644 --- a/qemu-seccomp.c +++ b/qemu-seccomp.c @@ -31,6 +31,19 @@ struct QemuSeccompSyscall { uint8_t priority; }; +static const struct QemuSeccompSyscall resourcecontrol_syscalls[] = { + { SCMP_SYS(getpriority), 255 }, + { SCMP_SYS(setpriority), 255 }, + { SCMP_SYS(sched_setparam), 255 }, + { SCMP_SYS(sched_getparam), 255 }, + { SCMP_SYS(sched_setscheduler), 255 }, + { SCMP_SYS(sched_getscheduler), 255 }, + { SCMP_SYS(sched_setaffinity), 255 }, + { SCMP_SYS(sched_getaffinity), 255 }, + { SCMP_SYS(sched_get_priority_max), 255 }, + { SCMP_SYS(sched_get_priority_min), 255 }, +}; + static const struct QemuSeccompSyscall spawn_syscalls[] = { { SCMP_SYS(fork), 255 }, { SCMP_SYS(vfork), 255 }, @@ -158,6 +171,21 @@ int seccomp_start(uint8_t seccomp_opts) } } + if (seccomp_opts & RESOURCECTL) { + for (i = 0; i < ARRAY_SIZE(resourcecontrol_syscalls); i++) { + rc = seccomp_rule_add(ctx, SCMP_ACT_KILL, + resourcecontrol_syscalls[i].num, 0); + if (rc < 0) { + goto seccomp_return; + } + rc = seccomp_syscall_priority(ctx, resourcecontrol_syscalls[i].num, + resourcecontrol_syscalls[i].priority); + if (rc < 0) { + goto seccomp_return; + } + } + } + rc = seccomp_load(ctx); seccomp_return: diff --git a/vl.c b/vl.c index e3a59ef1b5..c09d6dde49 100644 --- a/vl.c +++ b/vl.c @@ -284,6 +284,10 @@ static QemuOptsList qemu_sandbox_opts = { .name = "spawn", .type = QEMU_OPT_STRING, }, + { + .name = "resourcecontrol", + .type = QEMU_OPT_STRING, + }, { /* end of list */ } }, }; @@ -1079,6 +1083,13 @@ static int parse_sandbox(void *opaque, QemuOpts *opts, Error **errp) } } + value = qemu_opt_get(opts, "resourcecontrol"); + if (value) { + if (strcmp(value, "deny") == 0) { + seccomp_opts |= RESOURCECTL; + } + } + if (seccomp_start(seccomp_opts) < 0) { error_report("failed to install seccomp syscall filter " "in the kernel"); -- 2.13.3 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [Qemu-devel] [PATCH v3 6/6] seccomp: adding documentation to new seccomp model 2017-07-28 12:10 [Qemu-devel] [PATCH v3 0/6] seccomp: feature refactoring Eduardo Otubo ` (4 preceding siblings ...) 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 5/6] seccomp: add resourcecontrol " Eduardo Otubo @ 2017-07-28 12:10 ` Eduardo Otubo 2017-08-02 12:39 ` Daniel P. Berrange 2017-08-03 17:14 ` Thomas Huth 5 siblings, 2 replies; 20+ messages in thread From: Eduardo Otubo @ 2017-07-28 12:10 UTC (permalink / raw) To: qemu-devel; +Cc: thuth, berrange, pbonzini Adding new documention under docs/ to describe every one and each new option added by the seccomp refactoring patchset. Signed-off-by: Eduardo Otubo <otubo@redhat.com> --- docs/seccomp.txt | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) create mode 100644 docs/seccomp.txt diff --git a/docs/seccomp.txt b/docs/seccomp.txt new file mode 100644 index 0000000000..4b7edba312 --- /dev/null +++ b/docs/seccomp.txt @@ -0,0 +1,31 @@ +QEMU Seccomp system call filter +=============================== + +Starting from Qemu version 2.10, the seccomp filter does not work as a +whitelist but as a blacklist instead. This method allows safer deploys since +only the strictly forbidden system calls will be black-listed and the +possibility of breaking any workload is close to zero. + +The default option (-sandbox on) has a slightly looser security though and the +reason is that it shouldn't break any backwards compatibility with previous +deploys and command lines already running. But if the intent is to have a +better security from this version on, one should make use of the following +additional options properly: + +* [,obsolete=allow]: It allows Qemu to run safely on old system that still + relies on old system calls. + +* [,elevateprivileges=deny|allow|children]: It allows or denies Qemu process + to elevate its privileges by blacklisting all set*uid|gid system calls. The + 'children' option sets the PR_SET_NO_NEW_PRIVS to 1 which allows helpers + (forls and execs) to run unprivileged. + +* [,spawn=deny]: It blacklists fork and execve syste calls, avoiding Qemu to + spawn new threads or processes. + +* [,resourcecontrol=deny]: It blacklists all process affinity and scheduler + priority system calls to avoid any bigger of the process. + + +-- +Eduardo Otubo <otubo@redhat.com> -- 2.13.3 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v3 6/6] seccomp: adding documentation to new seccomp model 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 6/6] seccomp: adding documentation to new seccomp model Eduardo Otubo @ 2017-08-02 12:39 ` Daniel P. Berrange 2017-08-03 17:14 ` Thomas Huth 1 sibling, 0 replies; 20+ messages in thread From: Daniel P. Berrange @ 2017-08-02 12:39 UTC (permalink / raw) To: Eduardo Otubo; +Cc: qemu-devel, thuth, pbonzini On Fri, Jul 28, 2017 at 02:10:40PM +0200, Eduardo Otubo wrote: > Adding new documention under docs/ to describe every one and each new > option added by the seccomp refactoring patchset. > > Signed-off-by: Eduardo Otubo <otubo@redhat.com> > --- > docs/seccomp.txt | 31 +++++++++++++++++++++++++++++++ > 1 file changed, 31 insertions(+) > create mode 100644 docs/seccomp.txt > > diff --git a/docs/seccomp.txt b/docs/seccomp.txt > new file mode 100644 > index 0000000000..4b7edba312 > --- /dev/null > +++ b/docs/seccomp.txt > @@ -0,0 +1,31 @@ > +QEMU Seccomp system call filter > +=============================== > + > +Starting from Qemu version 2.10, the seccomp filter does not work as a > +whitelist but as a blacklist instead. This method allows safer deploys since > +only the strictly forbidden system calls will be black-listed and the > +possibility of breaking any workload is close to zero. > + > +The default option (-sandbox on) has a slightly looser security though and the > +reason is that it shouldn't break any backwards compatibility with previous > +deploys and command lines already running. But if the intent is to have a > +better security from this version on, one should make use of the following > +additional options properly: > + > +* [,obsolete=allow]: It allows Qemu to run safely on old system that still > + relies on old system calls. We should support 'allow' and 'deny' for all of the options. THis allows the callers to be explicit about the state, if they don't wish to rely on the QEMU defaults > + > +* [,elevateprivileges=deny|allow|children]: It allows or denies Qemu process > + to elevate its privileges by blacklisting all set*uid|gid system calls. The > + 'children' option sets the PR_SET_NO_NEW_PRIVS to 1 which allows helpers > + (forls and execs) to run unprivileged. > + > +* [,spawn=deny]: It blacklists fork and execve syste calls, avoiding Qemu to > + spawn new threads or processes. > + > +* [,resourcecontrol=deny]: It blacklists all process affinity and scheduler > + priority system calls to avoid any bigger of the process. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v3 6/6] seccomp: adding documentation to new seccomp model 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 6/6] seccomp: adding documentation to new seccomp model Eduardo Otubo 2017-08-02 12:39 ` Daniel P. Berrange @ 2017-08-03 17:14 ` Thomas Huth 1 sibling, 0 replies; 20+ messages in thread From: Thomas Huth @ 2017-08-03 17:14 UTC (permalink / raw) To: Eduardo Otubo, qemu-devel; +Cc: berrange, pbonzini On 28.07.2017 14:10, Eduardo Otubo wrote: > Adding new documention under docs/ to describe every one and each new s/documention/documentation/ > option added by the seccomp refactoring patchset. > > Signed-off-by: Eduardo Otubo <otubo@redhat.com> > --- > docs/seccomp.txt | 31 +++++++++++++++++++++++++++++++ > 1 file changed, 31 insertions(+) > create mode 100644 docs/seccomp.txt > > diff --git a/docs/seccomp.txt b/docs/seccomp.txt > new file mode 100644 > index 0000000000..4b7edba312 > --- /dev/null > +++ b/docs/seccomp.txt > @@ -0,0 +1,31 @@ > +QEMU Seccomp system call filter > +=============================== > + > +Starting from Qemu version 2.10, the seccomp filter does not work as a s/Qemu/QEMU/ s/2.10/2.11/ > +whitelist but as a blacklist instead. This method allows safer deploys since > +only the strictly forbidden system calls will be black-listed and the > +possibility of breaking any workload is close to zero. > + > +The default option (-sandbox on) has a slightly looser security though and the > +reason is that it shouldn't break any backwards compatibility with previous > +deploys and command lines already running. But if the intent is to have a > +better security from this version on, one should make use of the following > +additional options properly: > + > +* [,obsolete=allow]: It allows Qemu to run safely on old system that still > + relies on old system calls. > + > +* [,elevateprivileges=deny|allow|children]: It allows or denies Qemu process > + to elevate its privileges by blacklisting all set*uid|gid system calls. The > + 'children' option sets the PR_SET_NO_NEW_PRIVS to 1 which allows helpers > + (forls and execs) to run unprivileged. s/forls/forks/ > +* [,spawn=deny]: It blacklists fork and execve syste calls, avoiding Qemu to > + spawn new threads or processes. > + > +* [,resourcecontrol=deny]: It blacklists all process affinity and scheduler > + priority system calls to avoid any bigger of the process. "to avoid any bigger" sounds strange to me. Maybe rather something like: "to avoid that the process can increase its amount of allowed resource consumption" or something similar? Thomas ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2017-08-11 10:10 UTC | newest] Thread overview: 20+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-07-28 12:10 [Qemu-devel] [PATCH v3 0/6] seccomp: feature refactoring Eduardo Otubo 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 1/6] seccomp: changing from whitelist to blacklist Eduardo Otubo 2017-08-02 12:25 ` Daniel P. Berrange 2017-08-03 16:54 ` Thomas Huth 2017-08-11 9:51 ` Eduardo Otubo 2017-08-11 10:10 ` Daniel P. Berrange 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 2/6] seccomp: add obsolete argument to command line Eduardo Otubo 2017-08-02 12:33 ` Daniel P. Berrange 2017-08-02 12:38 ` Daniel P. Berrange 2017-08-11 9:12 ` Eduardo Otubo 2017-08-11 9:25 ` Daniel P. Berrange 2017-08-11 9:49 ` Eduardo Otubo 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 3/6] seccomp: add elevateprivileges " Eduardo Otubo 2017-08-02 12:37 ` Daniel P. Berrange 2017-08-03 16:59 ` Thomas Huth 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 4/6] seccomp: add spawn " Eduardo Otubo 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 5/6] seccomp: add resourcecontrol " Eduardo Otubo 2017-07-28 12:10 ` [Qemu-devel] [PATCH v3 6/6] seccomp: adding documentation to new seccomp model Eduardo Otubo 2017-08-02 12:39 ` Daniel P. Berrange 2017-08-03 17:14 ` Thomas Huth
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).