From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:55449)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Sgjtz-0005wW-QY for qemu-devel@nongnu.org;
    Mon, 18 Jun 2012 17:54:30 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1Sgjtw-0004Pd-Nu for qemu-devel@nongnu.org;
    Mon, 18 Jun 2012 17:54:27 -0400
Received: from e33.co.us.ibm.com ([32.97.110.151]:33309)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Sgjtw-0004KI-DP for qemu-devel@nongnu.org;
    Mon, 18 Jun 2012 17:54:24 -0400
Received: from /spool/local by e33.co.us.ibm.com with IBM ESMTP SMTP Gateway:
    Authorized Use Only! Violators will be prosecuted for from ;
    Mon, 18 Jun 2012 15:54:16 -0600
Received: from d03relay05.boulder.ibm.com (d03relay05.boulder.ibm.com [9.17.195.107])
    by d03dlp03.boulder.ibm.com (Postfix) with ESMTP id D838B19D8059 for ;
    Mon, 18 Jun 2012 21:54:08 +0000 (WET)
Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245])
    by d03relay05.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP
    id q5ILrq8n192342 for ; Mon, 18 Jun 2012 15:53:53 -0600
Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1])
    by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP
    id q5ILslig027314 for ; Mon, 18 Jun 2012 15:54:47 -0600
Message-ID: <4FDFA36E.4010802@linux.vnet.ibm.com>
Date: Mon, 18 Jun 2012 17:53:50 -0400
From: Corey Bryant
MIME-Version: 1.0
References: <20120613203305.GC6019@redhat.com>
    <20120618083335.GD28026@redhat.com>
    <4FDF479B.9060502@linux.vnet.ibm.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC] [PATCHv2 2/2] Adding basic calls to libseccomp in vl.c
To: Blue Swirl
Cc: qemu-devel@nongnu.org, Eduardo Otubo

On 06/18/2012 04:18 PM, Blue Swirl wrote:
> On Mon, Jun 18, 2012 at 3:22 PM, Corey
> Bryant wrote:
>>
>>
>> On 06/18/2012 04:33 AM, Daniel P. Berrange wrote:
>>>
>>> On Fri, Jun 15, 2012 at 07:04:45PM +0000, Blue Swirl wrote:
>>>>
>>>> On Wed, Jun 13, 2012 at 8:33 PM, Daniel P. Berrange wrote:
>>>>>
>>>>> On Wed, Jun 13, 2012 at 07:56:06PM +0000, Blue Swirl wrote:
>>>>>>
>>>>>> On Wed, Jun 13, 2012 at 7:20 PM, Eduardo Otubo wrote:
>>>>>>>
>>>>>>> I added a syscall struct using priority levels as described in the
>>>>>>> libseccomp man page. The priority numbers are based on how frequently
>>>>>>> the syscalls appear in a sample strace from a regular qemu guest run
>>>>>>> under libvirt.
>>>>>>>
>>>>>>> Libseccomp generates linear BPF code to filter system calls; those
>>>>>>> rules are read one after another. The priority system places the most
>>>>>>> common rules first in order to reduce the overhead when processing them.
>>>>>>>
>>>>>>> Also, since this is just a first RFC, the whitelist is a little raw.
>>>>>>> We might need your help to improve, test and fine-tune the set of
>>>>>>> system calls.
>>>>>>>
>>>>>>> v2: Fixed some style issues
>>>>>>>     Removed code from vl.c and created qemu-seccomp.[ch]
>>>>>>>     Now using ARRAY_SIZE macro
>>>>>>>     Added more syscalls without priority/frequency set yet
>>>>>>>
>>>>>>> Signed-off-by: Eduardo Otubo
>>>>>>> ---
>>>>>>>  qemu-seccomp.c |   73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>  qemu-seccomp.h |    9 +++++++
>>>>>>>  vl.c           |    7 ++++++
>>>>>>>  3 files changed, 89 insertions(+)
>>>>>>>  create mode 100644 qemu-seccomp.c
>>>>>>>  create mode 100644 qemu-seccomp.h
>>>>>>>
>>>>>>> diff --git a/qemu-seccomp.c b/qemu-seccomp.c
>>>>>>> new file mode 100644
>>>>>>> index 0000000..048b7ba
>>>>>>> --- /dev/null
>>>>>>> +++ b/qemu-seccomp.c
>>>>>>> @@ -0,0 +1,73 @@
>>>>>>
>>>>>>
>>>>>> Copyright and license info missing.
>>>>>>
>>>>>>> +#include
>>>>>>> +#include
>>>>>>> +#include "qemu-seccomp.h"
>>>>>>> +
>>>>>>> +static struct QemuSeccompSyscall seccomp_whitelist[] = {
>>>>>>
>>>>>>
>>>>>> 'const'
>>>>>>
>>>>>>> +    { SCMP_SYS(timer_settime), 255 },
>>>>>>> +    { SCMP_SYS(timer_gettime), 254 },
>>>>>>> +    { SCMP_SYS(futex), 253 },
>>>>>>> +    { SCMP_SYS(select), 252 },
>>>>>>> +    { SCMP_SYS(recvfrom), 251 },
>>>>>>> +    { SCMP_SYS(sendto), 250 },
>>>>>>> +    { SCMP_SYS(read), 249 },
>>>>>>> +    { SCMP_SYS(brk), 248 },
>>>>>>> +    { SCMP_SYS(clone), 247 },
>>>>>>> +    { SCMP_SYS(mmap), 247 },
>>>>>>> +    { SCMP_SYS(mprotect), 246 },
>>>>>>> +    { SCMP_SYS(ioctl), 245 },
>>>>>>> +    { SCMP_SYS(recvmsg), 245 },
>>>>>>> +    { SCMP_SYS(sendmsg), 245 },
>>>>>>> +    { SCMP_SYS(accept), 245 },
>>>>>>> +    { SCMP_SYS(connect), 245 },
>>>>>>> +    { SCMP_SYS(bind), 245 },
>>>>>>
>>>>>>
>>>>>> It would be nice to avoid connect() and bind(). Perhaps seccomp init
>>>>>> should be postponed to after all sockets have been created?
>>>>>
>>>>>
>>>>> If you want to migrate your guest, you need to be able to
>>>>> call connect() at an arbitrary point in the QEMU process'
>>>>> lifecycle. So you can't avoid allowing connect(). Similarly
>>>>> if you want to allow hotplug of NICs (and their backends)
>>>>> then you need to have both bind() + connect() available.
>>>>
>>>>
>>>> That's bad. Migration could conceivably be extended to use file
>>>> descriptor passing, but hotplug is more tricky.
>>>
>>>
>>> As with execve(), I'm reporting this on the basis that on the previous
>>> patch posting I was told we must whitelist any syscalls QEMU can
>>> conceivably use to avoid any loss in functionality.
>>
>>
>> Thanks for pointing out syscalls needed for the whitelist.
>>
>> As Paul has already mentioned, it was recommended that we restrict all of
>> QEMU (as a single process) from the start of execution.
>> This is opposed to other options of restricting QEMU from the time that
>> vCPUs start, further restricting based on syscall parameters, or
>> decomposing QEMU into multiple processes that are individually
>> restricted with their own seccomp whitelists.
>
> Can each thread have separate seccomp whitelists? For example CPU
> threads should not need pretty much anything but the I/O thread needs
> I/O.
>

No, seccomp filters are defined and enforced at the process level.

--
Regards,
Corey

>> I think this approach is a good starting point that can be further tuned in
>> the future. And as with most security measures, defense in depth improves
>> the cause (e.g. combining seccomp with DAC or MAC).
>
> Agreed.
>
>>
>> --
>> Regards,
>> Corey
>>
>>
>
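
For illustration only (this is not part of the patch): the priority scheme from the patch description can be sketched without libseccomp as a sort followed by a linear scan. The names below (SketchSyscall, sketch_order_rules, sketch_comparisons) are hypothetical, and the model assumes exactly one comparison per rule, which simplifies whatever BPF libseccomp really generates.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the patch's QemuSeccompSyscall: a syscall
 * number plus an 8-bit priority (higher means checked earlier). */
struct SketchSyscall {
    int     num;
    uint8_t priority;
};

/* qsort comparator: higher priority first; ties are broken by syscall
 * number so that the resulting order is deterministic. */
static int by_priority_desc(const void *a, const void *b)
{
    const struct SketchSyscall *x = a;
    const struct SketchSyscall *y = b;

    if (x->priority != y->priority) {
        return (int)y->priority - (int)x->priority;
    }
    return (x->num > y->num) - (x->num < y->num);
}

/* Order the rules the way a priority-aware filter generator might
 * (assumption: highest priority is emitted first). */
static void sketch_order_rules(struct SketchSyscall *rules, size_t n)
{
    qsort(rules, n, sizeof(rules[0]), by_priority_desc);
}

/* Model a linear filter: return how many rules are compared before
 * `num` matches, or 0 if `num` is not in the whitelist. */
static size_t sketch_comparisons(const struct SketchSyscall *rules,
                                 size_t n, int num)
{
    for (size_t i = 0; i < n; i++) {
        if (rules[i].num == num) {
            return i + 1;
        }
    }
    return 0;
}
```

With the rules {3, 10}, {1, 255}, {2, 200}, ordering puts syscall 1 first, so it matches after one comparison, while syscall 3 needs three. That is the overhead argument for giving frequent syscalls the highest priorities.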