From: Corey Bryant <coreyb@linux.vnet.ibm.com>
To: Blue Swirl <blauwirbel@gmail.com>
Cc: Will Drewry <wad@chromium.org>,
qemu-devel <qemu-devel@nongnu.org>,
Eduardo Otubo <otubo@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [RFC] [PATCHv2 2/2] Adding basic calls to libseccomp in vl.c
Date: Tue, 19 Jun 2012 12:51:21 -0400
Message-ID: <4FE0AE09.8050506@linux.vnet.ibm.com>
In-Reply-To: <CABqD9ha32FAuikpDojzO91Jg8Q6VTY340LShKzpvTx6FN_uacQ@mail.gmail.com>
On 06/19/2012 11:37 AM, Will Drewry wrote:
> On Tue, Jun 19, 2012 at 8:35 AM, Corey Bryant <coreyb@linux.vnet.ibm.com> wrote:
>>
>>
>> On 06/18/2012 06:14 PM, Will Drewry wrote:
>>>
>>> [-all]
>>>
>>> On Mon, Jun 18, 2012 at 4:53 PM, Corey Bryant <coreyb@linux.vnet.ibm.com>
>>> wrote:
>>>>
>>>>
>>>>
>>>> On 06/18/2012 04:18 PM, Blue Swirl wrote:
>>>>>
>>>>>
>>>>> On Mon, Jun 18, 2012 at 3:22 PM, Corey Bryant
>>>>> <coreyb@linux.vnet.ibm.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 06/18/2012 04:33 AM, Daniel P. Berrange wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jun 15, 2012 at 07:04:45PM +0000, Blue Swirl wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jun 13, 2012 at 8:33 PM, Daniel P. Berrange
>>>>>>>> <berrange@redhat.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 13, 2012 at 07:56:06PM +0000, Blue Swirl wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 13, 2012 at 7:20 PM, Eduardo Otubo
>>>>>>>>>> <otubo@linux.vnet.ibm.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I added a syscall struct using priority levels as described in the
>>>>>>>>>>> libseccomp man page. The priority numbers are based on the frequency
>>>>>>>>>>> with which they appear in a sample strace from a regular qemu guest
>>>>>>>>>>> run under libvirt.
>>>>>>>>>>>
>>>>>>>>>>> Libseccomp generates linear BPF code to filter system calls; those
>>>>>>>>>>> rules are read one after another. The priority system places the
>>>>>>>>>>> most common rules first in order to reduce the overhead when
>>>>>>>>>>> processing them.
>>>>>>>>>>>
>>>>>>>>>>> Also, since this is just a first RFC, the whitelist is a little raw.
>>>>>>>>>>> We might need your help to improve, test and fine-tune the set of
>>>>>>>>>>> system calls.
>>>>>>>>>>>
>>>>>>>>>>> v2: Fixed some style issues
>>>>>>>>>>> Removed code from vl.c and created qemu-seccomp.[ch]
>>>>>>>>>>> Now using ARRAY_SIZE macro
>>>>>>>>>>> Added more syscalls without priority/frequency set yet
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Eduardo Otubo <otubo@linux.vnet.ibm.com>
>>>>>>>>>>> ---
>>>>>>>>>>> qemu-seccomp.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>>> qemu-seccomp.h | 9 +++++++
>>>>>>>>>>> vl.c | 7 ++++++
>>>>>>>>>>> 3 files changed, 89 insertions(+)
>>>>>>>>>>> create mode 100644 qemu-seccomp.c
>>>>>>>>>>> create mode 100644 qemu-seccomp.h
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/qemu-seccomp.c b/qemu-seccomp.c
>>>>>>>>>>> new file mode 100644
>>>>>>>>>>> index 0000000..048b7ba
>>>>>>>>>>> --- /dev/null
>>>>>>>>>>> +++ b/qemu-seccomp.c
>>>>>>>>>>> @@ -0,0 +1,73 @@
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Copyright and license info missing.
>>>>>>>>>>
>>>>>>>>>>> +#include <stdio.h>
>>>>>>>>>>> +#include <seccomp.h>
>>>>>>>>>>> +#include "qemu-seccomp.h"
>>>>>>>>>>> +
>>>>>>>>>>> +static struct QemuSeccompSyscall seccomp_whitelist[] = {
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 'const'
>>>>>>>>>>
>>>>>>>>>>> + { SCMP_SYS(timer_settime), 255 },
>>>>>>>>>>> + { SCMP_SYS(timer_gettime), 254 },
>>>>>>>>>>> + { SCMP_SYS(futex), 253 },
>>>>>>>>>>> + { SCMP_SYS(select), 252 },
>>>>>>>>>>> + { SCMP_SYS(recvfrom), 251 },
>>>>>>>>>>> + { SCMP_SYS(sendto), 250 },
>>>>>>>>>>> + { SCMP_SYS(read), 249 },
>>>>>>>>>>> + { SCMP_SYS(brk), 248 },
>>>>>>>>>>> + { SCMP_SYS(clone), 247 },
>>>>>>>>>>> + { SCMP_SYS(mmap), 247 },
>>>>>>>>>>> + { SCMP_SYS(mprotect), 246 },
>>>>>>>>>>> + { SCMP_SYS(ioctl), 245 },
>>>>>>>>>>> + { SCMP_SYS(recvmsg), 245 },
>>>>>>>>>>> + { SCMP_SYS(sendmsg), 245 },
>>>>>>>>>>> + { SCMP_SYS(accept), 245 },
>>>>>>>>>>> + { SCMP_SYS(connect), 245 },
>>>>>>>>>>> + { SCMP_SYS(bind), 245 },
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> It would be nice to avoid connect() and bind(). Perhaps seccomp init
>>>>>>>>>> should be postponed to after all sockets have been created?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If you want to migrate your guest, you need to be able to
>>>>>>>>> call connect() at an arbitrary point in the QEMU process'
>>>>>>>>> lifecycle. So you can't avoid allowing connect(). Similarly
>>>>>>>>> if you want to allow hotplug of NICs (and their backends)
>>>>>>>>> then you need to have both bind() + connect() available.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> That's bad. Migration could conceivably be extended to use file
>>>>>>>> descriptor passing, but hotplug is more tricky.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> As with execve(), I'm reporting this on the basis that on the previous
>>>>>>> patch posting I was told we must whitelist any syscalls QEMU can
>>>>>>> conceivably use to avoid any loss in functionality.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks for pointing out syscalls needed for the whitelist.
>>>>>>
>>>>>> As Paul has already mentioned, it was recommended that we restrict all of
>>>>>> QEMU (as a single process) from the start of execution. This is opposed to
>>>>>> other options of restricting QEMU from the time that vCPUs start, further
>>>>>> restricting based on syscall parameters, or decomposing QEMU into multiple
>>>>>> processes that are individually restricted with their own seccomp
>>>>>> whitelists.
>>>>>
>>>>>
>>>>>
>>>>> Can each thread have separate seccomp whitelists? For example CPU
>>>>> threads should not need pretty much anything but the I/O thread needs
>>>>> I/O.
>>>>>
>>>>
>>>> No, seccomp filters are defined and enforced at the process level.
>>>
>>>
>>> I'll keep lurking :) especially since I don't know the internals of
>>> qemu well, but you can do per-thread seccomp filters since
>>> processes==threads on linux. The real risk is that threads share so
>>> much that an attack on the CPU thread may be able to parlay that into
>>> a syscall proxy on another thread. Probably what would make sense
>>> in that way is a loose global filter, then have each sub-thread
>>> install a functionality specific second filter.
>>>
>>> I may be way off base though, so feel free to just tell me to keep lurking
>>> :)
>>>
>>> Thanks again for all the support and for pushing hard to get this
>>> functionality in qemu!
>>
>>
>> Please keep lurking! I appreciate the input and education. :)
>>
>> So whether it's a thread or process, I assume it will have its own
>> task_struct, allowing us to set a filter per thread or per process. The
>> difference being that threads share more resources than processes. Sort of
>> thinking out loud here to see if I'm right.
>
> Exactly!
>
>> It doesn't seem ideal vs process separation, but it's do-able.
>
> Yep -- so for something like qemu, you could install a global baseline
> policy (e.g., union of all needed syscalls) then for each thread, they
> can install a more restrictive set. The actual security guarantees
> will be the total synthesis because of cross-thread attacks, but it
> would make exploitation pretty painful.
>
> If you want better guarantees, then process separation is needed. One
> option is even doing brokering for complex syscalls using either
> ptrace or a sigsys handler, but that is likely too much to get into
> while establishing a baseline.
>
In response to "Can each thread have separate seccomp whitelists?",
please take a look at the exchange above with Will Drewry. seccomp *can*
be applied per thread. However, because threads share so much state, it
gives weaker guarantees than per-process seccomp filters.
--
Regards,
Corey
>> You don't mind if I share your input with the others, do you?
>
> Of course not!
>
> cheers!
>
>>
>> --
>> Regards,
>> Corey
>>
>>
>>>
>>>>
>>>>>> I think this approach is a good starting point that can be further tuned
>>>>>> in the future. And as with most security measures, defense in depth
>>>>>> improves the cause (e.g. combining seccomp with DAC or MAC).
>>>>>
>>>>>
>>>>>
>>>>> Agreed.
>>>>>
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Corey
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>
Thread overview: 48+ messages
2012-06-13 19:20 [Qemu-devel] [RFC] [PATCHv2 0/2] Sandboxing Qemu guests with Libseccomp Eduardo Otubo
2012-06-13 19:20 ` [Qemu-devel] [RFC] [PATCHv2 1/2] Adding support for libseccomp in configure Eduardo Otubo
2012-06-13 19:45 ` Blue Swirl
2012-06-13 19:20 ` [Qemu-devel] [RFC] [PATCHv2 2/2] Adding basic calls to libseccomp in vl.c Eduardo Otubo
2012-06-13 19:56 ` Blue Swirl
2012-06-13 20:33 ` Daniel P. Berrange
2012-06-15 19:04 ` Blue Swirl
2012-06-18 8:33 ` Daniel P. Berrange
2012-06-18 15:22 ` Corey Bryant
2012-06-18 20:18 ` Blue Swirl
2012-06-18 21:53 ` Corey Bryant
[not found] ` <CABqD9hYKLf9D37XsF6nvNmtJ=0wJ39Yu_A-JeWxDJ_8haBmEWA@mail.gmail.com>
[not found] ` <4FE08025.6030406@linux.vnet.ibm.com>
[not found] ` <CABqD9ha32FAuikpDojzO91Jg8Q6VTY340LShKzpvTx6FN_uacQ@mail.gmail.com>
2012-06-19 16:51 ` Corey Bryant [this message]
2012-07-01 13:25 ` Paolo Bonzini
2012-07-02 2:18 ` Will Drewry
2012-07-02 14:20 ` Corey Bryant
2012-06-13 20:30 ` Daniel P. Berrange
2012-06-15 19:06 ` Blue Swirl
2012-06-15 21:02 ` Paul Moore
2012-06-15 21:23 ` Blue Swirl
2012-06-15 21:36 ` Paul Moore
2012-06-16 6:46 ` Blue Swirl
2012-06-18 17:41 ` Corey Bryant
2012-06-19 11:04 ` Avi Kivity
2012-06-19 18:58 ` Blue Swirl
2012-06-21 8:04 ` Avi Kivity
[not found] ` <4FEB7A4D.7050608@redhat.com>
[not found] ` <CAAu8pHtYmoJ7WCK7LAOj_j2YU-nAgiLTg7q4qXL3Vu-kPRpZnw@mail.gmail.com>
2012-07-02 18:05 ` Corey Bryant
2012-07-03 19:15 ` Blue Swirl
2012-06-15 21:44 ` Eric Blake
2012-06-18 8:31 ` Daniel P. Berrange
2012-06-18 8:38 ` Daniel P. Berrange
2012-06-18 13:52 ` Paul Moore
2012-06-18 13:55 ` Daniel P. Berrange
2012-06-18 14:02 ` Paul Moore
2012-06-18 20:13 ` Eduardo Otubo
2012-06-18 20:23 ` Blue Swirl
2012-06-18 15:29 ` Corey Bryant
2012-06-18 20:15 ` Blue Swirl
2012-06-19 9:23 ` Daniel P. Berrange
2012-06-19 18:44 ` Blue Swirl
2012-06-18 8:26 ` Daniel P. Berrange
2012-06-13 20:37 ` Daniel P. Berrange
2012-06-13 20:31 ` [Qemu-devel] [RFC] [PATCHv2 0/2] Sandboxing Qemu guests with Libseccomp Paul Moore
2012-06-14 21:59 ` [Qemu-devel] [libseccomp-discuss] " Kees Cook
2012-06-15 13:54 ` Paul Moore
2012-10-29 15:11 ` Corey Bryant
2012-10-29 15:32 ` Daniel P. Berrange
2012-10-29 15:40 ` Paul Moore
2012-10-29 15:51 ` Corey Bryant