[Qemu-devel] [RFC] Continuous work on sandboxing

* [Qemu-devel] [RFC] Continuous work on sandboxing
@ 2013-04-26 18:39 Eduardo Otubo
  2013-04-26 21:07 ` Paul Moore
  0 siblings, 1 reply; 16+ messages in thread
From: Eduardo Otubo @ 2013-04-26 18:39 UTC (permalink / raw)
  To: qemu-devel, Daniel P. Berrange, Eric Paris, Paul Moore

Hello folks,

Resuming the sandboxing work, I'd like to ask for comments on the
ideas I have:

1. Reduce whitelist to the optimal subset: Run various tests on Qemu 
with different configurations to reduce to the smallest syscall set 
possible; test and send a patch weekly (this is already being performed 
and a patch is on the way)

2. Introduce a second whitelist - should the whitelist be defined in
libvirt and passed on to qemu, or just predefined in Qemu? Also remove
execve() and avoid open() and socket() and their parameters - I'm also
wondering if (and how) we should pass the fd along from libvirt to qemu.

3. Debugging and/or learning mode - third party libraries still have the
problem of interfering with Qemu's signal mask. According to some
previous discussions, perhaps patching all external libraries that mess
with this mask (spice, for example) is a way to solve it. But not sure
if it is worth the time spent. Would like to hear from you guys.

Regards,

-- 
Eduardo Otubo
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-04-26 18:39 [Qemu-devel] [RFC] Continuous work on sandboxing Eduardo Otubo
@ 2013-04-26 21:07 ` Paul Moore
  2013-04-26 22:17   ` Paolo Bonzini
                     ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Paul Moore @ 2013-04-26 21:07 UTC (permalink / raw)
  To: Eduardo Otubo; +Cc: qemu-devel, Eric Paris

On Friday, April 26, 2013 03:39:33 PM Eduardo Otubo wrote:
> Hello folks,
> 
> Resuming the sandboxing work, I'd like to ask for comments on the
> ideas I have:
> 
> 1. Reduce whitelist to the optimal subset: Run various tests on Qemu
> with different configurations to reduce to the smallest syscall set
> possible; test and send a patch weekly (this is already being performed
> and a patch is on the way)

Is this hooked into a testing framework?  While it is always nice to have 
someone verify the correctness, having a simple tool/testsuite that can run
through things on a regular basis is even better.

Also, looking a bit further ahead, it might be interesting to look at removing 
some of the arch dependent stuff in qemu-seccomp.c.  The latest version of 
libseccomp should remove the need for many, if not all, of the arch specific 
#ifdefs and the next version of libseccomp will add support for x32 and ARM.

> 2. Introduce a second whitelist - the whitelist should be defined in
> libvirt and passed on to qemu or just pre defined in Qemu? Also remove
> execve() and avoid open() and socket() and its parameters ...

If I'm understanding you correctly, I think what you'll want is a second 
*blacklist*.  We talked about this previously; we currently have a single 
whitelist, and considering how seccomp works, you can really only further 
restrict things after you install a whitelist into the kernel (hence the 
blacklist).
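
The restriction described above can be modelled in a few lines of C. This is
only a sketch of the semantics, not kernel or QEMU code: the call identifiers,
the two filter functions and effective_action() are hypothetical names, and the
three-level ordering is a simplification of the precedence the kernel applies
to seccomp filter results. Every installed filter is consulted and the most
restrictive verdict wins, so a filter added later can take permissions away but
never grant new ones.

```c
#include <assert.h>
#include <stddef.h>

/* Verdicts ordered from most to least restrictive (simplified model). */
enum action { ACT_KILL = 0, ACT_ERRNO = 1, ACT_ALLOW = 2 };

/* Hypothetical call identifiers; real filters match syscall numbers. */
enum call { CALL_READ, CALL_OPEN, CALL_EXECVE, CALL_PTRACE };

typedef enum action (*filter_fn)(enum call nr);

/* First filter, a whitelist: anything not listed is killed. */
static enum action whitelist(enum call nr)
{
    switch (nr) {
    case CALL_READ:
    case CALL_OPEN:
    case CALL_EXECVE:
        return ACT_ALLOW;
    default:
        return ACT_KILL;
    }
}

/* Second filter, a blacklist: everything passes except the listed calls. */
static enum action blacklist(enum call nr)
{
    switch (nr) {
    case CALL_OPEN:
    case CALL_EXECVE:
        return ACT_KILL;
    default:
        return ACT_ALLOW;
    }
}

/* Filters in installation order. */
static const filter_fn sandbox_stack[] = { whitelist, blacklist };

/* Consult the first n filters and keep the most restrictive verdict. */
enum action effective_action(const filter_fn *filters, size_t n, enum call nr)
{
    enum action result = ACT_ALLOW;
    size_t i;

    assert(filters != NULL);
    for (i = 0; i < n; i++) {
        enum action verdict = filters[i](nr);
        if (verdict < result) {
            result = verdict;
        }
    }
    return result;
}
```

With only the whitelist in place (n = 1), CALL_EXECVE is allowed; with both
filters (n = 2) it is killed. CALL_PTRACE stays killed either way, even though
the blacklist on its own would let it through.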

> 3. Debugging and/or learning mode - third party libraries still have the
> problem of interfering with Qemu's signal mask. According to some
> previous discussions, perhaps patching all external libraries that mess
> with this mask (spice, for example) is a way to solve it. But not sure
> if it is worth the time spent. Would like to hear from you guys.

I think patching all the libraries is a losing battle, I think we need to 
pursue alternate debugging techniques.

-- 
paul moore
security and virtualization @ redhat

* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-04-26 21:07 ` Paul Moore
@ 2013-04-26 22:17   ` Paolo Bonzini
  2013-04-29 19:57     ` Eduardo Otubo
  2013-04-29 18:39   ` Eduardo Otubo
  2013-04-29 21:52   ` Corey Bryant
  2 siblings, 1 reply; 16+ messages in thread
From: Paolo Bonzini @ 2013-04-26 22:17 UTC (permalink / raw)
  To: Paul Moore; +Cc: qemu-devel, Eric Paris, Eduardo Otubo

Il 26/04/2013 23:07, Paul Moore ha scritto:
>> > 3. Debugging and/or learning mode - third party libraries still have the
>> > problem of interfering with Qemu's signal mask. According to some
>> > previous discussions, perhaps patching all external libraries that mess
>> > with this mask (spice, for example) is a way to solve it. But not sure
>> > if it is worth the time spent. Would like to hear from you guys.
> I think patching all the libraries is a losing battle, I think we need to 
> pursue alternate debugging techniques.

It is really only about patching libraries that create threads _and_
block all signals in the newly-created thread (to not interfere with the
program's own handling of the signals).  In this case, the per-thread
signals (SIGFPE/SIGSEGV/SIGBUS/SIGSYS/SIGILL) should be left unblocked,
but SIGSYS is often forgotten.
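
A minimal sketch of the mask described here, assuming a library helper thread
that wants every asynchronous signal blocked. helper_thread_sigmask() is an
illustrative name, not an existing QEMU or SPICE function; it only builds the
set, and the thread would then install it with something like
pthread_sigmask(SIG_SETMASK, &set, NULL).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/*
 * Fill *set with every signal, then remove the synchronous, per-thread
 * ones so that they can still be delivered to the thread that caused them.
 * Returns 0 on success, -1 on failure.
 */
int helper_thread_sigmask(sigset_t *set)
{
    static const int synchronous[] = {
        SIGFPE, SIGSEGV, SIGBUS, SIGSYS, SIGILL
    };
    size_t i;

    assert(set != NULL);
    if (sigfillset(set) != 0) {
        return -1;
    }
    for (i = 0; i < sizeof(synchronous) / sizeof(synchronous[0]); i++) {
        if (sigdelset(set, synchronous[i]) != 0) {
            return -1;
        }
    }
    return 0;
}
```

Starting from a full set and deleting the five synchronous signals keeps
SIGSYS from being forgotten when a thread otherwise blocks everything.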

I don't think there are many libraries like this, but fixing SPICE at
least should definitely be welcome.

In fact QEMU's own util/qemu-thread-posix.c does not unblock those
signals.  Eduardo, can you submit a patch for that?

Paolo

* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-04-26 21:07 ` Paul Moore
  2013-04-26 22:17   ` Paolo Bonzini
@ 2013-04-29 18:39   ` Eduardo Otubo
  2013-04-29 19:24     ` Paul Moore
  2013-04-29 22:02     ` Corey Bryant
  2013-04-29 21:52   ` Corey Bryant
  2 siblings, 2 replies; 16+ messages in thread
From: Eduardo Otubo @ 2013-04-29 18:39 UTC (permalink / raw)
  To: Paul Moore; +Cc: qemu-devel, Eric Paris



On 04/26/2013 06:07 PM, Paul Moore wrote:
> On Friday, April 26, 2013 03:39:33 PM Eduardo Otubo wrote:
>> Hello folks,
>>
>> Resuming the sandboxing work, I'd like to ask for comments on the
>> ideas I have:
>>
>> 1. Reduce whitelist to the optimal subset: Run various tests on Qemu
>> with different configurations to reduce to the smallest syscall set
>> possible; test and send a patch weekly (this is already being performed
>> and a patch is on the way)
>
> Is this hooked into a testing framework?  While it is always nice to have
> someone verify the correctness, having a simple tool/testsuite that can run
> through things on a regular basis is even better.

Unfortunately it is currently not. I'm running the tests manually, but I 
have in mind some ideas to implement a tool for this purpose.

>
> Also, looking a bit further ahead, it might be interesting to look at removing
> some of the arch dependent stuff in qemu-seccomp.c.  The latest version of
> libseccomp should remove the need for many, if not all, of the arch specific
> #ifdefs and the next version of libseccomp will add support for x32 and ARM.

Tell me more about this. You're saying I can remove the #ifdefs and keep 
the lines like "{ SCMP_SYS(getresuid32), 241 }, " or address these 
syscalls in another way?

>
>> 2. Introduce a second whitelist - the whitelist should be defined in
>> libvirt and passed on to qemu or just pre defined in Qemu? Also remove
>> execve() and avoid open() and socket() and its parameters ...
>
> If I'm understanding you correctly, I think what you'll want is a second
> *blacklist*.  We talked about this previously; we currently have a single
> whitelist, and considering how seccomp works, you can really only further
> restrict things after you install a whitelist into the kernel (hence the
> blacklist).

Yes, that's exactly what I'm planning to do.

>
>> 3. Debugging and/or learning mode - third party libraries still have the
>> problem of interfering with Qemu's signal mask. According to some
>> previous discussions, perhaps patching all external libraries that mess
>> with this mask (spice, for example) is a way to solve it. But not sure
>> if it is worth the time spent. Would like to hear from you guys.
>
> I think patching all the libraries is a losing battle, I think we need to
> pursue alternate debugging techniques.
>

I agree with you. I was just thinking about working with third party 
libraries due to this thread: 
http://lists.gnu.org/archive/html/qemu-devel/2013-02/msg00620.html

-- 
Eduardo Otubo
IBM Linux Technology Center

* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-04-29 18:39   ` Eduardo Otubo
@ 2013-04-29 19:24     ` Paul Moore
  2013-04-29 22:02     ` Corey Bryant
  1 sibling, 0 replies; 16+ messages in thread
From: Paul Moore @ 2013-04-29 19:24 UTC (permalink / raw)
  To: Eduardo Otubo; +Cc: qemu-devel, Eric Paris

On Monday, April 29, 2013 03:39:57 PM Eduardo Otubo wrote:
> On 04/26/2013 06:07 PM, Paul Moore wrote:
> > On Friday, April 26, 2013 03:39:33 PM Eduardo Otubo wrote:
> > Also, looking a bit further ahead, it might be interesting to look at
> > removing some of the arch dependent stuff in qemu-seccomp.c.  The latest
> > version of libseccomp should remove the need for many, if not all, of the
> > arch specific #ifdefs and the next version of libseccomp will add support
> > for x32 and ARM.
>
> Tell me more about this. You're saying I can remove the #ifdefs and keep
> the lines like "{ SCMP_SYS(getresuid32), 241 }, " or address these
> syscalls in another way?

Yes.  If you are using libseccomp 2.x you shouldn't need any of the #ifdefs in 
the seccomp_whitelist[] variable like there are at present.  As long as you 
aren't using the *_exact() versions of the libseccomp APIs, which QEMU is not, 
the library will do the right thing for the current architecture: syscalls 
that don't exist will be ignored, those that need translation, e.g. socket() 
on x86, will be translated, and those that do exist normally will be handled 
normally.  Anything else I would consider a bug in libseccomp.

There is more to it if you are generating a seccomp filter to support multiple 
simultaneous architectures, e.g. x86 and x86_64, but that doesn't really apply 
with QEMU.

-- 
paul moore
security and virtualization @ redhat

* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-04-26 22:17   ` Paolo Bonzini
@ 2013-04-29 19:57     ` Eduardo Otubo
  2013-04-29 21:06       ` Paolo Bonzini
  0 siblings, 1 reply; 16+ messages in thread
From: Eduardo Otubo @ 2013-04-29 19:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Paul Moore, qemu-devel, Eric Paris



On 04/26/2013 07:17 PM, Paolo Bonzini wrote:
> Il 26/04/2013 23:07, Paul Moore ha scritto:
>>>> 3. Debugging and/or learning mode - third party libraries still have the
>>>> problem of interfering with Qemu's signal mask. According to some
>>>> previous discussions, perhaps patching all external libraries that mess
>>>> with this mask (spice, for example) is a way to solve it. But not sure
>>>> if it is worth the time spent. Would like to hear from you guys.
>> I think patching all the libraries is a losing battle, I think we need to
>> pursue alternate debugging techniques.
>
> It is really only about patching libraries that create threads _and_
> block all signals in the newly-created thread (to not interfere with the
> program's own handling of the signals).  In this case, the per-thread
> signals (SIGFPE/SIGSEGV/SIGBUS/SIGSYS/SIGILL) should be left unblocked,
> but SIGSYS is often forgotten.

But unless you have a fast way to test third party linked libraries,
I would have to test each one manually. How many libraries are linked
to Qemu today?

>
> I don't think there are many libraries like this, but fixing SPICE at
> least should definitely be welcome.
>
> In fact QEMU's own util/qemu-thread-posix.c does not unblock those
> signals.  Eduardo, can you submit a patch for that?

I sure can.
-- 
Eduardo Otubo
IBM Linux Technology Center

* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-04-29 19:57     ` Eduardo Otubo
@ 2013-04-29 21:06       ` Paolo Bonzini
  0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2013-04-29 21:06 UTC (permalink / raw)
  To: Eduardo Otubo; +Cc: Paul Moore, qemu-devel, Eric Paris

Il 29/04/2013 21:57, Eduardo Otubo ha scritto:
> 
> 
> On 04/26/2013 07:17 PM, Paolo Bonzini wrote:
>> Il 26/04/2013 23:07, Paul Moore ha scritto:
>>>>> 3. Debugging and/or learning mode - third party libraries still
>>>>> have the
>>>>> problem of interfering with Qemu's signal mask. According to some
>>>>> previous discussions, perhaps patching all external libraries that
>>>>> mess
>>>>> with this mask (spice, for example) is a way to solve it. But not sure
>>>>> if it is worth the time spent. Would like to hear from you guys.
>>> I think patching all the libraries is a losing battle, I think we
>>> need to
>>> pursue alternate debugging techniques.
>>
>> It is really only about patching libraries that create threads _and_
>> block all signals in the newly-created thread (to not interfere with the
>> program's own handling of the signals).  In this case, the per-thread
>> signals (SIGFPE/SIGSEGV/SIGBUS/SIGSYS/SIGILL) should be left unblocked,
>> but SIGSYS is often forgotten.
> 
> But unless you have a fast way to test third party linked libraries,
> I would have to test each one manually. How many libraries are linked
> to Qemu today?

I'd wager that most of them are not creating threads.

We could specify our own GThread implementation too (see
https://developer.gnome.org/glib/2.30/glib-Threads.html#GThreadFunctions) that
sets signals correctly, which would cover those libraries that create
threads, but do so via glib.

Another fix would be to set the signal mask around each call to poll.
That's quite expensive however.  There are pselect/ppoll, but Linux
doesn't implement them so they're also expensive.

>> I don't think there are many libraries like this, but fixing SPICE at
>> least should definitely be welcome.
>>
>> In fact QEMU's own util/qemu-thread-posix.c does not unblock those
>> signals.  Eduardo, can you submit a patch for that?
> 
> I sure can.

Thanks,

Paolo

* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-04-26 21:07 ` Paul Moore
  2013-04-26 22:17   ` Paolo Bonzini
  2013-04-29 18:39   ` Eduardo Otubo
@ 2013-04-29 21:52   ` Corey Bryant
  2013-04-30 15:24     ` Paul Moore
  2 siblings, 1 reply; 16+ messages in thread
From: Corey Bryant @ 2013-04-29 21:52 UTC (permalink / raw)
  To: Paul Moore; +Cc: qemu-devel, Eric Paris, Eduardo Otubo

On 04/26/2013 05:07 PM, Paul Moore wrote:
> [snip]
>
>> >3. Debugging and/or learning mode - third party libraries still have the
>> >problem of interfering with Qemu's signal mask. According to some
>> >previous discussions, perhaps patching all external libraries that mess
>> >with this mask (spice, for example) is a way to solve it. But not sure
>> >if it is worth the time spent. Would like to hear from you guys.
> I think patching all the libraries is a losing battle, I think we need to
> pursue alternate debugging techniques.
>
> -- paul moore security and virtualization @ redhat
>

I agree.  It would be nice to have some sort of learning mode that 
reported all denied syscalls on a single run, but signal handlers 
don't seem like the right way.  Maybe we could improve on this
approach, since it never gained traction: https://lkml.org/lkml/2013/1/7/313
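
For context, the signal-handler style of learning mode discussed in this thread
looks roughly like the sketch below. It assumes the filter traps unlisted
syscalls (SECCOMP_RET_TRAP, or SCMP_ACT_TRAP in libseccomp) instead of killing
the process, so that the kernel delivers SIGSYS with si_code set to
SYS_SECCOMP. install_sigsys_logger(), sigsys_count and last_denied_syscall are
illustrative names. The handler only records the number; a thread that has
SIGSYS blocked would not reach it, which is the signal-mask problem raised
earlier.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <string.h>

#ifndef SYS_SECCOMP
#define SYS_SECCOMP 1   /* si_code used for seccomp-generated SIGSYS */
#endif

/* Written from the handler, so keep to async-signal-safe types. */
static volatile sig_atomic_t sigsys_count;
static volatile sig_atomic_t last_denied_syscall = -1;

static void sigsys_handler(int signo, siginfo_t *info, void *ctx)
{
    (void)signo;
    (void)ctx;
    sigsys_count++;
    /* Only a seccomp-generated SIGSYS carries a meaningful syscall number. */
    if (info->si_code == SYS_SECCOMP) {
        last_denied_syscall = info->si_syscall;
    }
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_sigsys_logger(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigsys_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSYS, &sa, NULL);
}
```

A real learning mode would also have to decide what to do after logging, since
the trapped syscall itself has not been executed when the handler returns.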

At least we can get a single denied syscall at a time today via the 
audit log that the kernel issues.  Eduardo, you may want to see if 
there's a good place to document that for QEMU so that people know where 
to look.

-- 
Regards,
Corey Bryant

* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-04-29 18:39   ` Eduardo Otubo
  2013-04-29 19:24     ` Paul Moore
@ 2013-04-29 22:02     ` Corey Bryant
  2013-04-30 18:47       ` Eduardo Otubo
  1 sibling, 1 reply; 16+ messages in thread
From: Corey Bryant @ 2013-04-29 22:02 UTC (permalink / raw)
  To: Eduardo Otubo; +Cc: Paul Moore, qemu-devel, Eric Paris



On 04/29/2013 02:39 PM, Eduardo Otubo wrote:
>
>
> On 04/26/2013 06:07 PM, Paul Moore wrote:
>> On Friday, April 26, 2013 03:39:33 PM Eduardo Otubo wrote:
>>> Hello folks,
>>>
>>> Resuming the sandboxing work, I'd like to ask for comments on the
>>> ideas I have:
>>>
>>> 1. Reduce whitelist to the optimal subset: Run various tests on Qemu
>>> with different configurations to reduce to the smallest syscall set
>>> possible; test and send a patch weekly (this is already being performed
>>> and a patch is on the way)
>>
>> Is this hooked into a testing framework?  While it is always nice to have
>> someone verify the correctness, having a simple tool/testsuite that
>> can run
>> through things on a regular basis is even better.
>
> Unfortunately it is currently not. I'm running the tests manually, but I
> have in mind some ideas to implement a tool for this purpose.
>

How about testing in KVM autotest?  I assume it would be as simple as 
modifying some existing tests to use -sandbox on.  We definitely should 
get some automated regression tests running with seccomp on.

>>
>> Also, looking a bit further ahead, it might be interesting to look at
>> removing
>> some of the arch dependent stuff in qemu-seccomp.c.  The latest
>> version of
>> libseccomp should remove the need for many, if not all, of the arch
>> specific
>> #ifdefs and the next version of libseccomp will add support for x32
>> and ARM.
>
> Tell me more about this. You're saying I can remove the #ifdefs and keep
> the lines like "{ SCMP_SYS(getresuid32), 241 }, " or address these
> syscalls in another way?
>
>>
>>> 2. Introduce a second whitelist - the whitelist should be defined in
>>> libvirt and passed on to qemu or just pre defined in Qemu? Also remove
>>> execve() and avoid open() and socket() and its parameters ...
>>
>> If I'm understanding you correctly, I think what you'll want is a second
>> *blacklist*.  We talked about this previously; we currently have a single
>> whitelist, and considering how seccomp works, you can really only further
>> restrict things after you install a whitelist into the kernel (hence the
>> blacklist).
>
> Yes, that's exactly what I'm planning to do.
>

Hmm, I thought you were going to introduce a completely new whitelist so 
that a guest could optionally be run under:
1) the existing sandbox environment where everything in QEMU works,
*or*
2) a new, tighter and more restricted sandbox environment where
execve() is denied, open() is denied (once the prerequisites are in
place for fd passing), and potentially other "dangerous" syscalls are
denied.

If the whitelist for #2 was passed from libvirt to qemu then libvirt 
could define the syscalls and syscall parameters that are denied.

-- 
Regards,
Corey Bryant

* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-04-29 21:52   ` Corey Bryant
@ 2013-04-30 15:24     ` Paul Moore
  2013-05-01 17:25       ` Eduardo Otubo
  0 siblings, 1 reply; 16+ messages in thread
From: Paul Moore @ 2013-04-30 15:24 UTC (permalink / raw)
  To: Corey Bryant; +Cc: qemu-devel, Eric Paris, Eduardo Otubo

On Monday, April 29, 2013 05:52:10 PM Corey Bryant wrote:
> On 04/26/2013 05:07 PM, Paul Moore wrote:
> > [snip]
> > 
> >> >3. Debugging and/or learning mode - third party libraries still have the
> >> >problem of interfering with Qemu's signal mask. According to some
> >> >previous discussions, perhaps patching all external libraries that mess
> >> >with this mask (spice, for example) is a way to solve it. But not sure
> >> >if it is worth the time spent. Would like to hear from you guys.
> > 
> > I think patching all the libraries is a losing battle, I think we need to
> > pursue alternate debugging techniques.
> 
> I agree.  It would be nice to have some sort of learning mode that
> reported all denied syscalls on a single run, but signal handlers
> don't seem like the right way.  Maybe we could improve on this
> approach, since it never gained traction: https://lkml.org/lkml/2013/1/7/313
> 
> At least we can get a single denied syscall at a time today via the
> audit log that the kernel issues.  Eduardo, you may want to see if
> there's a good place to document that for QEMU so that people know where
> to look.

Lately I've been using the fact that the seccomp BPF filter result generates 
an audit log; it either dumps to syslog or the audit log (depending on your 
configuration) and seems to accomplish most of what we wanted with 
SECCOMP_RET_INFO.

I'm always open to new/better ideas, but this has been working reasonably well 
for me for the past few months.
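
As a rough sketch of how such a record could be consumed: assuming the seccomp
audit record carries a " syscall=<number>" field (the exact record layout
depends on the kernel version and should be checked on the target system), the
denied syscall number can be pulled out of a log line as follows.
audit_record_syscall() is an illustrative helper, not part of QEMU or the audit
tools.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the number that follows " syscall=" in an audit record line,
 * or -1 if the field is missing or has no digits. The leading space in
 * the key avoids matching a longer field name that merely ends in
 * "syscall=", so a field at the very start of the line is not matched.
 */
long audit_record_syscall(const char *line)
{
    static const char key[] = " syscall=";
    const char *p;
    char *end;
    long nr;

    assert(line != NULL);
    p = strstr(line, key);
    if (p == NULL) {
        return -1;
    }
    p += sizeof(key) - 1;
    nr = strtol(p, &end, 10);
    if (end == p) {
        return -1;
    }
    return nr;
}
```

The number still has to be mapped to a name for the architecture in use before
it can be compared against the whitelist.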

-- 
paul moore
security and virtualization @ redhat

* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-04-29 22:02     ` Corey Bryant
@ 2013-04-30 18:47       ` Eduardo Otubo
  2013-04-30 20:28         ` Corey Bryant
  0 siblings, 1 reply; 16+ messages in thread
From: Eduardo Otubo @ 2013-04-30 18:47 UTC (permalink / raw)
  To: Corey Bryant; +Cc: Paul Moore, qemu-devel, Eric Paris



On 04/29/2013 07:02 PM, Corey Bryant wrote:
>
>
> On 04/29/2013 02:39 PM, Eduardo Otubo wrote:
>>
>>
>> On 04/26/2013 06:07 PM, Paul Moore wrote:
>>> On Friday, April 26, 2013 03:39:33 PM Eduardo Otubo wrote:
>>>> Hello folks,
>>>>
>>>> Resuming the sandboxing work, I'd like to ask for comments on the
>>>> ideas I have:
>>>>
>>>> 1. Reduce whitelist to the optimal subset: Run various tests on Qemu
>>>> with different configurations to reduce to the smallest syscall set
>>>> possible; test and send a patch weekly (this is already being performed
>>>> and a patch is on the way)
>>>
>>> Is this hooked into a testing framework?  While it is always nice to
>>> have
>>> someone verify the correctness, having a simple tool/testsuite that
>>> can run
>>> through things on a regular basis is even better.
>>
>> Unfortunately it is currently not. I'm running the tests manually, but I
>> have in mind some ideas to implement a tool for this purpose.
>>
>
> How about testing in KVM autotest?  I assume it would be as simple as
> modifying some existing tests to use -sandbox on.  We definitely should
> get some automated regression tests running with seccomp on.
>
>>>
>>> Also, looking a bit further ahead, it might be interesting to look at
>>> removing
>>> some of the arch dependent stuff in qemu-seccomp.c.  The latest
>>> version of
>>> libseccomp should remove the need for many, if not all, of the arch
>>> specific
>>> #ifdefs and the next version of libseccomp will add support for x32
>>> and ARM.
>>
>> Tell me more about this. You're saying I can remove the #ifdefs and keep
>> the lines like "{ SCMP_SYS(getresuid32), 241 }, " or address these
>> syscalls in another way?
>>
>>>
>>>> 2. Introduce a second whitelist - the whitelist should be defined in
>>>> libvirt and passed on to qemu or just pre defined in Qemu? Also remove
>>>> execve() and avoid open() and socket() and its parameters ...
>>>
>>> If I'm understanding you correctly, I think what you'll want is a second
>>> *blacklist*.  We talked about this previously; we currently have a
>>> single
>>> whitelist, and considering how seccomp works, you can really only
>>> further
>>> restrict things after you install a whitelist into the kernel (hence the
>>> blacklist).
>>
>> Yes, that's exactly what I'm planning to do.
>>
>
> Hmm, I thought you were going to introduce a completely new whitelist so
> that a guest could optionally be run under:
> 1) the existing sandbox environment where everything in QEMU works,
> *or*
> 2) a new, tighter and more restricted sandbox environment where
> execve() is denied, open() is denied (once the prerequisites are in
> place for fd passing), and potentially other "dangerous" syscalls are
> denied.

I think we're talking about the same thing here. I believe the execution 
flow will happen like this: 1) first whitelist installed, only a few
syscalls allowed. 2) qemu starts 3) given the current scenario (the 
current list of syscalls allowed) the second *blacklist* is installed, 
denying execve and open. 4) start guests.

At the end of step 3, we'll have the same environment we have at step 1, 
without execve and open. Is that correct?

>
> If the whitelist for #2 was passed from libvirt to qemu then libvirt
> could define the syscalls and syscall parameters that are denied.
>

-- 
Eduardo Otubo
IBM Linux Technology Center

* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-04-30 18:47       ` Eduardo Otubo
@ 2013-04-30 20:28         ` Corey Bryant
  2013-05-01 14:13           ` Paul Moore
  0 siblings, 1 reply; 16+ messages in thread
From: Corey Bryant @ 2013-04-30 20:28 UTC (permalink / raw)
  To: Eduardo Otubo; +Cc: Paul Moore, qemu-devel, Eric Paris



On 04/30/2013 02:47 PM, Eduardo Otubo wrote:
>
>
> On 04/29/2013 07:02 PM, Corey Bryant wrote:
>>
>>
>> On 04/29/2013 02:39 PM, Eduardo Otubo wrote:
>>>
>>>
>>> On 04/26/2013 06:07 PM, Paul Moore wrote:
>>>> On Friday, April 26, 2013 03:39:33 PM Eduardo Otubo wrote:
>>>>> Hello folks,
>>>>>
>>>>> Resuming the sandboxing work, I'd like to ask for comments on the
>>>>> ideas I have:
>>>>>
>>>>> 1. Reduce whitelist to the optimal subset: Run various tests on Qemu
>>>>> with different configurations to reduce to the smallest syscall set
>>>>> possible; test and send a patch weekly (this is already being
>>>>> performed
>>>>> and a patch is on the way)
>>>>
>>>> Is this hooked into a testing framework?  While it is always nice to
>>>> have
>>>> someone verify the correctness, having a simple tool/testsuite that
>>>> can run
>>>> through things on a regular basis is even better.
>>>
>>> Unfortunately it is currently not. I'm running the tests manually, but I
>>> have in mind some ideas to implement a tool for this purpose.
>>>
>>
>> How about testing in KVM autotest?  I assume it would be as simple as
>> modifying some existing tests to use -sandbox on.  We definitely should
>> get some automated regression tests running with seccomp on.
>>
>>>>
>>>> Also, looking a bit further ahead, it might be interesting to look at
>>>> removing
>>>> some of the arch dependent stuff in qemu-seccomp.c.  The latest
>>>> version of
>>>> libseccomp should remove the need for many, if not all, of the arch
>>>> specific
>>>> #ifdefs and the next version of libseccomp will add support for x32
>>>> and ARM.
>>>
>>> Tell me more about this. You're saying I can remove the #ifdefs and keep
>>> the lines like "{ SCMP_SYS(getresuid32), 241 }, " or address these
>>> syscalls in another way?
>>>
>>>>
>>>>> 2. Introduce a second whitelist - the whitelist should be defined in
>>>>> libvirt and passed on to qemu or just pre defined in Qemu? Also remove
>>>>> execve() and avoid open() and socket() and its parameters ...
>>>>
>>>> If I'm understanding you correctly, I think what you'll want is a
>>>> second
>>>> *blacklist*.  We talked about this previously; we currently have a
>>>> single
>>>> whitelist, and considering how seccomp works, you can really only
>>>> further
>>>> restrict things after you install a whitelist into the kernel (hence
>>>> the
>>>> blacklist).
>>>
>>> Yes, that's exactly what I'm planning to do.
>>>
>>
>> Hmm, I thought you were going to introduce a completely new whitelist so
>> that a guest could optionally be run under:
>> 1) the existing sandbox environment where everything in QEMU works,
>> *or*
>> 2) a new, tighter and more restricted sandbox environment where
>> execve() is denied, open() is denied (once the prerequisites are in
>> place for fd passing), and potentially other "dangerous" syscalls are
>> denied.
>
> I think we're talking about the same thing here. I believe the execution

I think so, but I'm not entirely sure.

> flow will happen like this: 1) first whitelist installed, only a few
> syscalls allowed. 2) qemu starts 3) given the current scenario (the
> current list of syscalls allowed) the second *blacklist* is installed,
> denying execve and open. 4) start guests.

Yes, you could implement the new whitelist this way.

>
> At the end of step 3, we'll have the same environment we have at step 1,
> without execve and open. Is that correct?
>
>>
>> If the whitelist for #2 was passed from libvirt to qemu then libvirt
>> could define the syscalls and syscall parameters that are denied.
>>
>

Just to be clear, I'm thinking you could launch guests in one of two 
different seccomp sandboxed environments:

1) Using the existing and more permissive whitelist where every QEMU 
feature works:

qemu-kvm -sandbox on,default

2) A more restricted whitelist environment that doesn't allow all QEMU 
features to work.  It would be limited to the whitelist in 1 and it 
would also deny things like execve(), open(), socket(), certain ioctl() 
parameters, and may only allow reads/writes to specific fds, and/or block
anything else that could be dangerous:

qemu-kvm -sandbox on,restricted

I'm just throwing these command line options and syscalls out there. 
And maybe it makes more sense for libvirt to pass the syscalls and 
parameters to QEMU so that libvirt can determine the parameters to 
restrict, like fd's the guest is allowed to read/write.
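
One common way to hand an already-open descriptor to another process, so that
the receiver never needs open() itself, is an SCM_RIGHTS control message over
a Unix domain socket. The sketch below assumes that mechanism; send_fd() and
recv_fd() are illustrative helpers, and nothing here is taken from the libvirt
or QEMU sources.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctrl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd over the Unix socket sock. Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char byte = 0;                      /* one byte of ordinary payload */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(sock >= 0 && fd >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock. Returns the new fd, or -1 on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    assert(sock >= 0);
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
    return fd;
}
```

With this split, the management side keeps the right to open files and the
sandboxed side only needs sendmsg()/recvmsg() plus read/write on the
descriptors it was given.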

Here's another thread where this was discussed:
http://www.redhat.com/archives/libvir-list/2013-April/msg01501.html

-- 
Regards,
Corey Bryant

* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-04-30 20:28         ` Corey Bryant
@ 2013-05-01 14:13           ` Paul Moore
  2013-05-01 15:30             ` Corey Bryant
  0 siblings, 1 reply; 16+ messages in thread
From: Paul Moore @ 2013-05-01 14:13 UTC (permalink / raw)
  To: Corey Bryant; +Cc: qemu-devel, Eric Paris, Eduardo Otubo

On Tuesday, April 30, 2013 04:28:54 PM Corey Bryant wrote:
> Just to be clear, I'm thinking you could launch guests in one of two
> different seccomp sandboxed environments:
> 
> 1) Using the existing and more permissive whitelist where every QEMU
> feature works:
> 
> qemu-kvm -sandbox on,default

In general, I like the comma delimited list of sandbox filters/methods/etc. 
but I'm not sure we need to explicitly specify "default", it seems like "on" 
would be sufficient.  It also preserves compatibility with what we have now.
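
A small sketch of how such a comma-delimited argument could be parsed while
keeping a bare "on" meaning the existing whitelist. The "default" and
"restricted" keywords are only the proposals from this thread, and
parse_sandbox_arg() and enum sandbox_mode are hypothetical names, not QEMU's
option parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum sandbox_mode { SANDBOX_OFF, SANDBOX_DEFAULT, SANDBOX_RESTRICTED };

/* True if the first len bytes of s are exactly the string word. */
static bool token_is(const char *s, size_t len, const char *word)
{
    return strlen(word) == len && strncmp(s, word, len) == 0;
}

/*
 * Parse "on", "off", "on,default" or "on,restricted" into *mode.
 * Returns 0 on success and -1 for an empty, unknown or contradictory
 * argument (for example "off,restricted").
 */
int parse_sandbox_arg(const char *arg, enum sandbox_mode *mode)
{
    bool first = true;

    assert(arg != NULL && mode != NULL);
    *mode = SANDBOX_OFF;
    while (*arg != '\0') {
        size_t len = strcspn(arg, ",");

        if (first) {
            /* The first token switches the sandbox on or off. */
            if (token_is(arg, len, "on")) {
                *mode = SANDBOX_DEFAULT;
            } else if (!token_is(arg, len, "off")) {
                return -1;
            }
            first = false;
        } else if (*mode == SANDBOX_OFF) {
            return -1;              /* no filter names after "off" */
        } else if (token_is(arg, len, "default")) {
            *mode = SANDBOX_DEFAULT;
        } else if (token_is(arg, len, "restricted")) {
            *mode = SANDBOX_RESTRICTED;
        } else {
            return -1;
        }
        arg += len;
        if (*arg == ',') {
            arg++;
        }
    }
    return first ? -1 : 0;
}
```

Because "on" alone already selects SANDBOX_DEFAULT, existing command lines keep
their current behaviour and "on,default" is just a more explicit spelling.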

> 2) A more restricted whitelist environment that doesn't allow all QEMU
> features to work.  It would be limited to the whitelist in 1 and it
> would also deny things like execve(), open(), socket(), certain ioctl()
> parameters, and may only allow reads/writes to specific fds, and/or block
> anything else that could be dangerous:
> 
> qemu-kvm -sandbox on,restricted
> 
> I'm just throwing these command line options and syscalls out there.
> And maybe it makes more sense for libvirt to pass the syscalls and
> parameters to QEMU so that libvirt can determine the parameters to
> restrict, like fd's the guest is allowed to read/write.
> 
> Here's another thread where this was discussed:
> http://www.redhat.com/archives/libvir-list/2013-April/msg01501.html

Seems reasonable to me.

-- 
paul moore
security and virtualization @ redhat

* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-05-01 14:13           ` Paul Moore
@ 2013-05-01 15:30             ` Corey Bryant
  0 siblings, 0 replies; 16+ messages in thread
From: Corey Bryant @ 2013-05-01 15:30 UTC (permalink / raw)
  To: Paul Moore; +Cc: qemu-devel, Eric Paris, Eduardo Otubo



On 05/01/2013 10:13 AM, Paul Moore wrote:
> On Tuesday, April 30, 2013 04:28:54 PM Corey Bryant wrote:
>> Just to be clear, I'm thinking you could launch guests in one of two
>> different seccomp sandboxed environments:
>>
>> 1) Using the existing and more permissive whitelist where every QEMU
>> feature works:
>>
>> qemu-kvm -sandbox on,default
>
> In general, I like the comma delimited list of sandbox filters/methods/etc.
> but I'm not sure we need to explicitly specify "default", it seems like "on"
> would be sufficient.  It also preserves compatibility with what we have now.
>

Yes, I agree.  This should definitely remain backward compatible.
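
As a rough illustration of the backward-compatible parsing discussed above,
here is a minimal sketch in Python. It is not QEMU's option parser:
parse_sandbox_arg() and KNOWN_FILTERS are hypothetical names, and the filter
names "default" and "restricted" are only the ones floated in this thread.

```python
# Filter names proposed in this thread; assumed, not an existing QEMU API.
KNOWN_FILTERS = ("default", "restricted")


def parse_sandbox_arg(value):
    """Parse a -sandbox argument such as "on", "on,restricted" or "off".

    Returns (enabled, filters), where filters is a list of filter names.
    """
    state, *filters = value.split(",")
    if state not in ("on", "off"):
        raise ValueError("expected 'on' or 'off', got %r" % state)
    if state == "off":
        if filters:
            raise ValueError("filters are meaningless with sandbox off")
        return False, []
    unknown = [name for name in filters if name not in KNOWN_FILTERS]
    if unknown:
        raise ValueError("unknown sandbox filter(s): %s" % ", ".join(unknown))
    # A bare "on" keeps today's behaviour: the existing whitelist.
    return True, filters or ["default"]
```

With this shape, "-sandbox on" and "-sandbox on,default" mean the same thing,
so existing command lines keep working.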

-- 
Regards,
Corey Bryant


* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-04-30 15:24     ` Paul Moore
@ 2013-05-01 17:25       ` Eduardo Otubo
  2013-05-01 18:04         ` Corey Bryant
  0 siblings, 1 reply; 16+ messages in thread
From: Eduardo Otubo @ 2013-05-01 17:25 UTC (permalink / raw)
  To: Paul Moore; +Cc: Paolo Bonzini, Corey Bryant, qemu-devel, Eric Paris



On 04/30/2013 12:24 PM, Paul Moore wrote:
> On Monday, April 29, 2013 05:52:10 PM Corey Bryant wrote:
>> On 04/26/2013 05:07 PM, Paul Moore wrote:
>>> [snip]
>>>
>>>>> 3. Debugging and/or learning mode - third party libraries still have the
>>>>> problem of interfering in the Qemu's signal mask. According to some
>>>>> previous discussions, perhaps patching all external libraries that mess
>>>>> with this mask (spice, for example) is a way to solve it. But not sure
>>>>> if it is worth the time spent. Would like to hear from you guys.
>>>
>>> I think patching all the libraries is a losing battle, I think we need to
>>> pursue alternate debugging techniques.
>>
>> I agree.  It would be nice to have some sort of learning mode that
>> reported all denied syscalls on a single run, but signal handlers
>> don't seem like the right way.  Maybe we could improve on this
>> approach, since it never gained traction: https://lkml.org/lkml/2013/1/7/313
>>
>> At least we can get a single denied syscall at a time today via the
>> audit log that the kernel issues.  Eduardo, you may want to see if
>> there's a good place to document that for QEMU so that people know where
>> to look.
>
> Lately I've been using the fact that the seccomp BPF filter result generates
> an audit log; it either dumps to syslog or the audit log (depending on your
> configuration) and seems to accomplish most of what we wanted with
> SECCOMP_RET_INFO.
>
> I'm always open to new/better ideas, but this has been working reasonably well
> for me for the past few months.

I think this feature would fit well in Qemu if we could have "normal" 
signal handling, but external libraries interfere a lot in this matter.

Paolo, am I the first one to complain about signal handling in Qemu 
(being interfered with by other libraries)? I believe this may cause some 
trouble in other parts of the project as well. Wouldn't this be a good 
time to, perhaps, just think about a signal handling refactoring?

Regards,

-- 
Eduardo Otubo
IBM Linux Technology Center


* Re: [Qemu-devel] [RFC] Continuous work on sandboxing
  2013-05-01 17:25       ` Eduardo Otubo
@ 2013-05-01 18:04         ` Corey Bryant
  0 siblings, 0 replies; 16+ messages in thread
From: Corey Bryant @ 2013-05-01 18:04 UTC (permalink / raw)
  To: Eduardo Otubo; +Cc: Paul Moore, Paolo Bonzini, qemu-devel, Eric Paris



On 05/01/2013 01:25 PM, Eduardo Otubo wrote:
>
>
> On 04/30/2013 12:24 PM, Paul Moore wrote:
>> On Monday, April 29, 2013 05:52:10 PM Corey Bryant wrote:
>>> On 04/26/2013 05:07 PM, Paul Moore wrote:
>>>> [snip]
>>>>
>>>>>> 3. Debugging and/or learning mode - third party libraries still
>>>>>> have the
>>>>>> problem of interfering in the Qemu's signal mask. According to some
>>>>>> previous discussions, perhaps patching all external libraries that
>>>>>> mess
>>>>>> with this mask (spice, for example) is a way to solve it. But not
>>>>>> sure
>>>>>> if it is worth the time spent. Would like to hear from you guys.
>>>>
>>>> I think patching all the libraries is a losing battle, I think we
>>>> need to
>>>> pursue alternate debugging techniques.
>>>
>>> I agree.  It would be nice to have some sort of learning mode that
>>> reported all denied syscalls on a single run, but signal handlers
>>> don't seem like the right way.  Maybe we could improve on this
>>> approach, since it never gained traction:
>>> https://lkml.org/lkml/2013/1/7/313
>>>
>>> At least we can get a single denied syscall at a time today via the
>>> audit log that the kernel issues.  Eduardo, you may want to see if
>>> there's a good place to document that for QEMU so that people know where
>>> to look.
>>
>> Lately I've been using the fact that the seccomp BPF filter result
>> generates
>> an audit log; it either dumps to syslog or the audit log (depending on
>> your
>> configuration) and seems to accomplish most of what we wanted with
>> SECCOMP_RET_INFO.
>>
>> I'm always open to new/better ideas, but this has been working
>> reasonably well
>> for me for the past few months.
>
> I think this feature would fit well in Qemu if we could have "normal"
> signal handling, but external libraries interfere a lot in this matter.
>
> Paolo, am I the first one to complain about signal handling in Qemu
> (being interfered with by other libraries)? I believe this may cause some
> trouble in other parts of the project as well. Wouldn't this be a good
> time to, perhaps, just think about a signal handling refactoring?
>

You don't need signal handling to use what Paul was talking about above.
I think that should be enough for -sandbox purposes, but perhaps you
could document it somewhere for QEMU.
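
As a starting point for such documentation, here is a minimal sketch in Python
of pulling denied syscall numbers out of the audit log. It assumes seccomp
denials appear as key=value records tagged type=SECCOMP (or the numeric
type=1326 when they land in syslog) with a syscall= field. The exact set of
fields varies by kernel version, and denied_syscalls() is a hypothetical
helper, not part of QEMU or the audit tools.

```python
import re

# Matches key=value pairs; values may be double-quoted (e.g. comm="qemu-kvm").
_FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def denied_syscalls(lines):
    """Return the sorted, unique syscall numbers found in seccomp records."""
    numbers = set()
    for line in lines:
        # Skip records that are not seccomp events.
        if "type=SECCOMP" not in line and "type=1326" not in line:
            continue
        fields = dict(_FIELD.findall(line))
        if "syscall" in fields:
            numbers.add(int(fields["syscall"]))
    return sorted(numbers)
```

Since the guest is normally killed on the first denied syscall, each run adds
at most one new number, so the whitelist has to be grown one run at a time.
The numbers are architecture-specific and still need to be mapped back to
syscall names.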

-- 
Regards,
Corey Bryant


end of thread, other threads:[~2013-05-01 18:04 UTC | newest]

Thread overview: 16+ messages
2013-04-26 18:39 [Qemu-devel] [RFC] Continuous work on sandboxing Eduardo Otubo
2013-04-26 21:07 ` Paul Moore
2013-04-26 22:17   ` Paolo Bonzini
2013-04-29 19:57     ` Eduardo Otubo
2013-04-29 21:06       ` Paolo Bonzini
2013-04-29 18:39   ` Eduardo Otubo
2013-04-29 19:24     ` Paul Moore
2013-04-29 22:02     ` Corey Bryant
2013-04-30 18:47       ` Eduardo Otubo
2013-04-30 20:28         ` Corey Bryant
2013-05-01 14:13           ` Paul Moore
2013-05-01 15:30             ` Corey Bryant
2013-04-29 21:52   ` Corey Bryant
2013-04-30 15:24     ` Paul Moore
2013-05-01 17:25       ` Eduardo Otubo
2013-05-01 18:04         ` Corey Bryant
