From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:47770) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TVV80-0004rs-E9 for qemu-devel@nongnu.org; Mon, 05 Nov 2012 17:26:46 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1TVV7w-0002Jk-83 for qemu-devel@nongnu.org; Mon, 05 Nov 2012 17:26:44 -0500
Received: from e34.co.us.ibm.com ([32.97.110.152]:35304) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TVV7w-0002J7-1Y for qemu-devel@nongnu.org; Mon, 05 Nov 2012 17:26:40 -0500
Received: from /spool/local by e34.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 5 Nov 2012 15:26:34 -0700
Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by d03dlp01.boulder.ibm.com (Postfix) with ESMTP id 5EA681FF0042 for ; Mon, 5 Nov 2012 15:26:29 -0700 (MST)
Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167]) by d03relay04.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id qA5MQV9V241884 for ; Mon, 5 Nov 2012 15:26:31 -0700
Received: from d03av01.boulder.ibm.com (loopback [127.0.0.1]) by d03av01.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id qA5MQUEY032156 for ; Mon, 5 Nov 2012 15:26:31 -0700
Message-ID: <50983D13.3070901@linux.vnet.ibm.com>
Date: Mon, 05 Nov 2012 17:26:27 -0500
From: Corey Bryant
MIME-Version: 1.0
References: <1350971732-16621-1-git-send-email-otubo@linux.vnet.ibm.com> <1613380.rRprVCBI9z@sifl> <5097CFB2.1060104@linux.vnet.ibm.com> <676080355.TcBOoEia2G@sifl>
In-Reply-To: <676080355.TcBOoEia2G@sifl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCHv2 3/4] Support for "double whitelist" filters
To: Paul Moore
Cc: aliguori@us.ibm.com, qemu-devel@nongnu.org, Eduardo
 Otubo

On 11/05/2012 04:58 PM, Paul Moore wrote:
> On Monday, November 05, 2012 09:39:46 AM Corey Bryant wrote:
>> On 11/02/2012 06:14 PM, Paul Moore wrote:
>>> On Friday, November 02, 2012 06:00:29 PM Corey Bryant wrote:
>>>> On 11/02/2012 05:29 PM, Paul Moore wrote:
>>>>> On Tuesday, October 23, 2012 03:55:31 AM Eduardo Otubo wrote:
>>>>>> This patch includes a second whitelist right before the main loop. It's
>>>>>> a smaller and more restricted whitelist, excluding execve() among many
>>>>>> others.
>>>>>>
>>>>>> v2: * ctx changed to main_loop_ctx
>>>>>>
>>>>>>     * seccomp_on now inside ifdef
>>>>>>     * open syscall added to the main_loop whitelist
>>>>>>
>>>>>> Signed-off-by: Eduardo Otubo
>>>>>
>>>>> Unfortunately qemu.org seems to be down for me today so I can't grab the
>>>>> latest repo to review/verify this patch (some of my comments/assumptions
>>>>> below may be off) but I'm a little confused, hopefully you guys can help
>>>>> me out, read below ...
>>>>>
>>>>> The first call to seccomp_install_filter() will setup a whitelist for
>>>>> the
>>>>> syscalls that have been explicitly specified, all others will hit the
>>>>> default action TRAP/KILL. The second call to seccomp_install_filter()
>>>>> will add a second whitelist for another set of explicitly specified
>>>>> syscalls, all others will hit the default action TRAP/KILL.
>>>>
>>>> That's correct. The goal was to have a 2nd list that is a subset of the
>>>> 1st list, and also not include execve() in the 2nd list. At this point
>>>> though, since it's late in the release, we've expanded the 2nd list to
>>>> be the same as the 1st with the exception of execve() not being in the
>>>> 2nd list.
>>>>
>>>>> The problem occurs when the filters are executed in the kernel when a
>>>>> syscall is executed.
>>>>> On each syscall the first filter will be executed
>>>>> and the action will either be ALLOW or TRAP/KILL, next the second filter
>>>>> will be executed and the action will either be ALLOW or TRAP/KILL; since
>>>>> the kernel always takes the most restrictive (lowest integer action
>>>>> value) action when multiple filters are specified, I think your double
>>>>> whitelist value is going to have some inherent problems.
>>>>
>>>> That's something I hadn't thought of. But TRAP and KILL won't exist
>>>> together in our whitelists, and our 2nd whitelist is a subset of the
>>>> 1st. So do you think there would still be problems?
>>>
>>> It doesn't really matter if the default action is TRAP and/or KILL, the
>>> point is that if you use a second whitelist after an initial whitelist
>>> the effective seccomp filter is going to be only the syscalls you
>>> explicitly allowed in the second whitelist. When using multiple seccomp
>>> filters on a process, all filters are executed for each syscall and the
>>> most restrictive action of all the filters is the action that the kernel
>>> takes.
>>>
>>> Don't get me wrong, I like the idea of progressively restricting QEMU, but
>>> if you are going to load multiple seccomp filters into the kernel, you
>>> almost certainly only want the first whitelist filter to be the union of
>>> all the seccomp filter you intend to load with all subsequent filters
>>> being blacklists which progressively remove syscalls which are allowed by
>>> the initial whitelist.
>>
>> That's what we're doing though. The first whitelist is a union of all
>> subsequent filters. Of course there's only one subsequent filter at
>> this point. But the idea is to start out with a large whitelist for
>> initialization and then tighten it up before the main loop when
>> presumably less syscalls are needed.
>
> Okay, that's good ...
> It still seems a bit odd to me, I think a whitelist 1st
> blacklist 2nd is a more intuitive and efficient solution but that may just be
> me.
>

I missed the blacklist point on this before. Yes, that makes more sense
for the 2nd list. We'll try that out.

-- 
Regards,
Corey Bryant
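
The rule discussed in this thread (every loaded filter runs on each syscall, and the kernel takes the lowest action value) can be sketched as a small toy model. This is only an illustration and does not call the kernel or libseccomp. The three action values are the kernel's SECCOMP_RET_KILL, SECCOMP_RET_TRAP and SECCOMP_RET_ALLOW constants; the helper names (make_filter, effective_action) and the shortened syscall lists are hypothetical, not QEMU's real ones.

```python
# Toy model of stacked seccomp filters. Not real seccomp code.
# Action values mirror the kernel's SECCOMP_RET_* constants; a lower
# value is a more restrictive action.
RET_KILL = 0x00000000
RET_TRAP = 0x00030000
RET_ALLOW = 0x7FFF0000


def make_filter(syscalls, match_action, default_action):
    """Build a toy filter: listed syscalls get match_action, all others default_action."""
    listed = frozenset(syscalls)

    def run(name):
        return match_action if name in listed else default_action

    return run


def effective_action(filters, name):
    """Run every filter on the syscall and keep the lowest (most restrictive) action."""
    return min((f(name) for f in filters), default=RET_ALLOW)


# Hypothetical, heavily shortened lists for illustration only.
INIT_LIST = ["read", "write", "open", "execve"]
MAIN_LOOP_LIST = ["read", "write", "open"]

first_whitelist = make_filter(INIT_LIST, RET_ALLOW, RET_KILL)
second_whitelist = make_filter(MAIN_LOOP_LIST, RET_ALLOW, RET_KILL)
execve_blacklist = make_filter(["execve"], RET_KILL, RET_ALLOW)

# The two schemes compared in the thread.
double_whitelist = [first_whitelist, second_whitelist]
whitelist_then_blacklist = [first_whitelist, execve_blacklist]

for name in INIT_LIST + ["socket"]:
    a = "ALLOW" if effective_action(double_whitelist, name) == RET_ALLOW else "KILL"
    b = "ALLOW" if effective_action(whitelist_then_blacklist, name) == RET_ALLOW else "KILL"
    print(f"{name:8} double-whitelist={a:5} whitelist+blacklist={b}")
```

In this model both schemes give the same effective policy as long as the second whitelist is a subset of the first. The difference is that the blacklist variant only has to name the syscalls being removed (here just execve), whereas a second whitelist must repeat every syscall that stays allowed.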