From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:45195) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RCBok-0007xQ-7F for qemu-devel@nongnu.org; Fri, 07 Oct 2011 10:54:31 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1RCBoi-0004hd-TK for qemu-devel@nongnu.org; Fri, 07 Oct 2011 10:54:30 -0400
Received: from e4.ny.us.ibm.com ([32.97.182.144]:46399) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RCBoi-0004hM-Jm for qemu-devel@nongnu.org; Fri, 07 Oct 2011 10:54:28 -0400
Received: from /spool/local by e4.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 7 Oct 2011 10:53:11 -0400
Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by d01relay03.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p97Er2NB237890 for ; Fri, 7 Oct 2011 10:53:03 -0400
Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1]) by d03av03.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p97Eqehq000353 for ; Fri, 7 Oct 2011 08:52:41 -0600
Message-ID: <4E8F1235.7060102@linux.vnet.ibm.com>
Date: Fri, 07 Oct 2011 10:52:37 -0400
From: Corey Bryant
MIME-Version: 1.0
References: <1317915508-15491-1-git-send-email-rmarwah@linux.vnet.ibm.com>
 <1317915508-15491-2-git-send-email-rmarwah@linux.vnet.ibm.com>
 <20111006164148.GF2450@redhat.com>
 <4E8DEDA5.3050209@us.ibm.com>
 <4E8DF5C0.8080001@linux.vnet.ibm.com>
 <20111007090415.GB31228@redhat.com>
 <4E8F0F78.3000800@linux.vnet.ibm.com>
 <20111007144536.GH31228@redhat.com>
In-Reply-To: <20111007144536.GH31228@redhat.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 1/4] Add basic version of bridge helper
To: "Daniel P. 
Berrange"
Cc: Anthony Liguori , Richa Marwaha , qemu-devel@nongnu.org

On 10/07/2011 10:45 AM, Daniel P. Berrange wrote:
> On Fri, Oct 07, 2011 at 10:40:56AM -0400, Corey Bryant wrote:
>>
>>
>> On 10/07/2011 05:04 AM, Daniel P. Berrange wrote:
>>> On Thu, Oct 06, 2011 at 02:38:56PM -0400, Corey Bryant wrote:
>>>>
>>>>
>>>> On 10/06/2011 02:04 PM, Anthony Liguori wrote:
>>>>> On 10/06/2011 11:41 AM, Daniel P. Berrange wrote:
>>>>>> On Thu, Oct 06, 2011 at 11:38:25AM -0400, Richa Marwaha wrote:
>>>>>>> This patch adds a helper that can be used to create a tap device
>>>>>>> attached to a bridge device. Since this helper is minimal in what
>>>>>>> it does, it can be given CAP_NET_ADMIN which allows qemu to avoid
>>>>>>> running as root while still satisfying the majority of what users
>>>>>>> tend to want to do with tap devices.
>>>>>>>
>>>>>>> The way this all works is that qemu launches this helper passing a
>>>>>>> bridge name and the name of an inherited file descriptor. The
>>>>>>> descriptor is one end of a socketpair() of domain sockets. This
>>>>>>> domain socket is used to transmit a file descriptor of the opened
>>>>>>> tap device from the helper to qemu.
>>>>>>>
>>>>>>> The helper can then exit and let qemu use the tap device.
>>>>>>
>>>>>> When QEMU is run by libvirt, we generally like to use capng to
>>>>>> remove the ability for QEMU to run setuid programs at all. So
>>>>>> obviously it will struggle to run the qemu-bridge-helper binary
>>>>>> in such a scenario.
>>>>>>
>>>>>> With the way you transmit the TAP device FD back to the caller,
>>>>>> it looks like libvirt itself could execute the qemu-bridge-helper
>>>>>> receiving the FD, and then pass the FD onto QEMU using the
>>>>>> traditional tap,fd=XX syntax.
>>>>>
>>>>> Exactly. This would allow tap-based networking using libvirt session://
>>>>> URIs.
>>>>>
>>>>
>>>> I'll take note of this. It seems like it would be a nice future
>>>> addition to libvirt.
>>>>
>>>> A slight tangent, but a point on DAC isolation. The helper enables
>>>> DAC isolation for qemu:///session but we still need some work in
>>>> libvirt to provide DAC isolation for qemu:///system. This could be
>>>> done by allowing management applications to specify custom
>>>> user/group IDs when creating guests rather than hard coding the IDs
>>>> in the configuration file.
>>>
>>> Yes, this is an item on our todo list for libvirt. There are a couple of
>>> work items involved
>>>
>>>  - Extend the XML to allow multiple <seclabel> elements, one per
>>>    security driver in use.
>>>  - Add a new API to allow fetching of live seclabel data per
>>>    security driver
>>>  - Extend the current DAC security driver to automatically allocate
>>>    UIDs from an admin defined range, and/or pull them from the XML
>>>    provided by app.
>>>
>>> Technically we could do item 3, without doing items 1/2, but that would
>>> necessitate *not* using the sVirt security driver. I don't think that's
>>> too useful, so items 1/2 let us use both the sVirt & enhanced DAC driver
>>> at the same time.
>>>
>>
>> I think I'm missing something here and could use some more details
>> to understand 1 & 2. Here's what I'm currently picturing.
>>
>> With DAC isolation:
>> QEMU A runs under userA:groupA and QEMU B runs under userB:groupB
>>
>> versus currently:
>> QEMU A runs under qemu:qemu and QEMU B runs under qemu:qemu
>>
>> In either case, guests A and B have separate domain XML and a single
>> unique seclabel, such as this dynamic SELinux label:
>>
>>
>>
>>     system_u:object_r:svirt_image_t:s0:c633,c712
>>
>
> If we're going to make the DAC user ID/group ID configurable, then we
> need to expose this to applications in the XML so that
>
>  a. apps can allocate unique user/group *cluster wide* when shared
>     filesystems are in use. libvirt can only ensure per-host uniqueness.
>
>  b. apps can know what user/group ID has been allocated to each guest
>     and this can be reported in virsh dominfo, as with svirt info.
>
> ie, we'll need something like this:
>
>
>
>     system_u:object_r:svirt_image_t:s0:c633,c712
>
>
>
>     102:102
>
>
>
> And:
>
>   # virsh dominfo f16x86_64
>   Id:             29
>   Name:           f16x86_64
>   UUID:           1e9f3097-0a45-ea06-d0d8-40507999a1cd
>   OS Type:        hvm
>   State:          running
>   CPU(s):         1
>   CPU time:       19.5s
>   Max memory:     819200 kB
>   Used memory:    819200 kB
>   Persistent:     yes
>   Autostart:      disable
>   Security model: selinux
>   Security DOI:   0
>   Security label: system_u:system_r:svirt_t:s0:c244,c424 (permissive)
>   Security model: dac
>   Security DOI:   0
>   Security label: 102:102 (enforcing)
>
> Regards,
> Daniel

Ah, yes. That makes complete sense. Thanks for the clarification.

--
Regards,
Corey
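
For reference, the FD handoff described in the quoted patch summary (the helper opens the device, sends its descriptor over one end of a socketpair(), then exits) can be sketched roughly as below. This is only an illustration under stated assumptions, not the qemu-bridge-helper code: it is written in Python rather than C, both ends live in one process instead of a forked helper, an ordinary temporary file stands in for the tap device (opening a real one needs CAP_NET_ADMIN), and the names send_fd, recv_fd and demo are made up for the example.

```python
import array
import os
import socket
import tempfile


def send_fd(sock: socket.socket, fd: int) -> None:
    """Send one open file descriptor over a Unix domain socket (SCM_RIGHTS)."""
    fds = array.array("i", [fd])
    # One byte of ordinary data accompanies the ancillary message.
    sock.sendmsg([b"\0"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds.tobytes())])


def recv_fd(sock: socket.socket) -> int:
    """Receive one file descriptor that was sent with send_fd()."""
    fds = array.array("i")
    _data, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Drop any trailing bytes that do not form a whole int.
            fds.frombytes(cdata[: len(cdata) - (len(cdata) % fds.itemsize)])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def demo() -> bytes:
    # helper_end plays the helper, qemu_end plays qemu (hypothetical names).
    helper_end, qemu_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    # Stand-in for the opened tap device: a temporary file with known content.
    with tempfile.TemporaryFile() as f:
        f.write(b"tap0")
        f.flush()
        send_fd(helper_end, f.fileno())
    # The "helper" has now closed its copy of the file and can go away;
    # the descriptor in flight keeps the open file alive.
    helper_end.close()

    fd = recv_fd(qemu_end)
    qemu_end.close()
    try:
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 16)
    finally:
        os.close(fd)
```

Calling demo() reads back through the received descriptor the bytes that were written before the handoff, which is the property the helper relies on: the receiver ends up with its own descriptor for the same open file even though the sender has already closed its copy. The same receive side is what libvirt would need if it ran the helper itself and then handed the FD to QEMU with tap,fd=XX.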