From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:45735) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Sfbss-0007qy-Ut for qemu-devel@nongnu.org; Fri, 15 Jun 2012 15:08:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Sfbsq-0000PI-IZ for qemu-devel@nongnu.org; Fri, 15 Jun 2012 15:08:38 -0400 Received: from e3.ny.us.ibm.com ([32.97.182.143]:41556) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Sfbsq-0000OB-DI for qemu-devel@nongnu.org; Fri, 15 Jun 2012 15:08:36 -0400 Received: from /spool/local by e3.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 15 Jun 2012 15:08:31 -0400 Received: from d01relay07.pok.ibm.com (d01relay07.pok.ibm.com [9.56.227.147]) by d01dlp01.pok.ibm.com (Postfix) with ESMTP id BFF3438C8105 for ; Fri, 15 Jun 2012 15:03:05 -0400 (EDT) Received: from d01av02.pok.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by d01relay07.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q5FJ33YX30998668 for ; Fri, 15 Jun 2012 15:03:03 -0400 Received: from d01av02.pok.ibm.com (loopback [127.0.0.1]) by d01av02.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q5FJ33WY002231 for ; Fri, 15 Jun 2012 16:03:03 -0300 Message-ID: <4FDB86E0.4070307@linux.vnet.ibm.com> Date: Fri, 15 Jun 2012 15:02:56 -0400 From: Corey Bryant MIME-Version: 1.0 References: <1339689305-27031-1-git-send-email-coreyb@linux.vnet.ibm.com> <1339689305-27031-4-git-send-email-coreyb@linux.vnet.ibm.com> <4FDB51E8.8060406@redhat.com> <4FDB7C1B.8020802@linux.vnet.ibm.com> <4FDB8218.1010804@redhat.com> In-Reply-To: <4FDB8218.1010804@redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v3 3/5] osdep: Enable qemu_open to dup pre-opened fd List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Eric Blake Cc: kwolf@redhat.com, aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com, libvir-list@redhat.com, qemu-devel@nongnu.org, lcapitulino@redhat.com, pbonzini@redhat.com On 06/15/2012 02:42 PM, Eric Blake wrote: > On 06/15/2012 12:16 PM, Corey Bryant wrote: > >>> I think you need to honor flags so that the end use of the fd will be as >>> if qemu had directly opened the file, rather than just doing a blind dup >>> with a resulting fd that is in a different state than the caller >>> expected. I can think of at least the following cases (there may be >>> more): >> >> I was thinking libvirt would handle all the flag settings on open >> (obviously since that's how I coded it). I think you're right with this >> approach though as QEMU will re-open the same file various times with >> different flags. > > How will libvirt know whether qemu wanted to open a file with O_TRUNCATE > vs. reusing an entire file, or with O_NONBLOCK or not, and so forth? I > think there are just too many qmeu_open() calls with different flags to > expect libvirt to know how to pre-open all possible fds in such a manner > that /dev/fd/nnn will be a valid replacement for what qemu would have > done, in the cases where qemu can instead correct flags itself. > I agree completely. >> >> There are some flags that I don't think we'll be able to change. For >> example: O_RDONLY, O_WRONLY, O_RDWR. I assume libvirt would open all >> files O_RDWR. > > I agree that this can't be changed, so this is one case where libvirt > will have to know what the file will used for. But it is also a case > where qemu can at least check whether the fd passed in has sufficient > permissions (fcntl(F_GETFL)) for what qemu wanted to use, and should > error out here rather than silently succeed here then have a weird EBADF > failure down the road when the dup'd fd is finally used. You are right > that libvirt should always be safe in passing in an O_RDWR fd for > whatever qemu wants, although that may represent security holes (there's > reasons for O_RDONLY); and in cases where libvirt is instead passing in > one end of a pipe, the fd will necessarily be O_RDONLY or O_WRONLY > (since you can't have an O_RDWR pipe). Good points. In addition to flag setting, I'll add some flag checking for those flags that can't be set (ie. O_RDONLY, O_WRONLY). > >>> Oh, and are we using MSG_CMSG_CLOEXEC where possible (and where not >>> possible, falling back to fcntl(F_GETFD/F_SETFD) to set FD_CLOEXEC) on >>> all fds received by 'getfd' and 'pass-fd'? I can't think of any reason >>> why 'migrate fd:name' would need to be inheritable, and in the case of >>> /dev/fd/ parsing, while the dup() result may need to be inheritable, the >>> original that we are dup'ing from should most certainly be cloexec. >> >> It doesn't look like we use MSG_CMSG_CLOEXEC anywhere in QEMU. I don't >> think we can modify getfd at this point (compatibility) but we could >> update pass-fd to set it. It may make more sense to set it with >> fcntl(FD_CLOEXEC) in qmp_pass_fd(). > > That's not atomic. If we don't care about atomicity (for example, if we > can guarantee that no other thread in qemu can possibly fork/exec in > between the monitor thread receiving an fd then converting it to > cloexec, based on how we hold a mutex), then that's fine. OR, if we > make sure _all_ fork() calls sanitize themselves, by closing all fds in > the getfd/pass-fd list prior to calling exec(), then we don't have to > even worry about cloexec (but then you have to worry about locking the > getfd name list, since locking a list to iterate it is not an async-safe > action and probably can't be done between fork() and exec()). > Otherwise, using MSG_CMSG_CLOEXEC is the only safe way to avoid leaking > unintended fds into another thread's child process. Ok I'm sold. I'll first look into setting MSG_CMSG_CLOEXEC. > >> >>> >>> if (flags & O_NONBLOCK) >>> use fcntl(F_GETFL/F_SETFL) to set O_NONBLOCK >>> else >>> use fcntl(F_GETFL/F_SETFL) to clear O_NONBLOCK >>> >>> or maybe we document that callers of pass-fd must always pass fds with >>> O_NONBLOCK clear instead of clearing it ourselves. Or maybe we make >>> sure part of the process of tying name with fd in the lookup list of >>> named fds is determining the current O_NONBLOCK state in case future >>> qemu_open() need it in the opposite state. >> >> Just documenting it seems error-prone. Why not just set/clear it based >> on the flag passed to qemu_open? > > Yep, back to my argument that making libvirt have to know every context > that qemu will want to open, and what flags it would be using in those > contexts, is painful compared to having qemu just do the right thing in > the first place, or gracefully erroring out right away at the point of > the qemu_open(/dev/fd/nnn). > I agree. -- Regards, Corey