From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:38177) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SuFDC-0008T3-SW for qemu-devel@nongnu.org; Wed, 25 Jul 2012 23:58:08 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SuFDB-0003J9-Fv for qemu-devel@nongnu.org; Wed, 25 Jul 2012 23:58:06 -0400 Received: from e32.co.us.ibm.com ([32.97.110.150]:38158) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SuFDB-0003Hv-6R for qemu-devel@nongnu.org; Wed, 25 Jul 2012 23:58:05 -0400 Received: from /spool/local by e32.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 25 Jul 2012 21:57:42 -0600 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by d03dlp01.boulder.ibm.com (Postfix) with ESMTP id 3D8FA1FF001A for ; Thu, 26 Jul 2012 03:57:37 +0000 (WET) Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by d03relay04.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q6Q3vdXo149306 for ; Wed, 25 Jul 2012 21:57:39 -0600 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q6Q3wnVV014861 for ; Wed, 25 Jul 2012 21:58:49 -0600 Message-ID: <5010C031.9040602@linux.vnet.ibm.com> Date: Wed, 25 Jul 2012 23:57:37 -0400 From: Corey Bryant MIME-Version: 1.0 References: <1343048885-1701-1-git-send-email-coreyb@linux.vnet.ibm.com> <1343048885-1701-7-git-send-email-coreyb@linux.vnet.ibm.com> <50104C44.1050206@redhat.com> In-Reply-To: <50104C44.1050206@redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Eric Blake Cc: kwolf@redhat.com, aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com, libvir-list@redhat.com, qemu-devel@nongnu.org, lcapitulino@redhat.com On 07/25/2012 03:43 PM, Eric Blake wrote: > On 07/23/2012 07:08 AM, Corey Bryant wrote: >> When qemu_open is passed a filename of the "/dev/fdset/nnn" >> format (where nnn is the fdset ID), an fd with matching access >> mode flags will be searched for within the specified monitor >> fd set. If the fd is found, a dup of the fd will be returned >> from qemu_open. >> >> Each fd set has a reference count. The purpose of the reference >> count is to determine if an fd set contains file descriptors that >> have open dup() references that have not yet been closed. It is >> incremented on qemu_open and decremented on qemu_close. It is >> not until the refcount is zero that file desriptors in an fd set >> can be closed. If an fd set has dup() references open, then we >> must keep the other fds in the fd set open in case a reopen >> of the file occurs that requires an fd with a different access >> mode. >> > >> +++ b/monitor.c >> @@ -2551,6 +2551,91 @@ static void monitor_fdsets_set_in_use(Monitor *mon, bool in_use) >> } >> } >> >> +void monitor_fdset_increment_refcount(Monitor *mon, int64_t fdset_id) >> +{ >> + mon_fdset_t *mon_fdset; >> + >> + if (!mon) { >> + return; >> + } > > Am I reading this code right by stating that 'if there is no monitor, we > don't increment the refcount'? How does a monitor reattach affect > things? Or am I missing something fundamental about the cases when > 'mon==NULL' will exist? > Yes you're reading this correctly. I'm pretty sure that mon will only be NULL if QEMU is started without a monitor. If QEMU has a monitor, and libvirt disconnects it's connection to the qemu monitor, then I believe mon will remain non-NULL. I'll plan on testing this out to verify though. (I'm out most of this week and will be back full time starting next Tues.) >> +int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags) >> +{ >> + mon_fdset_t *mon_fdset; >> + mon_fdset_fd_t *mon_fdset_fd; >> + int mon_fd_flags; >> + >> + if (!mon) { >> + errno = ENOENT; >> + return -1; >> + } >> + >> + QLIST_FOREACH(mon_fdset, &mon->fdsets, next) { >> + if (mon_fdset->id != fdset_id) { >> + continue; >> + } >> + QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) { >> + if (mon_fdset_fd->removed) { >> + continue; >> + } >> + >> + mon_fd_flags = fcntl(mon_fdset_fd->fd, F_GETFL); >> + if (mon_fd_flags == -1) { >> + return -1; > > This says we fail on the first fcntl() failure, instead of trying other > fds in the set. Granted, an fcntl() failure is probably the sign of a > bigger bug (such as closing an fd at the wrong point in time), so I > guess trying to go on doesn't make much sense once we already know we > are hosed. > I think I'll stick with it the way it is. If fcntl() fails we might have a tainted fd set so I think we should fail. >> + } >> + >> + switch (flags & O_ACCMODE) { >> + case O_RDWR: >> + if ((mon_fd_flags & O_ACCMODE) == O_RDWR) { >> + return mon_fdset_fd->fd; >> + } >> + break; >> + case O_RDONLY: >> + if ((mon_fd_flags & O_ACCMODE) == O_RDONLY) { >> + return mon_fdset_fd->fd; >> + } >> + break; > > Do we want to allow the case where the caller asked for O_RDONLY, but > the set only has O_RDWR? After all, the caller is getting a compatible > subset of what the set offers. > I don't see a problem with it. >> + case O_WRONLY: >> + if ((mon_fd_flags & O_ACCMODE) == O_WRONLY) { >> + return mon_fdset_fd->fd; >> + } >> + break; > > Likewise, should we allow a caller asking for O_WRONLY when the set > provides only O_RDWR? > I don't see a problem with it. >> >> +/* >> + * Dups an fd and sets the flags >> + */ >> +static int qemu_dup(int fd, int flags) >> +{ >> + int i; >> + int ret; >> + int serrno; >> + int dup_flags; >> + int setfl_flags[] = { O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME, >> + O_NONBLOCK, 0 }; >> + >> + if (flags & O_CLOEXEC) { >> + ret = fcntl(fd, F_DUPFD_CLOEXEC, 0); > > F_DUPFD_CLOEXEC is required by POSIX but not implemented on all modern > OS yet; you probably need some #ifdef and/or configure guards. > Ok >> + if (ret == -1 && errno == EINVAL) { >> + ret = dup(fd); >> + if (ret != -1 && fcntl_setfl(ret, O_CLOEXEC) == -1) { > > You _can't_ call F_SETFL with O_CLOEXEC. O_CLOEXEC only causes open() > to set FD_CLOEXEC; thereafter, including in the case of this dup, what > you want to do is instead set FD_CLOEXEC via F_SETFD (aka call > qemu_set_cloexec, not fcntl_setfl). > I know, this is a mistake. I'm planning to replace fcntl_setfl() here with qemu_set_cloexec(). >> + goto fail; >> + } >> + } >> + } else { >> + ret = dup(fd); >> + } >> + >> + if (ret == -1) { >> + goto fail; >> + } >> + >> + dup_flags = fcntl(ret, F_GETFL); >> + if (dup_flags == -1) { >> + goto fail; >> + } >> + >> + if ((flags & O_SYNC) != (dup_flags & O_SYNC)) { >> + errno = EINVAL; >> + goto fail; >> + } >> + >> + /* Set/unset flags that we can with fcntl */ >> + i = 0; >> + while (setfl_flags[i] != 0) { >> + if (flags & setfl_flags[i]) { >> + dup_flags = (dup_flags | setfl_flags[i]); >> + } else { >> + dup_flags = (dup_flags & ~setfl_flags[i]); >> + } >> + i++; >> + } > > Rather than looping one bit at a time, you should do this as a mask > operation. > I agree. >> + >> + if (fcntl(ret, F_SETFL, dup_flags) == -1) { >> + goto fail; >> + } >> + >> + /* Truncate the file in the cases that open() would truncate it */ >> + if (flags & O_TRUNC || >> + ((flags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))) { >> + if (ftruncate(ret, 0) == -1) { >> + goto fail; >> + } >> + } >> + >> + qemu_set_cloexec(ret); > > If we're going to blindly set FD_CLOEXEC at the end of the day, rather > than try to honor O_CLOEXEC, then why not simplify the beginning of this > function: This call to qemu_set_cloexec() was a mistake. I'm planning on removing it. > > ret = fcntl(fd, F_DUPFD_CLOEXEC, 0); > if (ret == -1 && errno == EINVAL) { > ret = dup(fd); > if (ret != -1) { > qemu_set_cloexec(ret); > } > } > if (ret == -1) { > goto fail; > } > I'll plan on sticking with the existing code in the beginning of this function with the modifications mentioned above. -- Regards, Corey