Message-ID: <4FFBDF94.6000303@redhat.com>
Date: Tue, 10 Jul 2012 09:53:56 +0200
From: Kevin Wolf
References: <1340390174-7493-1-git-send-email-coreyb@linux.vnet.ibm.com>
 <20120626091004.GA14451@redhat.com> <4FE9A0F0.2050809@redhat.com>
 <20120626175045.2c7011b3@doriath.home>
 <4FEA37A9.10707@linux.vnet.ibm.com> <4FEA3D9C.8080205@redhat.com>
 <4FF21A67.8010100@linux.vnet.ibm.com>
 <4FF31265.1000308@linux.vnet.ibm.com> <4FF316C9.5020100@redhat.com>
 <4FF31CFD.7030508@linux.vnet.ibm.com> <4FF325C8.4060401@redhat.com>
 <4FF33004.5030909@linux.vnet.ibm.com> <4FF33349.10404@redhat.com>
 <4FF3381D.40101@linux.vnet.ibm.com> <4FF3FA22.6090400@redhat.com>
 <4FF5AD90.8000305@linux.vnet.ibm.com>
 <20120709110510.12214347@doriath.home>
 <4FFAF334.9000807@linux.vnet.ibm.com> <4FFAFCB8.8020508@redhat.com>
 <4FFB1657.1090405@linux.vnet.ibm.com>
In-Reply-To: <4FFB1657.1090405@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
To: Corey Bryant
Cc: aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com,
 libvir-list@redhat.com, qemu-devel@nongnu.org, Luiz Capitulino,
 pbonzini@redhat.com, Eric Blake

On 09.07.2012 19:35, Corey Bryant wrote:
>
>
> On 07/09/2012 11:46 AM, Kevin Wolf wrote:
>> On 09.07.2012 17:05, Corey Bryant wrote:
>>> I'm not sure this is an issue with current design. I know things have
>>> changed a bit as the email threads evolved, so I'll paste the current
>>> design that I am working from. Please let me know if you still see any
>>> issues.
>>>
>>> FD passing:
>>> -----------
>>> New monitor commands enable adding/removing an fd to/from a set. New
>>> monitor command query-fdsets enables querying of current monitor fdsets.
>>> The set of fds should all refer to the same file, with each fd having
>>> different access flags (e.g. O_RDWR, O_RDONLY). qemu_open can then dup
>>> the fd that has the matching access mode flags.
>>>
>>> Design points:
>>> --------------
>>> 1. add-fd
>>> -> fd is passed via SCM_RIGHTS and qemu adds fd to first unused fdset
>>> (e.g. /dev/fdset/1)
>>> -> add-fd monitor function initializes the monitor inuse flag for the
>>> fdset to true
>>> -> add-fd monitor function initializes the remove flag for the fd to false
>>> -> add-fd returns fdset number and received fd number (e.g. fd=3) to caller
>>>
>>> 2. drive_add file=/dev/fdset/1
>>> -> qemu_open uses the first fd in fdset1 that has access flags matching
>>> the qemu_open action flags and has remove flag set to false
>>> -> qemu_open increments refcount for the fdset
>>> -> Need to make sure that if a command like 'device-add' fails that
>>> refcount is not incremented
>>>
>>> 3. add-fd fdset=1
>>> -> fd is passed via SCM_RIGHTS
>>> -> add-fd monitor function adds the received fd to the specified fdset
>>> (or fails if fdset doesn't exist)
>>> -> add-fd monitor function initializes the remove flag for the fd to false
>>> -> add-fd returns fdset number and received fd number (e.g. fd=4) to caller
>>>
>>> 4. block-commit
>>> -> qemu_open performs "reopen" by using the first fd from the fdset that
>>> has access flags matching the qemu_open action flags and has remove flag
>>> set to false
>>> -> qemu_open increments refcount for the fdset
>>> -> Need to make sure that if a command like 'block-commit' fails that
>>> refcount is not incremented
>>>
>>> 5. remove-fd fdset=1 fd=4
>>> -> remove-fd monitor function fails if fdset doesn't exist
>>> -> remove-fd monitor function turns on remove flag for fd=4
>>
>> What was again the reason why we keep removed fds in the fdset at all?
>
> Because if refcount is > 0 for the fd set, then the fd could be in use
> by a block device. So we keep it around until refcount is decremented
> to zero, at which point it is safe to close.
>
>>
>> The removed flag would make sense for a fdset after a hypothetical
>> close-fdset call because the fdset needs to be kept around until the
>> last user closes it, but I think removed fds can be deleted immediately.
>
> fds in an fd set really need to be kept around until zero block devices
> reference them. At that point, if '(refcount == 0 && (!inuse ||
> remove))' is true, then we'll officially close the fd.

Block devices don't reference an fd in the fdset. There are two
references in a block device. The first one is obviously the file
descriptor the device is using; it is an fd dup()ed from an fd in the
fdset, but it is now independent of it. The other reference is the file
name that is kept in the BlockDriverState, and it always points to
"/dev/fdset/X", that is, the whole fdset instead of a single fd.

If you remove a file descriptor from an fdset that is in use, you can no
longer reopen the fdset with the flags of the removed file descriptor,
which I believe is exactly the expected behaviour. libvirt would use
this to revoke r/w access, for example (and you already provide this
behaviour by checking the remove flag in qemu_open).

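To illustrate the point about the two references, here is a minimal sketch in Python rather than QEMU code. It assumes a Unix host; the functions open_from_fdset and remove_fd are hypothetical stand-ins for qemu_open and remove-fd, and the fdset is modelled as a plain list of fds. It only shows that a dup()ed descriptor survives closing the original, and that a later reopen with the revoked access mode fails.

```python
import fcntl
import os
import tempfile


def access_mode(fd):
    """Return the access mode (O_RDONLY, O_WRONLY or O_RDWR) of an open fd."""
    return fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_ACCMODE


def open_from_fdset(fds, flags):
    """Hypothetical qemu_open stand-in: dup the first fd whose access mode matches."""
    for fd in fds:
        if access_mode(fd) == (flags & os.O_ACCMODE):
            return os.dup(fd)
    raise PermissionError("no fd in the fdset matches the requested access mode")


def remove_fd(fds, fd):
    """Hypothetical remove-fd stand-in that drops and closes the fd immediately."""
    fds.remove(fd)
    os.close(fd)


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "image")  # hypothetical image file
        with open(path, "wb") as f:
            f.write(b"disk")
        ro_fd = os.open(path, os.O_RDONLY)
        rw_fd = os.open(path, os.O_RDWR)
        fds = [ro_fd, rw_fd]  # the fdset: same file, different access modes

        dev_fd = open_from_fdset(fds, os.O_RDWR)  # the block device's own fd
        remove_fd(fds, rw_fd)                     # revoke r/w access

        written = os.write(dev_fd, b"x")          # the dup()ed fd still works
        try:
            os.close(open_from_fdset(fds, os.O_RDWR))
            reopen_rw_ok = True
        except PermissionError:
            reopen_rw_ok = False                  # r/w reopen is now refused
        new_ro = open_from_fdset(fds, os.O_RDONLY)
        reopen_ro_ok = access_mode(new_ro) == os.O_RDONLY

        for fd in (dev_fd, new_ro, ro_fd):
            os.close(fd)
        return written, reopen_rw_ok, reopen_ro_ok


print(demo())
```

In this model the removed fd is closed right away and nothing the device already holds is affected, which is the behaviour argued for above.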
Are there any other use cases where it makes a difference whether a file
descriptor is kept in the fdset with removed=1 or whether it's actually
removed from the fdset?

>> I think I might have confused remove-fd and close-fdset in earlier
>> emails in this thread, so I hope this isn't inconsistent with what I
>> said before.
>>
>
> Ok no problem.
>
>>> 6. qemu_close (need to replace all close calls in block layer with
>>> qemu_close)
>>> -> qemu_close decrements refcount for fdset
>>> -> qemu_close closes all fds that have (refcount == 0 && (!inuse || remove))
>>> -> qemu_close frees the fdset if no fds remain in it
>>>
>>> 7. disconnecting the QMP monitor
>>> -> monitor disconnect visits all fdsets on monitor and turns off monitor
>>> in-use flag for fdset
>>
>> And close all fds with refcount == 0.
>>
>
> Yes, this makes sense.
>
> It also makes sense to close removed fds with refcount == 0 in the
> remove-fd function. Basically this will be the same thing we do in
> qemu_close. We'll close any fds that evaluate the following as true:
>
> (refcount == 0 && (!inuse || remove))

Yes, whatever condition we end up with, it should be the same one
everywhere, checked in every place where its value might change.

Kevin
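One way to keep the condition identical in all places is to put it in a single helper that every state-changing path calls. The following is only a sketch of that idea in Python, not QEMU code: the class and function names are hypothetical, no real descriptors are closed, and the condition used is the one quoted above, which may still change.

```python
from dataclasses import dataclass, field


@dataclass
class Fd:
    number: int
    removed: bool = False  # set by remove-fd


@dataclass
class FdSet:
    fds: list = field(default_factory=list)
    refcount: int = 0      # raised by qemu_open, lowered by qemu_close
    inuse: bool = True     # cleared when the monitor disconnects


def should_close(fdset, fd):
    """The single place where the close condition is written down."""
    return fdset.refcount == 0 and (not fdset.inuse or fd.removed)


def cleanup(fdset):
    """Drop every fd that satisfies the condition; return the dropped numbers."""
    closed = [fd.number for fd in fdset.fds if should_close(fdset, fd)]
    fdset.fds = [fd for fd in fdset.fds if not should_close(fdset, fd)]
    return closed


def qemu_close(fdset):
    fdset.refcount -= 1
    return cleanup(fdset)


def remove_fd(fdset, number):
    for fd in fdset.fds:
        if fd.number == number:
            fd.removed = True
    return cleanup(fdset)


def monitor_disconnect(fdset):
    fdset.inuse = False
    return cleanup(fdset)


fdset = FdSet(fds=[Fd(3), Fd(4)], refcount=1)
print(remove_fd(fdset, 4))        # nothing dropped while refcount is 1
print(qemu_close(fdset))          # refcount reaches 0, removed fd 4 is dropped
print(monitor_disconnect(fdset))  # monitor gone, remaining fd 3 is dropped
```

Because qemu_close, remove_fd and monitor_disconnect all end in the same cleanup call, changing the condition later means editing should_close only.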