From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:57989) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Sm7Bu-00010a-HT for qemu-devel@nongnu.org; Tue, 03 Jul 2012 13:47:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Sm7Bn-00055Q-Lm for qemu-devel@nongnu.org; Tue, 03 Jul 2012 13:47:10 -0400 Received: from e9.ny.us.ibm.com ([32.97.182.139]:47098) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Sm7Bn-00054m-Ha for qemu-devel@nongnu.org; Tue, 03 Jul 2012 13:47:03 -0400 Received: from /spool/local by e9.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 3 Jul 2012 13:46:59 -0400 Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234]) by d01dlp01.pok.ibm.com (Postfix) with ESMTP id A6F8338C8080 for ; Tue, 3 Jul 2012 13:46:56 -0400 (EDT) Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by d01relay02.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q63Hkuv4399578 for ; Tue, 3 Jul 2012 13:46:56 -0400 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q63Hlm9o019725 for ; Tue, 3 Jul 2012 11:47:48 -0600 Message-ID: <4FF33004.5030909@linux.vnet.ibm.com> Date: Tue, 03 Jul 2012 13:46:44 -0400 From: Corey Bryant MIME-Version: 1.0 References: <1340390174-7493-1-git-send-email-coreyb@linux.vnet.ibm.com> <20120626091004.GA14451@redhat.com> <4FE9A0F0.2050809@redhat.com> <20120626175045.2c7011b3@doriath.home> <4FEA37A9.10707@linux.vnet.ibm.com> <4FEA3D9C.8080205@redhat.com> <4FF21A67.8010100@linux.vnet.ibm.com> <4FF31265.1000308@linux.vnet.ibm.com> <4FF316C9.5020100@redhat.com> <4FF31CFD.7030508@linux.vnet.ibm.com> <4FF325C8.4060401@redhat.com> In-Reply-To: <4FF325C8.4060401@redhat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Eric Blake Cc: Kevin Wolf , aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com, libvir-list@redhat.com, qemu-devel@nongnu.org, Luiz Capitulino , pbonzini@redhat.com On 07/03/2012 01:03 PM, Eric Blake wrote: > On 07/03/2012 10:25 AM, Corey Bryant wrote: > >>> I thought qemu would rather return the number of the fdset (which it >>> also assigns if none it passed, i.e. for fdset creation). Does libvirt >>> need the number of an individual fd? >>> >>> If libvirt prefers to assign fdset numbers itself, I'm not against it, >>> it's just something that wasn't clear to me yet. >>> >> >> That's fine. QEMU can return the fdset number or a string >> (/dev/fdset/1) if none is specified. And an fdset will need to be >> specified if adding to an existing set. >> >> I think libvirt will need the fd returned by add-fd so that it can >> evaluate fds returned by query-fd. It's also useful for remove-fd. > > Correct - since we will be adding a remove-fd, then that command needs > to know both the fdset name and the individual fd within the set to be > removed. > Ok >> >>>> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the >>>> set that has access flags matching the qemu_open action flags. >>>> qemu_open increments refcount for this fd. >>>> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named >>>> "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5). libvirt >>>> in-use flag turned on for fd. >>>> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first >>>> fd from the set that has access flags matching the qemu_open action >>>> flags. qemu_open increments refcount for this fd. >>>> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the >>>> set. turns libvirt in-use flag off marking the fd ready to be closed >>>> when qemu is done with it. >>> >>> If we decided to not return the individual fd numbers to libvirt, file >>> descriptors would be uniquely identified by an fdset/flags pair here. >>> >> >> Are you saying we'd pass the fdset name and flags parameters on >> remove-fd to somehow identify the fds to remove? > > Passing the flag parameters is not trivial, as that would mean the QMP > code would have to define constants mapping to all of the O_* flags that > qemu_open supports. It's easier to support closing by fd number. > I understand what you were saying now, although I guess it's not applicable at this point. I'll plan on returning the fd from add-fd and passing the fd on the remove-fd command. >> >>>> 5. qemu_close decrements refcount for fd, and closes fd when refcount is >>>> zero and libvirt in use flag is off. >>> >>> The monitor could just hold another reference, then we save the >>> additional flag. But that's a qemu implementation detail. >>> >> >> I'm not sure I understand what you mean. > > pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset, > with initial use count of 1 (the use is the monitor). qemu_open() > increments the use count. A new qemu_close() wrapper would decrement > the use count. And both calling 'remove-fd', or closing the QMP monitor > of an fd that has not yet been passed through 'remove-fd', serves as a > way to decrement the use count. You'd still have to track whether the > monitor is using an fd (to avoid over-decrementing on QMP monitor > close), but by having the monitor's use also tracked under the refcount, > then refcount reaching 0 is sufficient to auto-close an fd. I think > that also means that re-establishing the client QMP connection would > increment For some examples: Yes, I think adding a +1 to the refcount for the monitor makes sense. I'm a bit unsure how to increment the refcount when a monitor reconnects though. Maybe it is as simple as adding a +1 to each fd's refcount when the next QMP monitor connects. > > 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in > use by monitor, as member of fdset1 > 2. client crashes, so all tracked fds are visited; fd=4 had not yet been > passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is > now 0 so qemu closes it Just a note that the fd above also hasn't yet been referenced by a drive-add/device-add, so it will be closed in step 2. > > 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in > use by monitor, as member of fdset1 > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > so qemu_open() increments the refcount to 2 > 3. client crashes, so all tracked fds are visited; fd=4 had not yet been > passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4 > open because it is still in use by the block device > 4. client re-establishes QMP connection, and 'query-fds' lets client > learn about fd=4 still being open as part of fdset1, but also informs > client that fd is not in use by the monitor And in step 4 the QMP connection will increment the refcount +1 for all fds that persisted through the QMP disconnect. (?) > > 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in > use by monitor, as member of fdset1 > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > so qemu_open() increments the refcount to 2 > 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no > longer in use by the monitor, refcount decremented to 1 but still left > open because it is in use by the block device > 4. client crashes, so all tracked fds are visited; but fd=4 is already > marked as not in use by the monitor, so its refcount is unchanged > > 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in > use by monitor, as member of fdset1 > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > but the command fails for some other reason, so the refcount is still 1 > at the end of the command (although it may have been temporarily > incremented then decremented during the command) > 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or > QMP connection is closed), so qemu marks fd=4 as no longer in use by the > monitor, refcount is now decremented to 0 and fd=4 is closed > > I think that covers the idea; you need a bool in_use for tracking > monitor state (the monitor is in use until either a remove-fd or a > monitor connection closes), as well as a ref-count. > Yes, it all makes sense to me. Thanks for the scenarios. >>> We also need a query-fdsets command that lists all fdsets that exist. If >>> we add information about single fds to the return value of it, we >>> probably don't need a separate query-fd that operates on a single fdset. >>> >> >> Yes, good point. And maybe we don't need 2 commands. query-fdsets >> could return all the sets and all the fds that are in those sets. > > Yes, I think a single query command is good enough here, something like: > > { "execute":"query-fdsets" } => > { "return" : { "sets": [ > { "name": "fdset1", > "fds": [ { "fd": 4, "monitor": true, "refcount": 1 } ] }, > { "name": "fdset2", > "fds": [ { "fd": 5, "monitor": false, "refcount": 1 }, > { "fd": 6, "monitor": true, "refcount": 2 } ] } ] } } > > Ok, thanks! >>> In use by whom? If it's still in use in qemu (as in "in-use flag would >>> be set") and we have a refcount of zero, then that's a bug. >>> >> >> In use by qemu. I don't think it's a bug. I think there are situations >> where refcount gets to zero but qemu is still using the fd. > > I think the refcount being non-zero _is_ what defines an fd as being in > use by qemu (including use by the monitor). Any place you have to close > an fd before reopening it is dangerous; the safe way is always to open > with the new permissions before closing the old permissions. > Maybe Kevin wants to weigh in on this. Perhaps it's an issue that can be separated from my patch series. -- Regards, Corey