From: Corey Bryant <coreyb@linux.vnet.ibm.com>
To: Eric Blake <eblake@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com,
	libvir-list@redhat.com, qemu-devel@nongnu.org,
	Luiz Capitulino <lcapitulino@redhat.com>,
	pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
Date: Tue, 03 Jul 2012 13:46:44 -0400
Message-ID: <4FF33004.5030909@linux.vnet.ibm.com>
In-Reply-To: <4FF325C8.4060401@redhat.com>



On 07/03/2012 01:03 PM, Eric Blake wrote:
> On 07/03/2012 10:25 AM, Corey Bryant wrote:
>
>>> I thought qemu would rather return the number of the fdset (which it
>>> also assigns if none is passed, i.e. for fdset creation). Does libvirt
>>> need the number of an individual fd?
>>>
>>> If libvirt prefers to assign fdset numbers itself, I'm not against it,
>>> it's just something that wasn't clear to me yet.
>>>
>>
>> That's fine.  QEMU can return the fdset number or a string
>> (/dev/fdset/1) if none is specified.  And an fdset will need to be
>> specified if adding to an existing set.
>>
>> I think libvirt will need the fd returned by add-fd so that it can
>> evaluate fds returned by query-fd.  It's also useful for remove-fd.
>
> Correct - since we will be adding a remove-fd, then that command needs
> to know both the fdset name and the individual fd within the set to be
> removed.
>

Ok

>>
>>>> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the
>>>> set that has access flags matching the qemu_open action flags.
>>>> qemu_open increments refcount for this fd.
>>>> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named
>>>> "/dev/fdset/1" - command returns qemu fd to caller (e.g. fd=5).  libvirt
>>>> in-use flag turned on for fd.
>>>> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first
>>>> fd from the set that has access flags matching the qemu_open action
>>>> flags.  qemu_open increments refcount for this fd.
>>>> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the
>>>> set.  turns libvirt in-use flag off marking the fd ready to be closed
>>>> when qemu is done with it.
>>>
>>> If we decided to not return the individual fd numbers to libvirt, file
>>> descriptors would be uniquely identified by an fdset/flags pair here.
>>>
>>
>> Are you saying we'd pass the fdset name and flags parameters on
>> remove-fd to somehow identify the fds to remove?
>
> Passing the flag parameters is not trivial, as that would mean the QMP
> code would have to define constants mapping to all of the O_* flags that
> qemu_open supports.  It's easier to support closing by fd number.
>

I understand what you were saying now, although I guess it's not 
applicable at this point.  I'll plan on returning the fd from add-fd and 
passing the fd on the remove-fd command.
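
To make the flag matching from step 2 above concrete, here is a rough
Python sketch (a toy model, not QEMU code) of picking the first fd in a
set whose access mode matches the flags given to qemu_open.  The
(fd, flags) list layout and the pick_fd name are made up for
illustration; only os.O_ACCMODE and the O_* constants are real, and
os.O_ACCMODE is assumed to be available (it is on Linux).

```python
import os


def pick_fd(fdset, open_flags):
    """Return the first fd whose access mode matches open_flags, else None.

    fdset is a list of (fd, flags) pairs. Both the layout and this helper
    are hypothetical, not QEMU's actual data structures.
    """
    wanted = open_flags & os.O_ACCMODE
    for fd, flags in fdset:
        if (flags & os.O_ACCMODE) == wanted:
            return fd
    return None


# Hypothetical set "/dev/fdset/1": fd 4 was opened read-only, fd 5 read-write.
fdset1 = [(4, os.O_RDONLY), (5, os.O_RDWR)]
```

Since add-fd hands the fd number back to the caller, remove-fd can name
the fd directly and never has to describe it by flags.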

>>
>>>> 5. qemu_close decrements refcount for fd, and closes fd when refcount is
>>>> zero and libvirt in-use flag is off.
>>>
>>> The monitor could just hold another reference, then we save the
>>> additional flag. But that's a qemu implementation detail.
>>>
>>
>> I'm not sure I understand what you mean.
>
> pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset,
> with initial use count of 1 (the use is the monitor).  qemu_open()
> increments the use count.  A new qemu_close() wrapper would decrement
> the use count.  Both calling 'remove-fd' and closing the QMP monitor
> while an fd has not yet been passed through 'remove-fd' serve as ways
> to decrement the use count.  You'd still have to track whether the
> monitor is using an fd (to avoid over-decrementing on QMP monitor
> close), but by having the monitor's use also tracked under the refcount,
> then refcount reaching 0 is sufficient to auto-close an fd.  I think
> that also means that re-establishing the client QMP connection would
> increment the refcount.  For some examples:

Yes, I think adding a +1 to the refcount for the monitor makes sense.

I'm a bit unsure how to increment the refcount when a monitor reconnects 
though.  Maybe it is as simple as adding a +1 to each fd's refcount when 
the next QMP monitor connects.

>
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client crashes, so all tracked fds are visited; fd=4 had not yet been
> passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
> now 0 so qemu closes it

Just a note that the fd above also hasn't yet been referenced by a 
drive-add/device-add, so it will be closed in step 2.

>
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> so qemu_open() increments the refcount to 2
> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been
> passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
> open because it is still in use by the block device
> 4. client re-establishes QMP connection, and 'query-fds' lets client
> learn about fd=4 still being open as part of fdset1, but also informs
> client that fd is not in use by the monitor

So in step 4, would the new QMP connection add 1 to the refcount of
every fd that persisted through the QMP disconnect?
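
One possible reading of that reconnect idea, as a rough Python sketch.
This is only an assumption about how it might work, not settled
behaviour: the dict layout and the on_monitor_connect name are
hypothetical.  The in_use check is what keeps a second call from adding
another reference.

```python
def on_monitor_connect(tracked):
    """Give every surviving fd a fresh monitor reference (hypothetical)."""
    for entry in tracked:
        if not entry["in_use"]:
            entry["in_use"] = True
            entry["refcount"] += 1


# fd=4 survived a QMP disconnect because a block device still holds it.
tracked = [{"fd": 4, "refcount": 1, "in_use": False}]
on_monitor_connect(tracked)
```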

>
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> so qemu_open() increments the refcount to 2
> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
> longer in use by the monitor, refcount decremented to 1 but still left
> open because it is in use by the block device
> 4. client crashes, so all tracked fds are visited; but fd=4 is already
> marked as not in use by the monitor, so its refcount is unchanged
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> but the command fails for some other reason, so the refcount is still 1
> at the end of the command (although it may have been temporarily
> incremented then decremented during the command)
> 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
> QMP connection is closed), so qemu marks fd=4 as no longer in use by the
> monitor, refcount is now decremented to 0 and fd=4 is closed
>
> I think that covers the idea; you need a bool in_use for tracking
> monitor state (the monitor is in use until either a remove-fd or a
> monitor connection closes), as well as a ref-count.
>

Yes, it all makes sense to me.  Thanks for the scenarios.
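
For reference, the refcount plus in_use scheme from the scenarios above
can be modelled in a few lines of Python.  This is a toy sketch under my
own naming (TrackedFd, drop_monitor_ref and the closed attribute are
hypothetical), not the eventual QEMU implementation, and it only records
that a close would happen rather than touching real descriptors.

```python
class TrackedFd:
    """Toy model of one fd in an fdset (hypothetical, not QEMU's struct)."""

    def __init__(self, fd):
        self.fd = fd
        self.refcount = 1     # add-fd: the monitor holds the first reference
        self.in_use = True    # monitor reference not yet dropped
        self.closed = False

    def _unref(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.closed = True    # stand-in for the real close(fd)

    def qemu_open(self):
        self.refcount += 1

    def qemu_close(self):
        self._unref()

    def drop_monitor_ref(self):
        """remove-fd or QMP disconnect; in_use guards against a double drop."""
        if self.in_use:
            self.in_use = False
            self._unref()


# Scenario 2: add-fd, device-add, then the client crashes.
s2 = TrackedFd(4)
s2.qemu_open()          # refcount 2
s2.drop_monitor_ref()   # crash: refcount 1, fd stays open

# Scenario 3: add-fd, device-add, remove-fd, then the client crashes.
s3 = TrackedFd(4)
s3.qemu_open()
s3.drop_monitor_ref()   # remove-fd: refcount 1
s3.drop_monitor_ref()   # crash: no-op, monitor reference already dropped

# Scenario 4: add-fd, device-add fails, remove-fd.
s4 = TrackedFd(4)
s4.qemu_open()
s4.qemu_close()         # failed command undoes its temporary reference
s4.drop_monitor_ref()   # refcount 0, fd closed
```

With the monitor's reference counted like any other, refcount reaching
zero is the only close condition; in_use exists purely to stop the
monitor's reference from being dropped twice.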

>>> We also need a query-fdsets command that lists all fdsets that exist. If
>>> we add information about single fds to the return value of it, we
>>> probably don't need a separate query-fd that operates on a single fdset.
>>>
>>
>> Yes, good point.  And maybe we don't need 2 commands.  query-fdsets
>> could return all the sets and all the fds that are in those sets.
>
> Yes, I think a single query command is good enough here, something like:
>
> { "execute":"query-fdsets" } =>
> { "return" : { "sets": [
>     { "name": "fdset1",
>       "fds": [ { "fd": 4, "monitor": true, "refcount": 1 } ] },
>     { "name": "fdset2",
>       "fds": [ { "fd": 5, "monitor": false, "refcount": 1 },
>                { "fd": 6, "monitor": true, "refcount": 2 } ] } ] } }
>
>

Ok, thanks!
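
As a sanity check on that reply shape, a small Python sketch that builds
the same structure from per-set state.  The query_fdsets function and
the (fd, in_use, refcount) tuples are hypothetical stand-ins; the field
names "sets", "name", "fds", "fd", "monitor" and "refcount" are taken
from the example above.

```python
import json


def query_fdsets(fdsets):
    """Build a query-fdsets style reply from {name: [(fd, in_use, refcount)]}."""
    sets = []
    for name, fds in fdsets.items():
        sets.append({
            "name": name,
            "fds": [
                {"fd": fd, "monitor": in_use, "refcount": refcount}
                for fd, in_use, refcount in fds
            ],
        })
    return {"return": {"sets": sets}}


# Same state as in the example reply above.
state = {
    "fdset1": [(4, True, 1)],
    "fdset2": [(5, False, 1), (6, True, 2)],
}
reply = query_fdsets(state)
print(json.dumps(reply, indent=2))
```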

>>> In use by whom? If it's still in use in qemu (as in "in-use flag would
>>> be set") and we have a refcount of zero, then that's a bug.
>>>
>>
>> In use by qemu.  I don't think it's a bug.  I think there are situations
>> where refcount gets to zero but qemu is still using the fd.
>
> I think the refcount being non-zero _is_ what defines an fd as being in
> use by qemu (including use by the monitor).  Any place you have to close
> an fd before reopening it is dangerous; the safe way is always to open
> with the new permissions before closing the old permissions.
>

Maybe Kevin wants to weigh in on this.  Perhaps it's an issue that can 
be separated from my patch series.
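
The open-before-close ordering Eric describes could look roughly like
the Python sketch below.  It reopens by path purely for illustration;
with fdsets the new descriptor would presumably come from the set
instead, and the reopen helper name is my own.

```python
import os
import tempfile


def reopen(path, old_fd, new_flags):
    """Open path with new_flags first; close old_fd only after that succeeds.

    If os.open raises OSError, old_fd is left open and still usable.
    """
    new_fd = os.open(path, new_flags)
    os.close(old_fd)
    return new_fd


# Demo on a scratch file: switch a read-only descriptor to read-write.
scratch_fd, path = tempfile.mkstemp()
os.close(scratch_fd)
ro_fd = os.open(path, os.O_RDONLY)
rw_fd = reopen(path, ro_fd, os.O_RDWR)
written = os.write(rw_fd, b"ok")
os.close(rw_fd)
os.unlink(path)
```

At no point is the file left without an open descriptor, which is the
property that matters when a failed reopen must not lose the image.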

-- 
Regards,
Corey

