* [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd @ 2012-06-22 18:36 Corey Bryant 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant ` (7 more replies) 0 siblings, 8 replies; 56+ messages in thread From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw) To: qemu-devel Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake libvirt's sVirt security driver provides SELinux MAC isolation for Qemu guest processes and their corresponding image files. In other words, sVirt uses SELinux to prevent a QEMU process from opening files that do not belong to it. sVirt provides this support by labeling guests and resources with security labels that are stored in file system extended attributes. Some file systems, such as NFS, do not support the extended attribute security namespace, and therefore cannot support sVirt isolation. A solution to this problem is to provide fd passing support, where libvirt opens files and passes file descriptors to QEMU. This, along with SELinux policy to prevent QEMU from opening files, can provide image file isolation for NFS files stored on the same NFS mount. This patch series adds the pass-fd QMP monitor command, which allows an fd to be passed via SCM_RIGHTS, and returns the received file descriptor. Support is also added to the block layer to allow QEMU to dup the fd when the filename is of the /dev/fd/X format. This is useful if MAC policy prevents QEMU from opening specific types of files. One nice thing about this approach is that no new SELinux policy is required to prevent open of NFS files (files with type nfs_t). The virt_use_nfs boolean type simply needs to be set to false, and open will be prevented (and dup will be allowed). For example: # setsebool virt_use_nfs 0 # getsebool virt_use_nfs virt_use_nfs --> off Corey Bryant (7): qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg qapi: Convert getfd and closefd qapi: Add pass-fd QMP command qapi: Re-arrange monitor.c functions block: Prevent /dev/fd/X filename from being detected as floppy block: Convert open calls to qemu_open osdep: Enable qemu_open to dup pre-opened fd block/raw-posix.c | 22 +++++----- block/raw-win32.c | 4 +- block/vdi.c | 5 ++- block/vmdk.c | 21 ++++------ block/vpc.c | 2 +- block/vvfat.c | 21 +++++----- cutils.c | 26 +++++++++--- dump.c | 3 +- hmp-commands.hx | 6 +-- hmp.c | 18 ++++++++ hmp.h | 2 + main-loop.c | 6 +-- migration-fd.c | 2 +- monitor.c | 120 ++++++++++++++++++++++++++++++++--------------------- monitor.h | 2 +- net.c | 6 ++- osdep.c | 91 ++++++++++++++++++++++++++++++++++++++++ qapi-schema.json | 71 +++++++++++++++++++++++++++++++ qemu-char.c | 2 +- qemu-common.h | 2 +- qmp-commands.hx | 56 ++++++++++++++++++++++--- 21 files changed, 378 insertions(+), 110 deletions(-) -- 1.7.10.2 ^ permalink raw reply [flat|nested] 56+ messages in thread
* [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg 2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant @ 2012-06-22 18:36 ` Corey Bryant 2012-06-22 19:31 ` Eric Blake 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd Corey Bryant ` (6 subsequent siblings) 7 siblings, 1 reply; 56+ messages in thread From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw) To: qemu-devel Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake This sets the close-on-exec flag for the file descriptor received via SCM_RIGHTS. Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> --- v4 -This patch is new in v4 (eblake@redhat.com) qemu-char.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/qemu-char.c b/qemu-char.c index c2aaaee..f890113 100644 --- a/qemu-char.c +++ b/qemu-char.c @@ -2263,7 +2263,7 @@ static ssize_t tcp_chr_recv(CharDriverState *chr, char *buf, size_t len) msg.msg_control = &msg_control; msg.msg_controllen = sizeof(msg_control); - ret = recvmsg(s->fd, &msg, 0); + ret = recvmsg(s->fd, &msg, MSG_CMSG_CLOEXEC); if (ret > 0 && s->is_unix) unix_process_msgfd(chr, &msg); -- 1.7.10.2 ^ permalink raw reply related [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant @ 2012-06-22 19:31 ` Eric Blake 0 siblings, 0 replies; 56+ messages in thread From: Eric Blake @ 2012-06-22 19:31 UTC (permalink / raw) To: Corey Bryant Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, lcapitulino, pbonzini [-- Attachment #1: Type: text/plain, Size: 1262 bytes --] On 06/22/2012 12:36 PM, Corey Bryant wrote: > This sets the close-on-exec flag for the file descriptor received > via SCM_RIGHTS. > > Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> > --- > v4 > -This patch is new in v4 (eblake@redhat.com) > > qemu-char.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/qemu-char.c b/qemu-char.c > index c2aaaee..f890113 100644 > --- a/qemu-char.c > +++ b/qemu-char.c > @@ -2263,7 +2263,7 @@ static ssize_t tcp_chr_recv(CharDriverState *chr, char *buf, size_t len) > msg.msg_control = &msg_control; > msg.msg_controllen = sizeof(msg_control); > > - ret = recvmsg(s->fd, &msg, 0); > + ret = recvmsg(s->fd, &msg, MSG_CMSG_CLOEXEC); MSG_CMSG_CLOEXEC is not (yet) in POSIX (although it has been proposed for addition); therefore, at the moment, it only exists on Linux and Cygwin. Does this need to have conditional code to allow compilation on BSD, such as: #ifndef MSG_CMSG_CLOEXEC # define MSG_CMSG_CLOEXEC 0 #endif as well as fallback code that sets FD_CLOEXEC manually via fcntl() when MSG_CMSG_CLOEXEC is missing? -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 620 bytes --] ^ permalink raw reply [flat|nested] 56+ messages in thread
* [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd 2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant @ 2012-06-22 18:36 ` Corey Bryant 2012-07-11 18:51 ` Luiz Capitulino 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command Corey Bryant ` (5 subsequent siblings) 7 siblings, 1 reply; 56+ messages in thread From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw) To: qemu-devel Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> --- v2: -Convert getfd and closefd to QAPI (lcapitulino@redhat.com) -Remove changes that returned fd from getfd (lcapitulino@redhat.com) -Wrap hmp_* functions around qmp_* functions (kwolf@redhat.com) -Move hmp_* functions to hmp.c (lcapitulino@redhat.com) -Drop .user_print lines (lcapitulino@redhat.com) -Use 'cmd' instead of 'cmd_new' for HMP (lcapitulino@redhat.com) -Change QMP command existance back to 0.14 (lcapitulino@redhat.com) v3: -Remove unnecessary return statements from qmp_* functions (lcapitulino@redhat.com) -Add notes to QMP command describing behavior in more detail (lcapitulino@redhat.com, eblake@redhat.com) v4: -No changes hmp-commands.hx | 6 ++---- hmp.c | 18 ++++++++++++++++++ hmp.h | 2 ++ monitor.c | 32 ++++++++++++++------------------ qapi-schema.json | 35 +++++++++++++++++++++++++++++++++++ qmp-commands.hx | 14 ++++++++++---- 6 files changed, 81 insertions(+), 26 deletions(-) diff --git a/hmp-commands.hx b/hmp-commands.hx index f5d9d91..eea8b32 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1236,8 +1236,7 @@ ETEXI .args_type = "fdname:s", .params = "getfd name", .help = "receive a file descriptor via SCM rights and assign it a name", - .user_print = monitor_user_noop, - .mhandler.cmd_new = do_getfd, + .mhandler.cmd = hmp_getfd, }, STEXI @@ -1253,8 +1252,7 @@ ETEXI .args_type = "fdname:s", .params = "closefd name", .help = "close a file descriptor previously passed via SCM rights", - .user_print = monitor_user_noop, - .mhandler.cmd_new = do_closefd, + .mhandler.cmd = hmp_closefd, }, STEXI diff --git a/hmp.c b/hmp.c index 2ce8cb9..6241856 100644 --- a/hmp.c +++ b/hmp.c @@ -999,3 +999,21 @@ void hmp_netdev_del(Monitor *mon, const QDict *qdict) qmp_netdev_del(id, &err); hmp_handle_error(mon, &err); } + +void hmp_getfd(Monitor *mon, const QDict *qdict) +{ + const char *fdname = qdict_get_str(qdict, "fdname"); + Error *errp = NULL; + + qmp_getfd(fdname, &errp); + hmp_handle_error(mon, &errp); +} + +void hmp_closefd(Monitor *mon, const QDict *qdict) +{ + const char *fdname = qdict_get_str(qdict, "fdname"); + Error *errp = NULL; + + qmp_closefd(fdname, &errp); + hmp_handle_error(mon, &errp); +} diff --git a/hmp.h b/hmp.h index 79d138d..8d2b0d7 100644 --- a/hmp.h +++ b/hmp.h @@ -64,5 +64,7 @@ void hmp_device_del(Monitor *mon, const QDict *qdict); void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict); void hmp_netdev_add(Monitor *mon, const QDict *qdict); void hmp_netdev_del(Monitor *mon, const QDict *qdict); +void hmp_getfd(Monitor *mon, const QDict *qdict); +void hmp_closefd(Monitor *mon, const QDict *qdict); #endif diff --git a/monitor.c b/monitor.c index a3bc2c7..1a7f7e7 100644 --- a/monitor.c +++ b/monitor.c @@ -2182,48 +2182,45 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict) } #endif -static int do_getfd(Monitor *mon, const QDict *qdict, QObject **ret_data) +void qmp_getfd(const char *fdname, Error **errp) { - const char *fdname = qdict_get_str(qdict, "fdname"); mon_fd_t *monfd; int fd; - fd = qemu_chr_fe_get_msgfd(mon->chr); + fd = qemu_chr_fe_get_msgfd(cur_mon->chr); if (fd == -1) { - qerror_report(QERR_FD_NOT_SUPPLIED); - return -1; + error_set(errp, QERR_FD_NOT_SUPPLIED); + return; } if (qemu_isdigit(fdname[0])) { - qerror_report(QERR_INVALID_PARAMETER_VALUE, "fdname", - "a name not starting with a digit"); - return -1; + error_set(errp, QERR_INVALID_PARAMETER_VALUE, "fdname", + "a name not starting with a digit"); + return; } - QLIST_FOREACH(monfd, &mon->fds, next) { + QLIST_FOREACH(monfd, &cur_mon->fds, next) { if (strcmp(monfd->name, fdname) != 0) { continue; } close(monfd->fd); monfd->fd = fd; - return 0; + return; } monfd = g_malloc0(sizeof(mon_fd_t)); monfd->name = g_strdup(fdname); monfd->fd = fd; - QLIST_INSERT_HEAD(&mon->fds, monfd, next); - return 0; + QLIST_INSERT_HEAD(&cur_mon->fds, monfd, next); } -static int do_closefd(Monitor *mon, const QDict *qdict, QObject **ret_data) +void qmp_closefd(const char *fdname, Error **errp) { - const char *fdname = qdict_get_str(qdict, "fdname"); mon_fd_t *monfd; - QLIST_FOREACH(monfd, &mon->fds, next) { + QLIST_FOREACH(monfd, &cur_mon->fds, next) { if (strcmp(monfd->name, fdname) != 0) { continue; } @@ -2232,11 +2229,10 @@ static int do_closefd(Monitor *mon, const QDict *qdict, QObject **ret_data) close(monfd->fd); g_free(monfd->name); g_free(monfd); - return 0; + return; } - qerror_report(QERR_FD_NOT_FOUND, fdname); - return -1; + error_set(errp, QERR_FD_NOT_FOUND, fdname); } static void do_loadvm(Monitor *mon, const QDict *qdict) diff --git a/qapi-schema.json b/qapi-schema.json index 3b6e346..26a6b84 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -1862,3 +1862,38 @@ # Since: 0.14.0 ## { 'command': 'netdev_del', 'data': {'id': 'str'} } + +## +# @getfd: +# +# Receive a file descriptor via SCM rights and assign it a name +# +# @fdname: file descriptor name +# +# Returns: Nothing on success +# If file descriptor was not received, FdNotSupplied +# If @fdname is not valid, InvalidParameterType +# +# Since: 0.14.0 +# +# Notes: If @fdname already exists, the file descriptor assigned to +# it will be closed and replaced by the received file +# descriptor. +# The 'closefd' command can be used to explicitly close the +# file descriptor when it is no longer needed. +## +{ 'command': 'getfd', 'data': {'fdname': 'str'} } + +## +# @closefd: +# +# Close a file descriptor previously passed via SCM rights +# +# @fdname: file descriptor name +# +# Returns: Nothing on success +# If @fdname is not found, FdNotFound +# +# Since: 0.14.0 +## +{ 'command': 'closefd', 'data': {'fdname': 'str'} } diff --git a/qmp-commands.hx b/qmp-commands.hx index 2e1a38e..e3cf3c5 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -873,8 +873,7 @@ EQMP .args_type = "fdname:s", .params = "getfd name", .help = "receive a file descriptor via SCM rights and assign it a name", - .user_print = monitor_user_noop, - .mhandler.cmd_new = do_getfd, + .mhandler.cmd_new = qmp_marshal_input_getfd, }, SQMP @@ -892,6 +891,14 @@ Example: -> { "execute": "getfd", "arguments": { "fdname": "fd1" } } <- { "return": {} } +Notes: + +(1) If the name specified by the "fdname" argument already exists, + the file descriptor assigned to it will be closed and replaced + by the received file descriptor. +(2) The 'closefd' command can be used to explicitly close the file + descriptor when it is no longer needed. + EQMP { @@ -899,8 +906,7 @@ EQMP .args_type = "fdname:s", .params = "closefd name", .help = "close a file descriptor previously passed via SCM rights", - .user_print = monitor_user_noop, - .mhandler.cmd_new = do_closefd, + .mhandler.cmd_new = qmp_marshal_input_closefd, }, SQMP -- 1.7.10.2 ^ permalink raw reply related [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd Corey Bryant @ 2012-07-11 18:51 ` Luiz Capitulino 0 siblings, 0 replies; 56+ messages in thread From: Luiz Capitulino @ 2012-07-11 18:51 UTC (permalink / raw) To: Corey Bryant Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, pbonzini, eblake On Fri, 22 Jun 2012 14:36:09 -0400 Corey Bryant <coreyb@linux.vnet.ibm.com> wrote: > Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> I've cherry-picked this one into the qmp tree. > --- > v2: > -Convert getfd and closefd to QAPI (lcapitulino@redhat.com) > -Remove changes that returned fd from getfd (lcapitulino@redhat.com) > -Wrap hmp_* functions around qmp_* functions (kwolf@redhat.com) > -Move hmp_* functions to hmp.c (lcapitulino@redhat.com) > -Drop .user_print lines (lcapitulino@redhat.com) > -Use 'cmd' instead of 'cmd_new' for HMP (lcapitulino@redhat.com) > -Change QMP command existance back to 0.14 (lcapitulino@redhat.com) > > v3: > -Remove unnecessary return statements from qmp_* functions > (lcapitulino@redhat.com) > -Add notes to QMP command describing behavior in more detail > (lcapitulino@redhat.com, eblake@redhat.com) > > v4: > -No changes > > hmp-commands.hx | 6 ++---- > hmp.c | 18 ++++++++++++++++++ > hmp.h | 2 ++ > monitor.c | 32 ++++++++++++++------------------ > qapi-schema.json | 35 +++++++++++++++++++++++++++++++++++ > qmp-commands.hx | 14 ++++++++++---- > 6 files changed, 81 insertions(+), 26 deletions(-) > > diff --git a/hmp-commands.hx b/hmp-commands.hx > index f5d9d91..eea8b32 100644 > --- a/hmp-commands.hx > +++ b/hmp-commands.hx > @@ -1236,8 +1236,7 @@ ETEXI > .args_type = "fdname:s", > .params = "getfd name", > .help = "receive a file descriptor via SCM rights and assign it a name", > - .user_print = monitor_user_noop, > - .mhandler.cmd_new = do_getfd, > + .mhandler.cmd = hmp_getfd, > }, > > STEXI > @@ -1253,8 +1252,7 @@ ETEXI > .args_type = "fdname:s", > .params = "closefd name", > .help = "close a file descriptor previously passed via SCM rights", > - .user_print = monitor_user_noop, > - .mhandler.cmd_new = do_closefd, > + .mhandler.cmd = hmp_closefd, > }, > > STEXI > diff --git a/hmp.c b/hmp.c > index 2ce8cb9..6241856 100644 > --- a/hmp.c > +++ b/hmp.c > @@ -999,3 +999,21 @@ void hmp_netdev_del(Monitor *mon, const QDict *qdict) > qmp_netdev_del(id, &err); > hmp_handle_error(mon, &err); > } > + > +void hmp_getfd(Monitor *mon, const QDict *qdict) > +{ > + const char *fdname = qdict_get_str(qdict, "fdname"); > + Error *errp = NULL; > + > + qmp_getfd(fdname, &errp); > + hmp_handle_error(mon, &errp); > +} > + > +void hmp_closefd(Monitor *mon, const QDict *qdict) > +{ > + const char *fdname = qdict_get_str(qdict, "fdname"); > + Error *errp = NULL; > + > + qmp_closefd(fdname, &errp); > + hmp_handle_error(mon, &errp); > +} > diff --git a/hmp.h b/hmp.h > index 79d138d..8d2b0d7 100644 > --- a/hmp.h > +++ b/hmp.h > @@ -64,5 +64,7 @@ void hmp_device_del(Monitor *mon, const QDict *qdict); > void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict); > void hmp_netdev_add(Monitor *mon, const QDict *qdict); > void hmp_netdev_del(Monitor *mon, const QDict *qdict); > +void hmp_getfd(Monitor *mon, const QDict *qdict); > +void hmp_closefd(Monitor *mon, const QDict *qdict); > > #endif > diff --git a/monitor.c b/monitor.c > index a3bc2c7..1a7f7e7 100644 > --- a/monitor.c > +++ b/monitor.c > @@ -2182,48 +2182,45 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict) > } > #endif > > -static int do_getfd(Monitor *mon, const QDict *qdict, QObject **ret_data) > +void qmp_getfd(const char *fdname, Error **errp) > { > - const char *fdname = qdict_get_str(qdict, "fdname"); > mon_fd_t *monfd; > int fd; > > - fd = qemu_chr_fe_get_msgfd(mon->chr); > + fd = qemu_chr_fe_get_msgfd(cur_mon->chr); > if (fd == -1) { > - qerror_report(QERR_FD_NOT_SUPPLIED); > - return -1; > + error_set(errp, QERR_FD_NOT_SUPPLIED); > + return; > } > > if (qemu_isdigit(fdname[0])) { > - qerror_report(QERR_INVALID_PARAMETER_VALUE, "fdname", > - "a name not starting with a digit"); > - return -1; > + error_set(errp, QERR_INVALID_PARAMETER_VALUE, "fdname", > + "a name not starting with a digit"); > + return; > } > > - QLIST_FOREACH(monfd, &mon->fds, next) { > + QLIST_FOREACH(monfd, &cur_mon->fds, next) { > if (strcmp(monfd->name, fdname) != 0) { > continue; > } > > close(monfd->fd); > monfd->fd = fd; > - return 0; > + return; > } > > monfd = g_malloc0(sizeof(mon_fd_t)); > monfd->name = g_strdup(fdname); > monfd->fd = fd; > > - QLIST_INSERT_HEAD(&mon->fds, monfd, next); > - return 0; > + QLIST_INSERT_HEAD(&cur_mon->fds, monfd, next); > } > > -static int do_closefd(Monitor *mon, const QDict *qdict, QObject **ret_data) > +void qmp_closefd(const char *fdname, Error **errp) > { > - const char *fdname = qdict_get_str(qdict, "fdname"); > mon_fd_t *monfd; > > - QLIST_FOREACH(monfd, &mon->fds, next) { > + QLIST_FOREACH(monfd, &cur_mon->fds, next) { > if (strcmp(monfd->name, fdname) != 0) { > continue; > } > @@ -2232,11 +2229,10 @@ static int do_closefd(Monitor *mon, const QDict *qdict, QObject **ret_data) > close(monfd->fd); > g_free(monfd->name); > g_free(monfd); > - return 0; > + return; > } > > - qerror_report(QERR_FD_NOT_FOUND, fdname); > - return -1; > + error_set(errp, QERR_FD_NOT_FOUND, fdname); > } > > static void do_loadvm(Monitor *mon, const QDict *qdict) > diff --git a/qapi-schema.json b/qapi-schema.json > index 3b6e346..26a6b84 100644 > --- a/qapi-schema.json > +++ b/qapi-schema.json > @@ -1862,3 +1862,38 @@ > # Since: 0.14.0 > ## > { 'command': 'netdev_del', 'data': {'id': 'str'} } > + > +## > +# @getfd: > +# > +# Receive a file descriptor via SCM rights and assign it a name > +# > +# @fdname: file descriptor name > +# > +# Returns: Nothing on success > +# If file descriptor was not received, FdNotSupplied > +# If @fdname is not valid, InvalidParameterType > +# > +# Since: 0.14.0 > +# > +# Notes: If @fdname already exists, the file descriptor assigned to > +# it will be closed and replaced by the received file > +# descriptor. > +# The 'closefd' command can be used to explicitly close the > +# file descriptor when it is no longer needed. > +## > +{ 'command': 'getfd', 'data': {'fdname': 'str'} } > + > +## > +# @closefd: > +# > +# Close a file descriptor previously passed via SCM rights > +# > +# @fdname: file descriptor name > +# > +# Returns: Nothing on success > +# If @fdname is not found, FdNotFound > +# > +# Since: 0.14.0 > +## > +{ 'command': 'closefd', 'data': {'fdname': 'str'} } > diff --git a/qmp-commands.hx b/qmp-commands.hx > index 2e1a38e..e3cf3c5 100644 > --- a/qmp-commands.hx > +++ b/qmp-commands.hx > @@ -873,8 +873,7 @@ EQMP > .args_type = "fdname:s", > .params = "getfd name", > .help = "receive a file descriptor via SCM rights and assign it a name", > - .user_print = monitor_user_noop, > - .mhandler.cmd_new = do_getfd, > + .mhandler.cmd_new = qmp_marshal_input_getfd, > }, > > SQMP > @@ -892,6 +891,14 @@ Example: > -> { "execute": "getfd", "arguments": { "fdname": "fd1" } } > <- { "return": {} } > > +Notes: > + > +(1) If the name specified by the "fdname" argument already exists, > + the file descriptor assigned to it will be closed and replaced > + by the received file descriptor. > +(2) The 'closefd' command can be used to explicitly close the file > + descriptor when it is no longer needed. > + > EQMP > > { > @@ -899,8 +906,7 @@ EQMP > .args_type = "fdname:s", > .params = "closefd name", > .help = "close a file descriptor previously passed via SCM rights", > - .user_print = monitor_user_noop, > - .mhandler.cmd_new = do_closefd, > + .mhandler.cmd_new = qmp_marshal_input_closefd, > }, > > SQMP ^ permalink raw reply [flat|nested] 56+ messages in thread
* [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command 2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd Corey Bryant @ 2012-06-22 18:36 ` Corey Bryant 2012-06-22 20:24 ` Eric Blake 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 4/7] qapi: Re-arrange monitor.c functions Corey Bryant ` (4 subsequent siblings) 7 siblings, 1 reply; 56+ messages in thread From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw) To: qemu-devel Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake This patch adds the pass-fd QMP command using the QAPI framework. Like the getfd command, it is used to pass a file descriptor via SCM_RIGHTS and associate it with a name. However, the pass-fd command also returns the received file descriptor, which is a difference in behavior from the getfd command, which returns nothing. pass-fd also supports a boolean "force" argument that can be used to specify that an existing file descriptor is associated with the specified name, and that it should become a copy of the newly passed file descriptor. The closefd command can be used to close a file descriptor that was passed with the pass-fd command. Note that when using getfd or pass-fd, there are some commands (e.g. migrate with fd:name) that implicitly close the named fd. When this is not the case, closefd must be used to avoid fd leaks. Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> --- v2: -Introduce new QMP command to pass/return fd (lcapitulino@redhat.com) -Use passfd as command name (berrange@redhat.com) v3: -Use pass-fd as command name (lcapitulino@redhat.com) -Fail pass-fd if fdname already exists (lcapitulino@redhat.com) -Add notes to QMP command describing behavior in more detail (lcapitulino@redhat.com, eblake@redhat.com) -Add note about fd leakage (eblake@redhat.com) v4: -Don't return same error class twice (lcapitulino@redhat.com) -Share code for qmp_gefd and qmp_pass_fd (lcapitulino@redhat.com) -Update qmp_closefd to call monitor_get_fd -Add support for "force" argument to pass-fd (eblake@redhat.com) dump.c | 3 +- migration-fd.c | 2 +- monitor.c | 96 +++++++++++++++++++++++++++++++++++------------------- monitor.h | 2 +- net.c | 6 ++-- qapi-schema.json | 36 ++++++++++++++++++++ qmp-commands.hx | 42 +++++++++++++++++++++++- 7 files changed, 146 insertions(+), 41 deletions(-) diff --git a/dump.c b/dump.c index 4412d7a..2fd8ced 100644 --- a/dump.c +++ b/dump.c @@ -836,9 +836,8 @@ void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin, #if !defined(WIN32) if (strstart(file, "fd:", &p)) { - fd = monitor_get_fd(cur_mon, p); + fd = monitor_get_fd(cur_mon, p, errp); if (fd == -1) { - error_set(errp, QERR_FD_NOT_FOUND, p); return; } } diff --git a/migration-fd.c b/migration-fd.c index 50138ed..7335167 100644 --- a/migration-fd.c +++ b/migration-fd.c @@ -75,7 +75,7 @@ static int fd_close(MigrationState *s) int fd_start_outgoing_migration(MigrationState *s, const char *fdname) { - s->fd = monitor_get_fd(cur_mon, fdname); + s->fd = monitor_get_fd(cur_mon, fdname, NULL); if (s->fd == -1) { DPRINTF("fd_migration: invalid file descriptor identifier\n"); goto err_after_get_fd; diff --git a/monitor.c b/monitor.c index 1a7f7e7..3433c06 100644 --- a/monitor.c +++ b/monitor.c @@ -814,7 +814,7 @@ static int add_graphics_client(Monitor *mon, const QDict *qdict, QObject **ret_d CharDriverState *s; if (strcmp(protocol, "spice") == 0) { - int fd = monitor_get_fd(mon, fdname); + int fd = monitor_get_fd(mon, fdname, NULL); int skipauth = qdict_get_try_bool(qdict, "skipauth", 0); int tls = qdict_get_try_bool(qdict, "tls", 0); if (!using_spice) { @@ -828,18 +828,18 @@ static int add_graphics_client(Monitor *mon, const QDict *qdict, QObject **ret_d return 0; #ifdef CONFIG_VNC } else if (strcmp(protocol, "vnc") == 0) { - int fd = monitor_get_fd(mon, fdname); + int fd = monitor_get_fd(mon, fdname, NULL); int skipauth = qdict_get_try_bool(qdict, "skipauth", 0); - vnc_display_add_client(NULL, fd, skipauth); - return 0; + vnc_display_add_client(NULL, fd, skipauth); + return 0; #endif } else if ((s = qemu_chr_find(protocol)) != NULL) { - int fd = monitor_get_fd(mon, fdname); - if (qemu_chr_add_client(s, fd) < 0) { - qerror_report(QERR_ADD_CLIENT_FAILED); - return -1; - } - return 0; + int fd = monitor_get_fd(mon, fdname, NULL); + if (qemu_chr_add_client(s, fd) < 0) { + qerror_report(QERR_ADD_CLIENT_FAILED); + return -1; + } + return 0; } qerror_report(QERR_INVALID_PARAMETER, "protocol"); @@ -2182,57 +2182,69 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict) } #endif -void qmp_getfd(const char *fdname, Error **errp) +static int monitor_put_fd(Monitor *mon, const char *fdname, bool replace, + bool copy, Error **errp) { mon_fd_t *monfd; int fd; - fd = qemu_chr_fe_get_msgfd(cur_mon->chr); + fd = qemu_chr_fe_get_msgfd(mon->chr); if (fd == -1) { error_set(errp, QERR_FD_NOT_SUPPLIED); - return; + return -1; } if (qemu_isdigit(fdname[0])) { error_set(errp, QERR_INVALID_PARAMETER_VALUE, "fdname", "a name not starting with a digit"); - return; + return -1; } - QLIST_FOREACH(monfd, &cur_mon->fds, next) { + QLIST_FOREACH(monfd, &mon->fds, next) { if (strcmp(monfd->name, fdname) != 0) { continue; } - close(monfd->fd); - monfd->fd = fd; - return; + if (replace) { + /* the existing fd is replaced with the new fd */ + close(monfd->fd); + monfd->fd = fd; + return fd; + } + + if (copy) { + /* the existing fd becomes a copy of the new fd */ + if (dup2(fd, monfd->fd) == -1) { + error_set(errp, QERR_UNDEFINED_ERROR); + return -1; + } + close(fd); + return monfd->fd; + } + + error_set(errp, QERR_INVALID_PARAMETER, "fdname"); + return -1; + } + + if (copy) { + error_set(errp, QERR_INVALID_PARAMETER, "fdname"); + return -1; } monfd = g_malloc0(sizeof(mon_fd_t)); monfd->name = g_strdup(fdname); monfd->fd = fd; - QLIST_INSERT_HEAD(&cur_mon->fds, monfd, next); + QLIST_INSERT_HEAD(&mon->fds, monfd, next); + return fd; } void qmp_closefd(const char *fdname, Error **errp) { - mon_fd_t *monfd; - - QLIST_FOREACH(monfd, &cur_mon->fds, next) { - if (strcmp(monfd->name, fdname) != 0) { - continue; - } - - QLIST_REMOVE(monfd, next); - close(monfd->fd); - g_free(monfd->name); - g_free(monfd); - return; + int fd = monitor_get_fd(cur_mon, fdname, errp); + if (fd != -1) { + close(fd); } - - error_set(errp, QERR_FD_NOT_FOUND, fdname); } static void do_loadvm(Monitor *mon, const QDict *qdict) @@ -2247,7 +2259,7 @@ static void do_loadvm(Monitor *mon, const QDict *qdict) } } -int monitor_get_fd(Monitor *mon, const char *fdname) +int monitor_get_fd(Monitor *mon, const char *fdname, Error **errp) { mon_fd_t *monfd; @@ -2268,9 +2280,25 @@ int monitor_get_fd(Monitor *mon, const char *fdname) return fd; } + error_set(errp, QERR_FD_NOT_FOUND, fdname); return -1; } +int64_t qmp_pass_fd(const char *fdname, bool has_force, bool force, + Error **errp) +{ + bool replace = false; + bool copy = has_force ? force : false; + return monitor_put_fd(cur_mon, fdname, replace, copy, errp); +} + +void qmp_getfd(const char *fdname, Error **errp) +{ + bool replace = true; + bool copy = false; + monitor_put_fd(cur_mon, fdname, replace, copy, errp); +} + /* mon_cmds and info_cmds would be sorted at runtime */ static mon_cmd_t mon_cmds[] = { #include "hmp-commands.h" diff --git a/monitor.h b/monitor.h index cd1d878..8fa515d 100644 --- a/monitor.h +++ b/monitor.h @@ -63,7 +63,7 @@ int monitor_read_block_device_key(Monitor *mon, const char *device, BlockDriverCompletionFunc *completion_cb, void *opaque); -int monitor_get_fd(Monitor *mon, const char *fdname); +int monitor_get_fd(Monitor *mon, const char *fdname, Error **errp); void monitor_vprintf(Monitor *mon, const char *fmt, va_list ap) GCC_FMT_ATTR(2, 0); diff --git a/net.c b/net.c index 4aa416c..28860b8 100644 --- a/net.c +++ b/net.c @@ -730,12 +730,14 @@ int qemu_find_nic_model(NICInfo *nd, const char * const *models, int net_handle_fd_param(Monitor *mon, const char *param) { int fd; + Error *local_err = NULL; if (!qemu_isdigit(param[0]) && mon) { - fd = monitor_get_fd(mon, param); + fd = monitor_get_fd(mon, param, &local_err); if (fd == -1) { - error_report("No file descriptor named %s found", param); + qerror_report_err(local_err); + error_free(local_err); return -1; } } else { diff --git a/qapi-schema.json b/qapi-schema.json index 26a6b84..2ac1261 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -1864,6 +1864,41 @@ { 'command': 'netdev_del', 'data': {'id': 'str'} } ## +# @pass-fd: +# +# Pass a file descriptor via SCM rights and assign it a name +# +# @fdname: file descriptor name +# +# @force: #optional true specifies that an existing file descriptor is +# associated with @fdname, and that it should become a copy of the +# newly passed file descriptor. +# +# Returns: The QEMU file descriptor that is assigned to @fdname +# If file descriptor was not received, FdNotSupplied +# If @fdname is not valid, InvalidParameterValue +# If @fdname already exists, and @force is not true, InvalidParameter +# If @fdname does not already exist, and @force is true, +# InvalidParameter +# If @force fails to copy the passed file descriptor, +# UndefinedError +# +# Since: 1.2.0 +# +# Notes: If @fdname already exists, and @force is not true, the +# command will fail. +# +# If @fdname already exists, and @force is true, the value of +# the existing file descriptor is returned when the command is +# successful. +# +# The 'closefd' command can be used to explicitly close the +# file descriptor when it is no longer needed. +## +{ 'command': 'pass-fd', 'data': {'fdname': 'str', '*force': 'bool'}, + 'returns': 'int' } + +## # @getfd: # # Receive a file descriptor via SCM rights and assign it a name @@ -1879,6 +1914,7 @@ # Notes: If @fdname already exists, the file descriptor assigned to # it will be closed and replaced by the received file # descriptor. +# # The 'closefd' command can be used to explicitly close the # file descriptor when it is no longer needed. ## diff --git a/qmp-commands.hx b/qmp-commands.hx index e3cf3c5..7e3c07e 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -869,9 +869,49 @@ Example: EQMP { + .name = "pass-fd", + .args_type = "fdname:s,force:b?", + .params = "pass-fd fdname force", + .help = "pass a file descriptor via SCM rights and assign it a name", + .mhandler.cmd_new = qmp_marshal_input_pass_fd, + }, + +SQMP +pass-fd +------- + +Pass a file descriptor via SCM rights and assign it a name. + +Arguments: + +- "fdname": file descriptor name (json-string) +- "force": true specifies that an existing file descriptor is associated + with "fdname", and that it should become a copy of the newly + passed file descriptor. (json-bool, optional) + +Return a json-int with the QEMU file descriptor that is assigned to "fdname". + +Example: + +-> { "execute": "pass-fd", "arguments": { "fdname": "fd1" } } +<- { "return": 42 } + +Notes: + +(1) If the name specified by the "fdname" argument already exists, and + "force" is not true, the command will fail. +(2) If the name specified by the "fdname" argument already exists, and + "force" is true, the value of the existing file descriptor is + returned when the command is successful. +(3) The 'closefd' command can be used to explicitly close the file + descriptor when it is no longer needed. + +EQMP + + { .name = "getfd", .args_type = "fdname:s", - .params = "getfd name", + .params = "getfd fdname", .help = "receive a file descriptor via SCM rights and assign it a name", .mhandler.cmd_new = qmp_marshal_input_getfd, }, -- 1.7.10.2 ^ permalink raw reply related [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command Corey Bryant @ 2012-06-22 20:24 ` Eric Blake 0 siblings, 0 replies; 56+ messages in thread From: Eric Blake @ 2012-06-22 20:24 UTC (permalink / raw) To: Corey Bryant Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, lcapitulino, pbonzini [-- Attachment #1: Type: text/plain, Size: 1616 bytes --] On 06/22/2012 12:36 PM, Corey Bryant wrote: > This patch adds the pass-fd QMP command using the QAPI framework. > Like the getfd command, it is used to pass a file descriptor via > SCM_RIGHTS and associate it with a name. However, the pass-fd > command also returns the received file descriptor, which is a > difference in behavior from the getfd command, which returns > nothing. pass-fd also supports a boolean "force" argument that > can be used to specify that an existing file descriptor is > associated with the specified name, and that it should become a > copy of the newly passed file descriptor. > + if (replace) { > + /* the existing fd is replaced with the new fd */ > + close(monfd->fd); > + monfd->fd = fd; > + return fd; > + } > + > + if (copy) { Since 'replace' and 'copy' are never both true, should this instead be a tri-state enum argument instead of two independent bool arguments? > + /* the existing fd becomes a copy of the new fd */ > + if (dup2(fd, monfd->fd) == -1) { Broken - you want to use dup3(fd, monfd->fd, O_CLOEXEC) (and fall back to dup2()+fcntl(F_GETFD/F_SETFD) on platforms where dup3 is not available; it has been proposed for the next revision of POSIX but right now is pretty much limited to Linux and Cygwin). Otherwise, you are undoing the fact that patch 1/7 just changed the 'getfd' list to always keep all its fds marked cloexec. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 620 bytes --] ^ permalink raw reply [flat|nested] 56+ messages in thread
* [Qemu-devel] [PATCH v4 4/7] qapi: Re-arrange monitor.c functions 2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant ` (2 preceding siblings ...) 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command Corey Bryant @ 2012-06-22 18:36 ` Corey Bryant 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 5/7] block: Prevent /dev/fd/X filename from being detected as floppy Corey Bryant ` (3 subsequent siblings) 7 siblings, 0 replies; 56+ messages in thread From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw) To: qemu-devel Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> --- v4: -This patch is new in v4. monitor.c | 40 ++++++++++++++++++++-------------------- 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/monitor.c b/monitor.c index 3433c06..153e949 100644 --- a/monitor.c +++ b/monitor.c @@ -2239,26 +2239,6 @@ static int monitor_put_fd(Monitor *mon, const char *fdname, bool replace, return fd; } -void qmp_closefd(const char *fdname, Error **errp) -{ - int fd = monitor_get_fd(cur_mon, fdname, errp); - if (fd != -1) { - close(fd); - } -} - -static void do_loadvm(Monitor *mon, const QDict *qdict) -{ - int saved_vm_running = runstate_is_running(); - const char *name = qdict_get_str(qdict, "name"); - - vm_stop(RUN_STATE_RESTORE_VM); - - if (load_vmstate(name) == 0 && saved_vm_running) { - vm_start(); - } -} - int monitor_get_fd(Monitor *mon, const char *fdname, Error **errp) { mon_fd_t *monfd; @@ -2299,6 +2279,26 @@ void qmp_getfd(const char *fdname, Error **errp) monitor_put_fd(cur_mon, fdname, replace, copy, errp); } +void qmp_closefd(const char *fdname, Error **errp) +{ + int fd = monitor_get_fd(cur_mon, fdname, errp); + if (fd != -1) { + close(fd); + } +} + +static void do_loadvm(Monitor *mon, const QDict *qdict) +{ + int saved_vm_running = runstate_is_running(); + const char *name = qdict_get_str(qdict, "name"); + + vm_stop(RUN_STATE_RESTORE_VM); + + if (load_vmstate(name) == 0 && saved_vm_running) { + vm_start(); + } +} + /* mon_cmds and info_cmds would be sorted at runtime */ static mon_cmd_t mon_cmds[] = { #include "hmp-commands.h" -- 1.7.10.2 ^ permalink raw reply related [flat|nested] 56+ messages in thread
* [Qemu-devel] [PATCH v4 5/7] block: Prevent /dev/fd/X filename from being detected as floppy 2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant ` (3 preceding siblings ...) 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 4/7] qapi: Re-arrange monitor.c functions Corey Bryant @ 2012-06-22 18:36 ` Corey Bryant 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 6/7] block: Convert open calls to qemu_open Corey Bryant ` (2 subsequent siblings) 7 siblings, 0 replies; 56+ messages in thread From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw) To: qemu-devel Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake Reported-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> --- v3: -This patch is new in v3. It was previously submitted on its own, and is now being included in this series. v4 -Moved patch to be earlier in series (lcapitulino@redhat.com) block/raw-posix.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/block/raw-posix.c b/block/raw-posix.c index 03fcfcc..0d9be0b 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -946,9 +946,11 @@ static int floppy_probe_device(const char *filename) int prio = 0; struct floppy_struct fdparam; struct stat st; + const char *p; - if (strstart(filename, "/dev/fd", NULL)) + if (strstart(filename, "/dev/fd", &p) && p[0] != '/') { prio = 50; + } fd = open(filename, O_RDONLY | O_NONBLOCK); if (fd < 0) { -- 1.7.10.2 ^ permalink raw reply related [flat|nested] 56+ messages in thread
* [Qemu-devel] [PATCH v4 6/7] block: Convert open calls to qemu_open 2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant ` (4 preceding siblings ...) 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 5/7] block: Prevent /dev/fd/X filename from being detected as floppy Corey Bryant @ 2012-06-22 18:36 ` Corey Bryant 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 7/7] osdep: Enable qemu_open to dup pre-opened fd Corey Bryant [not found] ` <20120626091004.GA14451@redhat.com> 7 siblings, 0 replies; 56+ messages in thread From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw) To: qemu-devel Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake This patch converts all block layer open calls to qemu_open. This enables all block layer open paths to dup(X) a pre-opened file descriptor if the filename is of the format /dev/fd/X. This is useful if QEMU is restricted from opening certain files. Note that this adds the O_CLOEXEC flag to the changed open paths when the O_CLOEXEC macro is defined. Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> --- v2: -Convert calls to qemu_open instead of file_open (kwolf@redhat.com) -Mention introduction of O_CLOEXEC (kwolf@redhat.com) v3: -No changes v4: -No changes block/raw-posix.c | 18 +++++++++--------- block/raw-win32.c | 4 ++-- block/vdi.c | 5 +++-- block/vmdk.c | 21 +++++++++------------ block/vpc.c | 2 +- block/vvfat.c | 21 +++++++++++---------- 6 files changed, 35 insertions(+), 36 deletions(-) diff --git a/block/raw-posix.c b/block/raw-posix.c index 0d9be0b..68886cd 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -568,8 +568,8 @@ static int raw_create(const char *filename, QEMUOptionParameter *options) options++; } - fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, - 0644); + fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, + 0644); if (fd < 0) { result = -errno; } else { @@ -741,7 +741,7 @@ static int hdev_open(BlockDriverState *bs, const char *filename, int flags) if ( bsdPath[ 0 ] != '\0' ) { strcat(bsdPath,"s0"); /* some CDs don't have a partition 0 */ - fd = open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE); + fd = qemu_open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE); if (fd < 0) { bsdPath[strlen(bsdPath)-1] = '1'; } else { @@ -798,7 +798,7 @@ static int fd_open(BlockDriverState *bs) #endif return -EIO; } - s->fd = open(bs->filename, s->open_flags & ~O_NONBLOCK); + s->fd = qemu_open(bs->filename, s->open_flags & ~O_NONBLOCK); if (s->fd < 0) { s->fd_error_time = get_clock(); s->fd_got_error = 1; @@ -872,7 +872,7 @@ static int hdev_create(const char *filename, QEMUOptionParameter *options) options++; } - fd = open(filename, O_WRONLY | O_BINARY); + fd = qemu_open(filename, O_WRONLY | O_BINARY); if (fd < 0) return -errno; @@ -952,7 +952,7 @@ static int floppy_probe_device(const char *filename) prio = 50; } - fd = open(filename, O_RDONLY | O_NONBLOCK); + fd = qemu_open(filename, O_RDONLY | O_NONBLOCK); if (fd < 0) { goto out; } @@ -1005,7 +1005,7 @@ static void floppy_eject(BlockDriverState *bs, bool eject_flag) close(s->fd); s->fd = -1; } - fd = open(bs->filename, s->open_flags | O_NONBLOCK); + fd = qemu_open(bs->filename, s->open_flags | O_NONBLOCK); if (fd >= 0) { if (ioctl(fd, FDEJECT, 0) < 0) perror("FDEJECT"); @@ -1055,7 +1055,7 @@ static int cdrom_probe_device(const char *filename) int prio = 0; struct stat st; - fd = open(filename, O_RDONLY | O_NONBLOCK); + fd = qemu_open(filename, O_RDONLY | O_NONBLOCK); if (fd < 0) { goto out; } @@ -1179,7 +1179,7 @@ static int cdrom_reopen(BlockDriverState *bs) */ if (s->fd >= 0) close(s->fd); - fd = open(bs->filename, s->open_flags, 0644); + fd = qemu_open(bs->filename, s->open_flags, 0644); if (fd < 0) { s->fd = -1; return -EIO; diff --git a/block/raw-win32.c b/block/raw-win32.c index e4b0b75..8d7838d 100644 --- a/block/raw-win32.c +++ b/block/raw-win32.c @@ -255,8 +255,8 @@ static int raw_create(const char *filename, QEMUOptionParameter *options) options++; } - fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, - 0644); + fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, + 0644); if (fd < 0) return -EIO; set_sparse(fd); diff --git a/block/vdi.c b/block/vdi.c index 119d3c7..6183fdf 100644 --- a/block/vdi.c +++ b/block/vdi.c @@ -648,8 +648,9 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options) options++; } - fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE, - 0644); + fd = qemu_open(filename, + O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE, + 0644); if (fd < 0) { return -errno; } diff --git a/block/vmdk.c b/block/vmdk.c index 18e9b4c..557dc1b 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -1161,10 +1161,9 @@ static int vmdk_create_extent(const char *filename, int64_t filesize, VMDK4Header header; uint32_t tmp, magic, grains, gd_size, gt_size, gt_count; - fd = open( - filename, - O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE, - 0644); + fd = qemu_open(filename, + O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE, + 0644); if (fd < 0) { return -errno; } @@ -1484,15 +1483,13 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options) (flags & BLOCK_FLAG_COMPAT6 ? 6 : 4), total_size / (int64_t)(63 * 16 * 512)); if (split || flat) { - fd = open( - filename, - O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE, - 0644); + fd = qemu_open(filename, + O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE, + 0644); } else { - fd = open( - filename, - O_WRONLY | O_BINARY | O_LARGEFILE, - 0644); + fd = qemu_open(filename, + O_WRONLY | O_BINARY | O_LARGEFILE, + 0644); } if (fd < 0) { return -errno; diff --git a/block/vpc.c b/block/vpc.c index 5cd13d1..60ebf5a 100644 --- a/block/vpc.c +++ b/block/vpc.c @@ -678,7 +678,7 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options) } /* Create the file */ - fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644); + fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644); if (fd < 0) { return -EIO; } diff --git a/block/vvfat.c b/block/vvfat.c index 0fd3367..81c3a19 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -1152,16 +1152,17 @@ static inline mapping_t* find_mapping_for_cluster(BDRVVVFATState* s,int cluster_ static int open_file(BDRVVVFATState* s,mapping_t* mapping) { if(!mapping) - return -1; + return -1; if(!s->current_mapping || - strcmp(s->current_mapping->path,mapping->path)) { - /* open file */ - int fd = open(mapping->path, O_RDONLY | O_BINARY | O_LARGEFILE); - if(fd<0) - return -1; - vvfat_close_current_file(s); - s->current_fd = fd; - s->current_mapping = mapping; + strcmp(s->current_mapping->path, mapping->path)) { + /* open file */ + int fd = qemu_open(mapping->path, O_RDONLY | O_BINARY | O_LARGEFILE); + if (fd < 0) { + return -1; + } + vvfat_close_current_file(s); + s->current_fd = fd; + s->current_mapping = mapping; } return 0; } @@ -2215,7 +2216,7 @@ static int commit_one_file(BDRVVVFATState* s, for (i = s->cluster_size; i < offset; i += s->cluster_size) c = modified_fat_get(s, c); - fd = open(mapping->path, O_RDWR | O_CREAT | O_BINARY, 0666); + fd = qemu_open(mapping->path, O_RDWR | O_CREAT | O_BINARY, 0666); if (fd < 0) { fprintf(stderr, "Could not open %s... (%s, %d)\n", mapping->path, strerror(errno), errno); -- 1.7.10.2 ^ permalink raw reply related [flat|nested] 56+ messages in thread
* [Qemu-devel] [PATCH v4 7/7] osdep: Enable qemu_open to dup pre-opened fd 2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant ` (5 preceding siblings ...) 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 6/7] block: Convert open calls to qemu_open Corey Bryant @ 2012-06-22 18:36 ` Corey Bryant 2012-06-22 19:58 ` Eric Blake [not found] ` <20120626091004.GA14451@redhat.com> 7 siblings, 1 reply; 56+ messages in thread From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw) To: qemu-devel Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake This patch adds support to qemu_open to dup(fd) a pre-opened file descriptor if the filename is of the format /dev/fd/X. This can be used when QEMU is restricted from opening files, and the management application opens files on QEMU's behalf. If the fd was passed to the monitor with the pass-fd command, it must be explicitly closed with the 'closefd' command when it is no longer required, in order to prevent fd leaks. Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> --- v2: -Get rid of file_open and move dup code to qemu_open (kwolf@redhat.com) -Use strtol wrapper instead of atoi (kwolf@redhat.com) v3: -Add note about fd leakage (eblake@redhat.com) v4 -Moved patch to be later in series (lcapitulino@redhat.com) -Update qemu_open to check access mode flags and set flags that can be set (eblake@redhat.com, kwolf@redhat.com) cutils.c | 26 +++++++++++++---- main-loop.c | 6 ++-- osdep.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ qemu-common.h | 2 +- 4 files changed, 116 insertions(+), 9 deletions(-) diff --git a/cutils.c b/cutils.c index af308cd..f45d921 100644 --- a/cutils.c +++ b/cutils.c @@ -339,17 +339,33 @@ bool buffer_is_zero(const void *buf, size_t len) } #ifndef _WIN32 -/* Sets a specific flag */ -int fcntl_setfl(int fd, int flag) +/* Sets a specific flag on or off */ +int fcntl_setfl(int fd, int flag, int onoff) { int flags; + if (onoff != 0 && onoff != 1) { + return -EINVAL; + } + flags = fcntl(fd, F_GETFL); - if (flags == -1) + if (flags == -1) { return -errno; + } - if (fcntl(fd, F_SETFL, flags | flag) == -1) - return -errno; + if (onoff == 1) { + if ((flags & flag) != flag) { + if (fcntl(fd, F_SETFL, flags | flag) == -1) { + return -errno; + } + } + } else { + if ((flags & flag) == flag) { + if (fcntl(fd, F_SETFL, flags & ~flag) == -1) { + return -errno; + } + } + } return 0; } diff --git a/main-loop.c b/main-loop.c index eb3b6e6..644fcc3 100644 --- a/main-loop.c +++ b/main-loop.c @@ -75,11 +75,11 @@ static int qemu_event_init(void) if (err == -1) { return -errno; } - err = fcntl_setfl(fds[0], O_NONBLOCK); + err = fcntl_setfl(fds[0], O_NONBLOCK, 1); if (err < 0) { goto fail; } - err = fcntl_setfl(fds[1], O_NONBLOCK); + err = fcntl_setfl(fds[1], O_NONBLOCK, 1); if (err < 0) { goto fail; } @@ -154,7 +154,7 @@ static int qemu_signal_init(void) return -errno; } - fcntl_setfl(sigfd, O_NONBLOCK); + fcntl_setfl(sigfd, O_NONBLOCK, 1); qemu_set_fd_handler2(sigfd, NULL, sigfd_handler, NULL, (void *)(intptr_t)sigfd); diff --git a/osdep.c b/osdep.c index 3e6bada..a6fc758d 100644 --- a/osdep.c +++ b/osdep.c @@ -73,6 +73,63 @@ int qemu_madvise(void *addr, size_t len, int advice) #endif } +/* + * Dups an fd and sets the flags + */ +static int qemu_dup(int fd, int flags) +{ + int ret; + int serrno; + + if (flags & O_CLOEXEC) { + ret = fcntl(fd, F_DUPFD_CLOEXEC, 0); + if (ret == -1 && errno == EINVAL) { + ret = dup(fd); + if (ret == -1) { + goto fail; + } + if (fcntl_setfl(ret, O_CLOEXEC, (flags & O_CLOEXEC) ? 1 : 0) < 0) { + goto fail; + } + } + } else { + ret = dup(fd); + } + + if (ret == -1) { + goto fail; + } + + /* Truncate the file in the cases that open would truncate it */ + if (flags & O_TRUNC || + ((flags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))) { + if (ftruncate(ret, 0) == -1) { + goto fail; + } + } + + if ((fcntl_setfl(ret, O_APPEND, (flags & O_APPEND) ? 1 : 0) < 0) || + (fcntl_setfl(ret, O_ASYNC, (flags & O_ASYNC) ? 1 : 0) < 0) || + (fcntl_setfl(ret, O_DIRECT, (flags & O_DIRECT) ? 1 : 0) < 0) || + (fcntl_setfl(ret, O_LARGEFILE, (flags & O_LARGEFILE) ? 1 : 0) < 0) || + (fcntl_setfl(ret, O_NDELAY, (flags & O_NDELAY) ? 1 : 0) < 0) || + (fcntl_setfl(ret, O_NOATIME, (flags & O_NOATIME) ? 1 : 0) < 0) || + (fcntl_setfl(ret, O_NOCTTY, (flags & O_NOCTTY) ? 1 : 0) < 0) || + (fcntl_setfl(ret, O_NONBLOCK, (flags & O_NONBLOCK) ? 1 : 0) < 0) || + (fcntl_setfl(ret, O_SYNC, (flags & O_SYNC) ? 1 : 0) < 0)) { + goto fail; + } + + return ret; + +fail: + serrno = errno; + if (ret != -1) { + close(ret); + } + errno = serrno; + return -1; +} /* * Opens a file with FD_CLOEXEC set @@ -82,6 +139,40 @@ int qemu_open(const char *name, int flags, ...) int ret; int mode = 0; +#ifndef _WIN32 + const char *p; + + /* Attempt dup of fd for pre-opened file */ + if (strstart(name, "/dev/fd/", &p)) { + int fd; + int eflags; + + fd = qemu_parse_fd(p); + if (fd == -1) { + return -1; + } + + /* Get the existing fd's flags */ + eflags = fcntl(fd, F_GETFL); + if (eflags == -1) { + return -1; + } + + if (((flags & O_RDWR) != (eflags & O_RDWR)) || + ((flags & O_RDONLY) != (eflags & O_RDONLY)) || + ((flags & O_WRONLY) != (eflags & O_WRONLY))) { + errno = EACCES; + return -1; + } + + if (fcntl_setfl(fd, O_CLOEXEC, 1) < 0) { + return -1; + } + + return qemu_dup(fd, flags); + } +#endif + if (flags & O_CREAT) { va_list ap; diff --git a/qemu-common.h b/qemu-common.h index 91e0562..99cbbc5 100644 --- a/qemu-common.h +++ b/qemu-common.h @@ -144,7 +144,7 @@ int qemu_strnlen(const char *s, int max_len); time_t mktimegm(struct tm *tm); int qemu_fls(int i); int qemu_fdatasync(int fd); -int fcntl_setfl(int fd, int flag); +int fcntl_setfl(int fd, int flag, int onoff); int qemu_parse_fd(const char *param); /* -- 1.7.10.2 ^ permalink raw reply related [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 7/7] osdep: Enable qemu_open to dup pre-opened fd 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 7/7] osdep: Enable qemu_open to dup pre-opened fd Corey Bryant @ 2012-06-22 19:58 ` Eric Blake 0 siblings, 0 replies; 56+ messages in thread From: Eric Blake @ 2012-06-22 19:58 UTC (permalink / raw) To: Corey Bryant Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, lcapitulino, pbonzini [-- Attachment #1: Type: text/plain, Size: 4185 bytes --] On 06/22/2012 12:36 PM, Corey Bryant wrote: > This patch adds support to qemu_open to dup(fd) a pre-opened file > descriptor if the filename is of the format /dev/fd/X. > > This can be used when QEMU is restricted from opening files, and > the management application opens files on QEMU's behalf. > > If the fd was passed to the monitor with the pass-fd command, it > must be explicitly closed with the 'closefd' command when it is > no longer required, in order to prevent fd leaks. > > +static int qemu_dup(int fd, int flags) > +{ > + int ret; > + int serrno; > + > + if (flags & O_CLOEXEC) { > + ret = fcntl(fd, F_DUPFD_CLOEXEC, 0); F_DUPFD_CLOEXEC is required by POSIX, but not implemented on all platforms yet. Do you need to be checking with #ifdef F_DUPFD_CLOEXEC to avoid compilation failure? > + if (ret == -1 && errno == EINVAL) { > + ret = dup(fd); > + if (ret == -1) { > + goto fail; > + } > + if (fcntl_setfl(ret, O_CLOEXEC, (flags & O_CLOEXEC) ? 1 : 0) < 0) { Broken. O_CLOEXEC _only_ affects open(); to change it on an existing fd, you have to use fcntl(F_GETFD/F_SETFD) (not F_GETFL/F_SETFL). > + > + if ((fcntl_setfl(ret, O_APPEND, (flags & O_APPEND) ? 1 : 0) < 0) || > + (fcntl_setfl(ret, O_ASYNC, (flags & O_ASYNC) ? 1 : 0) < 0) || > + (fcntl_setfl(ret, O_DIRECT, (flags & O_DIRECT) ? 1 : 0) < 0) || > + (fcntl_setfl(ret, O_LARGEFILE, (flags & O_LARGEFILE) ? 1 : 0) < 0) || Pointless. O_LARGEFILE should _always_ be set, since we are compiling for 64-bit off_t always. > + (fcntl_setfl(ret, O_NDELAY, (flags & O_NDELAY) ? 1 : 0) < 0) || > + (fcntl_setfl(ret, O_NOATIME, (flags & O_NOATIME) ? 1 : 0) < 0) || > + (fcntl_setfl(ret, O_NOCTTY, (flags & O_NOCTTY) ? 1 : 0) < 0) || > + (fcntl_setfl(ret, O_NONBLOCK, (flags & O_NONBLOCK) ? 1 : 0) < 0) || > + (fcntl_setfl(ret, O_SYNC, (flags & O_SYNC) ? 1 : 0) < 0)) { Yuck. That's a lot of syscalls (1 per fcntl_setfl() if they are already set correctly, and 2 per fcntl_setfl() call if we are toggling each one). It might be better to combine this into at most 2 fcntl() calls, instead of a long sequence. > + /* Get the existing fd's flags */ > + eflags = fcntl(fd, F_GETFL); > + if (eflags == -1) { > + return -1; > + } > + > + if (((flags & O_RDWR) != (eflags & O_RDWR)) || > + ((flags & O_RDONLY) != (eflags & O_RDONLY)) || > + ((flags & O_WRONLY) != (eflags & O_WRONLY))) { Broken. O_RDWR, O_RDONLY, and O_WRONLY are NOT bitmasks, but are values in the range of O_ACCMODE. In particular, O_RDONLY==0 on some platforms (Linux), and ==1 on others (Hurd), and although POSIX recommends that O_RDWR==(O_RDONLY|O_WRONLY) for any new systems, no one has really done that except Hurd. A correct way to write this is: switch (flags & O_ACCMODE) { case O_RDWR: if ((eflags & O_ACCMODE) != O_RDWR) { goto error; break; case O_RDONLY: if ((eflags & O_ACCMODE) != O_RDONLY) { goto error; break; case O_RDONLY: if ((eflags & O_ACCMODE) != O_RDONLY) { goto error; break; default: goto error: } [Technically, POSIX also requires O_ACCMODE to include O_SEARCH and O_EXEC, although those two constants might be the same value; but right now Linux has not yet implemented that bit; but unless qemu ever gains the need to open executable binaries with O_EXEC or directories with O_SEARCH, we probably don't have to worry about that aspect of O_ACCMODE here.] > + errno = EACCES; > + return -1; > + } > + > + if (fcntl_setfl(fd, O_CLOEXEC, 1) < 0) { Again, broken. Besides, why are you attempting it both here and in qemu_dup()? Shouldn't once be enough? > + return -1; > + } > + > + return qemu_dup(fd, flags); -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 620 bytes --] ^ permalink raw reply [flat|nested] 56+ messages in thread
[parent not found: <20120626091004.GA14451@redhat.com>]
[parent not found: <4FE9A0F0.2050809@redhat.com>]
[parent not found: <20120626175045.2c7011b3@doriath.home>]
[parent not found: <4FEA37A9.10707@linux.vnet.ibm.com>]
[parent not found: <4FEA3D9C.8080205@redhat.com>]
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd [not found] ` <4FEA3D9C.8080205@redhat.com> @ 2012-07-02 22:02 ` Corey Bryant 2012-07-02 22:31 ` Eric Blake 2012-07-03 15:40 ` Corey Bryant 0 siblings, 2 replies; 56+ messages in thread From: Corey Bryant @ 2012-07-02 22:02 UTC (permalink / raw) To: Eric Blake Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini On 06/26/2012 06:54 PM, Eric Blake wrote: > On 06/26/2012 04:28 PM, Corey Bryant wrote: > >>>>> With this proposed series, we have usage akin to: >>>>> >>>>> 1. pass_fd FDSET={M} -> returns a string "/dev/fd/N" showing QEMU's >>>>> view of the FD >>>>> 2. drive_add file=/dev/fd/N >>>>> 3. if failure: >>>>> close_fd "/dev/fd/N" >>>> >>>> In fact, there are more steps: >>>> >>>> 4. use it successfully >>>> 5. close_fd "/dev/fd/N" >>>> >>>> I think it would well be possible that qemu just closes the fd when it's >>>> not used internally any more. >>> >>> pass-fd could have a flag indicating qemu to do that. >>> >> >> It seems like libvirt would be in a better position to understand when a >> file is no longer in use, and then it can call close_fd. No? Of course >> the the only fd that needs to be closed is the originally passed fd. The >> dup'd fd's are closed by QEMU. > > The more we discuss this, the more I'm convinced that commands like > 'block-commit' that will reopen a file to change from O_RDONLY to O_RDWR > will need to have an optional argument that says the name of the file to > reopen. That is, I've seen three proposals: > > Proposal One: enhance every command to take an fd:name protocol. > PRO: Successful use of a name removes the fd from the 'getfd' list. > CON: Lots of commands to change. > CON: The command sequence is not atomic. > CON: Block layer does not have a file name tied to the fd, so the reopen > operation also needs to learn the fd:name protocol, but since we're > already touching the command it's not much more work. > USAGE: > 1. getfd FDSET={M}, ties new fd to "name" > 2. drive_add fd=name - looks up the fd by name from the list > 3a. on success, fd is no longer in the list, drive_add consumed it > 3b. on failure, libvirt must call 'closefd name' to avoid a leak > 4. getfd FDSET={M2}, ties new fd to "newname" > 5. block-commit fd=newname - looks up fd by name from the list > 6a. on success, fd is no longer in the list, block-commit consumed it > 6b. on failure, libvirt must call 'closefd newname' to avoid a leak > > Proposal Two: Create a permanent name via 'getfd' or 'pass-fd', then > update that name to the new fd in advance of any operation that needs to > do a reopen. > PRO: All other commands work without impact by using qemu_open(), by > getting a clone of the permanent name. > CON: The permanent name must remain open as long as qemu might need to > reopen it, and reopening needs the pass-fd force option. > CON: The command sequence is not atomic. > USAGE: > 1. pass_fd FDSET={M} -> returns an integer N (or a string "/dev/fd/N") > showing QEMU's permanent name of the FD > 2. drive_add file=<permanent name> means that qemu_open() will clone the > fd, and drive_add is now using yet another FD while N remains open; > meanwhile, the block layer knows that the drive is tied to the permanent > name > 3. pass-fd force FDSET={M2} -> replaces fd N with the passed M2, and > still returns /dev/fd/N > 4. block-commit - looks up the block-device name (/dev/fd/N), which maps > back to the permanent fd, and gets a copy of the newly passed M2 > 5. on completion (success or failure), libvirt must call 'closefd name' > to avoid a leak > > Proposal Three: Use thread-local fds passed alongside any command, > rather than passing via a separate command > PRO: All commands can now atomically handle fd passing > PRO: Commands that only need a one-shot open can now use fd > CON: Commands that need to reopen still need modification to allow a > name override during the reopen > 1. drive_add nfds=1 file="/dev/fdlist/1" FDSET={M} -> on success, the fd > is used as the block file, on failure it is atomically closed, so there > is no leak either way. However, the block file has no permanent name. > 2. block-commit nfds=1 file-override="/dev/fdlist/1" FDSET={M2} -> file > override must be passed again (since we have no guarantee that the > /dev/fdlist/1 of _this_ command matches the command-local naming used in > the previous command), but the fd is again used atomically > > Under proposal 3, there is no need to dup fd's, merely only to check > that qemu_open("/dev/fdlist/n", flags) has compatible flags with the fd > received via FDSET. And the fact that things are made atomic means that > libvirt no longer has to worry about calling closefd, nor does it have > to worry about being interrupted between two monitor commands and not > knowing what state qemu is in. But it does mean teaching every possible > command that wants to do a reopen to learn how to use fd overrides > rather than to reuse the permanent name that was originally in place on > the first open. > Here's another option that Kevin and I discussed today on IRC. I've modified a few minor details since the discussion. And Kevin please correct me if anything is wrong. Proposal Four: Pass a set of fds via 'pass-fds'. The group of fds should all refer to the same file, but may have different access flags (ie. O_RDWR, O_RDONLY). qemu_open can then dup the fd that has the matching access mode flags. pass-fds: { 'command': 'pass-fds', 'data': {'fdsetname': 'str', '*close': 'bool'}, 'returns': 'string' } close-fds: { 'command': 'close-fds', 'data': {'fdsetname': 'str' } where: @fdsetname - the file descriptor set name @close - *if true QEMU closes the monitor fds when they expire. if false, fds remain on the monitor until close-fds command. PRO: Supports reopen PRO: All other commands work without impact by using qemu_open() PRO: No fd leakage if close==true specified CON: Determining when to close fds when close==true may be tricky USAGE: 1. pass-fd FDSET={M} -> returns a string "/dev/fdset/1") that refers to the passed set of fds. 2. drive_add file=/dev/fdset/1 -- qemu_open() uses the first fd from the list that has access flags matching the qemu_open() action flags. 3. block-commit -- qemu_open() reopens "/dev/fdset/1" by using the first fd from the list that has access flags matching the qemu_open() action flags. 4. The monitor fds are closed: A) *If @close == true, qemu closes the set of fds when the timer expires B) If @close == false, libvirt must issue close-fds command to close the set of fds *How to solve this has yet to be determined. Kevin mentioned potentially using a bottom-half or a timer in qemu_close(). -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-02 22:02 ` [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant @ 2012-07-02 22:31 ` Eric Blake 2012-07-03 9:07 ` Daniel P. Berrange ` (2 more replies) 2012-07-03 15:40 ` Corey Bryant 1 sibling, 3 replies; 56+ messages in thread From: Eric Blake @ 2012-07-02 22:31 UTC (permalink / raw) To: Corey Bryant Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini [-- Attachment #1: Type: text/plain, Size: 3952 bytes --] On 07/02/2012 04:02 PM, Corey Bryant wrote: > Here's another option that Kevin and I discussed today on IRC. I've > modified a few minor details since the discussion. And Kevin please > correct me if anything is wrong. > > Proposal Four: Pass a set of fds via 'pass-fds'. The group of fds > should all refer to the same file, but may have different access flags > (ie. O_RDWR, O_RDONLY). qemu_open can then dup the fd that has the > matching access mode flags. But this means that libvirt has to open a file O_RDWR up front for any file that it _might_ need qemu to reopen later, and that qemu is now hanging on to 2 fds per fdset instead of 1 fd for the life of any client of the fdset. I see no reason why libvirt can't pass in an O_RDWR fd when qemu only needs to use an O_RDONLY fd; making qemu insist on an O_RDONLY fd makes life harder for libvirt. On the other hand, I can see from a safety standpoint that passing in an O_RDWR fd opens up more potential for errors than passing in an O_RDONLY fd, but if you don't know up front whether you will ever need to write into a file, then it would be better to pass in O_RDONLY. At least I don't see it as a security risk: passing in O_RDWR but setting SELinux permissions on the fd to only allow read() but not write() via the labeling of the fd may be possible, so that libvirt could still prevent accidental writes into an O_RDWR file that starts life only needing to service reads. > pass-fds: > { 'command': 'pass-fds', > 'data': {'fdsetname': 'str', '*close': 'bool'}, > 'returns': 'string' } This still doesn't solve Dan's complaint that things are not atomic; if the monitor dies between the pass-fds and the later use of the fdset, we would need a counterpart monitor command to query what fdsets are currently known by qemu. And like you pointed out, you didn't make it clear how a timeout mechanism would be implemented to auto-close fds that are not consumed in a fixed amount of time - would that be another optional parameter to 'pass-fds'? Or do we need a way to initially create a set with only one O_RDONLY fd, but later pass in a new O_RDWR fd but associate it with the existing set rather than creating a new set (akin to the 'force' option of 'pass-fd' in proposal two)? > close-fds: > { 'command': 'close-fds', > 'data': {'fdsetname': 'str' } > where: > @fdsetname - the file descriptor set name > @close - *if true QEMU closes the monitor fds when they expire. > if false, fds remain on the monitor until close-fds > command. > PRO: Supports reopen > PRO: All other commands work without impact by using qemu_open() > PRO: No fd leakage if close==true specified > CON: Determining when to close fds when close==true may be tricky > USAGE: > 1. pass-fd FDSET={M} -> returns a string "/dev/fdset/1") that refers to > the passed set of fds. > 2. drive_add file=/dev/fdset/1 -- qemu_open() uses the first fd from the > list that has access flags matching the qemu_open() action flags. > 3. block-commit -- qemu_open() reopens "/dev/fdset/1" by using the first > fd from the list that has access flags matching the qemu_open() action > flags. > 4. The monitor fds are closed: > A) *If @close == true, qemu closes the set of fds when the timer > expires > B) If @close == false, libvirt must issue close-fds command to close > the set of fds > > *How to solve this has yet to be determined. Kevin mentioned > potentially using a bottom-half or a timer in qemu_close(). Still, it is certainly an option worth thinking about, and I'm hoping we can come to a solid design that everyone agrees provides the desired security and flexibility without having to recode every single existing command. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 620 bytes --] ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-02 22:31 ` Eric Blake @ 2012-07-03 9:07 ` Daniel P. Berrange 2012-07-03 9:40 ` Kevin Wolf 2012-07-03 13:42 ` Corey Bryant 2 siblings, 0 replies; 56+ messages in thread From: Daniel P. Berrange @ 2012-07-03 9:07 UTC (permalink / raw) To: Eric Blake Cc: Kevin Wolf, aliguori, stefanha, libvir-list, Corey Bryant, qemu-devel, Luiz Capitulino, pbonzini On Mon, Jul 02, 2012 at 04:31:09PM -0600, Eric Blake wrote: > On 07/02/2012 04:02 PM, Corey Bryant wrote: > > > Here's another option that Kevin and I discussed today on IRC. I've > > modified a few minor details since the discussion. And Kevin please > > correct me if anything is wrong. > > > > Proposal Four: Pass a set of fds via 'pass-fds'. The group of fds > > should all refer to the same file, but may have different access flags > > (ie. O_RDWR, O_RDONLY). qemu_open can then dup the fd that has the > > matching access mode flags. > > But this means that libvirt has to open a file O_RDWR up front for any > file that it _might_ need qemu to reopen later, and that qemu is now > hanging on to 2 fds per fdset instead of 1 fd for the life of any client > of the fdset. > > I see no reason why libvirt can't pass in an O_RDWR fd when qemu only > needs to use an O_RDONLY fd; If libvirt has only granted read-only access to the file with sVirt, then passing a O_RDWR file handle to QEMU will result in an SELinux denial, even if QEMU doesn't try to do I/O on it. So this is out of the question. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-02 22:31 ` Eric Blake 2012-07-03 9:07 ` Daniel P. Berrange @ 2012-07-03 9:40 ` Kevin Wolf 2012-07-03 13:42 ` Corey Bryant 2 siblings, 0 replies; 56+ messages in thread From: Kevin Wolf @ 2012-07-03 9:40 UTC (permalink / raw) To: Eric Blake Cc: aliguori, stefanha, libvir-list, Corey Bryant, qemu-devel, Luiz Capitulino, pbonzini Am 03.07.2012 00:31, schrieb Eric Blake: > On 07/02/2012 04:02 PM, Corey Bryant wrote: > >> Here's another option that Kevin and I discussed today on IRC. I've >> modified a few minor details since the discussion. And Kevin please >> correct me if anything is wrong. >> >> Proposal Four: Pass a set of fds via 'pass-fds'. The group of fds >> should all refer to the same file, but may have different access flags >> (ie. O_RDWR, O_RDONLY). qemu_open can then dup the fd that has the >> matching access mode flags. > > But this means that libvirt has to open a file O_RDWR up front for any > file that it _might_ need qemu to reopen later, and that qemu is now > hanging on to 2 fds per fdset instead of 1 fd for the life of any client > of the fdset. It doesn't have to be up front. There's no reason not to have monitor commands that add or remove fds from a given fdset later. > I see no reason why libvirt can't pass in an O_RDWR fd when qemu only > needs to use an O_RDONLY fd; making qemu insist on an O_RDONLY fd makes > life harder for libvirt. On the other hand, I can see from a safety > standpoint that passing in an O_RDWR fd opens up more potential for > errors than passing in an O_RDONLY fd, Yes, this is exactly my consideration. Having a read-only file opened as O_RDWR gives us the chance to make stupid mistakes. > but if you don't know up front > whether you will ever need to write into a file, then it would be better > to pass in O_RDONLY. At least I don't see it as a security risk: > passing in O_RDWR but setting SELinux permissions on the fd to only > allow read() but not write() via the labeling of the fd may be possible, > so that libvirt could still prevent accidental writes into an O_RDWR > file that starts life only needing to service reads. But this would assume that libvirt knows exactly when a reopen happens, for each current and future qemu version. I wouldn't want to tie qemu's internals so closely to the management application, even if libvirt could probably get it reasonably right (allow rw before sending a monitor command; revoke rw when receiving a QMP event that the commit has completed). It wouldn't work if we had a qemu-initiated ro->rw transition, but I don't think we have one at the moment. >> pass-fds: >> { 'command': 'pass-fds', >> 'data': {'fdsetname': 'str', '*close': 'bool'}, >> 'returns': 'string' } > > This still doesn't solve Dan's complaint that things are not atomic; if > the monitor dies between the pass-fds and the later use of the fdset, we > would need a counterpart monitor command to query what fdsets are > currently known by qemu. If you want a query-fdsets, that should be easy enough. Actually, we might be able to even make the command transactionable. This would of course require blockdev-add to be transactionable as well to be of any use. > And like you pointed out, you didn't make it > clear how a timeout mechanism would be implemented to auto-close fds > that are not consumed in a fixed amount of time - would that be another > optional parameter to 'pass-fds'? Do you think a timeout is helpful? Can't we just drop libvirt's reference automatically if the monitor connection gets closed? The BH/timer thing Corey mentioned is more about the qemu internal problem that during a reopen there may be a short period where the old fd is closed, but the new one not yet opened, so the fdset might need to survive a short period with refcount 0. > Or do we need a way to initially create a set with only one O_RDONLY fd, > but later pass in a new O_RDWR fd but associate it with the existing set > rather than creating a new set (akin to the 'force' option of 'pass-fd' > in proposal two)? As I said above, yes, I think this makes a lot sense. Kevin ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-02 22:31 ` Eric Blake 2012-07-03 9:07 ` Daniel P. Berrange 2012-07-03 9:40 ` Kevin Wolf @ 2012-07-03 13:42 ` Corey Bryant 2 siblings, 0 replies; 56+ messages in thread From: Corey Bryant @ 2012-07-03 13:42 UTC (permalink / raw) To: Eric Blake Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini On 07/02/2012 06:31 PM, Eric Blake wrote: > On 07/02/2012 04:02 PM, Corey Bryant wrote: > >> Here's another option that Kevin and I discussed today on IRC. I've >> modified a few minor details since the discussion. And Kevin please >> correct me if anything is wrong. >> >> Proposal Four: Pass a set of fds via 'pass-fds'. The group of fds >> should all refer to the same file, but may have different access flags >> (ie. O_RDWR, O_RDONLY). qemu_open can then dup the fd that has the >> matching access mode flags. > > But this means that libvirt has to open a file O_RDWR up front for any > file that it _might_ need qemu to reopen later, and that qemu is now > hanging on to 2 fds per fdset instead of 1 fd for the life of any client > of the fdset. > > I see no reason why libvirt can't pass in an O_RDWR fd when qemu only > needs to use an O_RDONLY fd; making qemu insist on an O_RDONLY fd makes > life harder for libvirt. On the other hand, I can see from a safety > standpoint that passing in an O_RDWR fd opens up more potential for > errors than passing in an O_RDONLY fd, but if you don't know up front > whether you will ever need to write into a file, then it would be better > to pass in O_RDONLY. At least I don't see it as a security risk: > passing in O_RDWR but setting SELinux permissions on the fd to only > allow read() but not write() via the labeling of the fd may be possible, > so that libvirt could still prevent accidental writes into an O_RDWR > file that starts life only needing to service reads. > >> pass-fds: >> { 'command': 'pass-fds', >> 'data': {'fdsetname': 'str', '*close': 'bool'}, >> 'returns': 'string' } > > This still doesn't solve Dan's complaint that things are not atomic; if > the monitor dies between the pass-fds and the later use of the fdset, we > would need a counterpart monitor command to query what fdsets are > currently known by qemu. And like you pointed out, you didn't make it > clear how a timeout mechanism would be implemented to auto-close fds > that are not consumed in a fixed amount of time - would that be another > optional parameter to 'pass-fds'? Yes see the description of the close bool on pass-fds below which determines how the fds are closed. It's not an atomic operation, but it does handle Dan's concern of preventing fd leaks. If libvirt did crash after the pass-fd, then couldn't it just abandon the fd set (which will get closed) and libvirt could send a new fd set and work with that? > > Or do we need a way to initially create a set with only one O_RDONLY fd, > but later pass in a new O_RDWR fd but associate it with the existing set > rather than creating a new set (akin to the 'force' option of 'pass-fd' > in proposal two)? Yeah I see what you're saying. I was also wondering if it would make sense for qemu to consume the passed fds and use those (ie. no dup()). This would prevent the issues of having to close the fds that linger on the monitor. But I don't know if it's realistic for libvirt to know how many open calls qemu will need to make (this would need 1 fd passed per open call). > >> close-fds: >> { 'command': 'close-fds', >> 'data': {'fdsetname': 'str' } >> where: >> @fdsetname - the file descriptor set name >> @close - *if true QEMU closes the monitor fds when they expire. >> if false, fds remain on the monitor until close-fds >> command. >> PRO: Supports reopen >> PRO: All other commands work without impact by using qemu_open() >> PRO: No fd leakage if close==true specified >> CON: Determining when to close fds when close==true may be tricky >> USAGE: >> 1. pass-fd FDSET={M} -> returns a string "/dev/fdset/1") that refers to >> the passed set of fds. >> 2. drive_add file=/dev/fdset/1 -- qemu_open() uses the first fd from the >> list that has access flags matching the qemu_open() action flags. >> 3. block-commit -- qemu_open() reopens "/dev/fdset/1" by using the first >> fd from the list that has access flags matching the qemu_open() action >> flags. >> 4. The monitor fds are closed: >> A) *If @close == true, qemu closes the set of fds when the timer >> expires >> B) If @close == false, libvirt must issue close-fds command to close >> the set of fds >> >> *How to solve this has yet to be determined. Kevin mentioned >> potentially using a bottom-half or a timer in qemu_close(). > > Still, it is certainly an option worth thinking about, and I'm hoping we > can come to a solid design that everyone agrees provides the desired > security and flexibility without having to recode every single existing > command. > I agree. -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-02 22:02 ` [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant 2012-07-02 22:31 ` Eric Blake @ 2012-07-03 15:40 ` Corey Bryant 2012-07-03 15:59 ` Kevin Wolf 1 sibling, 1 reply; 56+ messages in thread From: Corey Bryant @ 2012-07-03 15:40 UTC (permalink / raw) To: Eric Blake, Kevin Wolf Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini On 07/02/2012 06:02 PM, Corey Bryant wrote: > > > On 06/26/2012 06:54 PM, Eric Blake wrote: >> On 06/26/2012 04:28 PM, Corey Bryant wrote: >> >>>>>> With this proposed series, we have usage akin to: >>>>>> >>>>>> 1. pass_fd FDSET={M} -> returns a string "/dev/fd/N" showing >>>>>> QEMU's >>>>>> view of the FD >>>>>> 2. drive_add file=/dev/fd/N >>>>>> 3. if failure: >>>>>> close_fd "/dev/fd/N" >>>>> >>>>> In fact, there are more steps: >>>>> >>>>> 4. use it successfully >>>>> 5. close_fd "/dev/fd/N" >>>>> >>>>> I think it would well be possible that qemu just closes the fd when >>>>> it's >>>>> not used internally any more. >>>> >>>> pass-fd could have a flag indicating qemu to do that. >>>> >>> >>> It seems like libvirt would be in a better position to understand when a >>> file is no longer in use, and then it can call close_fd. No? Of course >>> the the only fd that needs to be closed is the originally passed fd. The >>> dup'd fd's are closed by QEMU. >> >> The more we discuss this, the more I'm convinced that commands like >> 'block-commit' that will reopen a file to change from O_RDONLY to O_RDWR >> will need to have an optional argument that says the name of the file to >> reopen. That is, I've seen three proposals: >> >> Proposal One: enhance every command to take an fd:name protocol. >> PRO: Successful use of a name removes the fd from the 'getfd' list. >> CON: Lots of commands to change. >> CON: The command sequence is not atomic. >> CON: Block layer does not have a file name tied to the fd, so the reopen >> operation also needs to learn the fd:name protocol, but since we're >> already touching the command it's not much more work. >> USAGE: >> 1. getfd FDSET={M}, ties new fd to "name" >> 2. drive_add fd=name - looks up the fd by name from the list >> 3a. on success, fd is no longer in the list, drive_add consumed it >> 3b. on failure, libvirt must call 'closefd name' to avoid a leak >> 4. getfd FDSET={M2}, ties new fd to "newname" >> 5. block-commit fd=newname - looks up fd by name from the list >> 6a. on success, fd is no longer in the list, block-commit consumed it >> 6b. on failure, libvirt must call 'closefd newname' to avoid a leak >> >> Proposal Two: Create a permanent name via 'getfd' or 'pass-fd', then >> update that name to the new fd in advance of any operation that needs to >> do a reopen. >> PRO: All other commands work without impact by using qemu_open(), by >> getting a clone of the permanent name. >> CON: The permanent name must remain open as long as qemu might need to >> reopen it, and reopening needs the pass-fd force option. >> CON: The command sequence is not atomic. >> USAGE: >> 1. pass_fd FDSET={M} -> returns an integer N (or a string "/dev/fd/N") >> showing QEMU's permanent name of the FD >> 2. drive_add file=<permanent name> means that qemu_open() will clone the >> fd, and drive_add is now using yet another FD while N remains open; >> meanwhile, the block layer knows that the drive is tied to the permanent >> name >> 3. pass-fd force FDSET={M2} -> replaces fd N with the passed M2, and >> still returns /dev/fd/N >> 4. block-commit - looks up the block-device name (/dev/fd/N), which maps >> back to the permanent fd, and gets a copy of the newly passed M2 >> 5. on completion (success or failure), libvirt must call 'closefd name' >> to avoid a leak >> >> Proposal Three: Use thread-local fds passed alongside any command, >> rather than passing via a separate command >> PRO: All commands can now atomically handle fd passing >> PRO: Commands that only need a one-shot open can now use fd >> CON: Commands that need to reopen still need modification to allow a >> name override during the reopen >> 1. drive_add nfds=1 file="/dev/fdlist/1" FDSET={M} -> on success, the fd >> is used as the block file, on failure it is atomically closed, so there >> is no leak either way. However, the block file has no permanent name. >> 2. block-commit nfds=1 file-override="/dev/fdlist/1" FDSET={M2} -> file >> override must be passed again (since we have no guarantee that the >> /dev/fdlist/1 of _this_ command matches the command-local naming used in >> the previous command), but the fd is again used atomically >> >> Under proposal 3, there is no need to dup fd's, merely only to check >> that qemu_open("/dev/fdlist/n", flags) has compatible flags with the fd >> received via FDSET. And the fact that things are made atomic means that >> libvirt no longer has to worry about calling closefd, nor does it have >> to worry about being interrupted between two monitor commands and not >> knowing what state qemu is in. But it does mean teaching every possible >> command that wants to do a reopen to learn how to use fd overrides >> rather than to reuse the permanent name that was originally in place on >> the first open. >> > > Here's another option that Kevin and I discussed today on IRC. I've > modified a few minor details since the discussion. And Kevin please > correct me if anything is wrong. > > Proposal Four: Pass a set of fds via 'pass-fds'. The group of fds > should all refer to the same file, but may have different access flags > (ie. O_RDWR, O_RDONLY). qemu_open can then dup the fd that has the > matching access mode flags. > pass-fds: > { 'command': 'pass-fds', > 'data': {'fdsetname': 'str', '*close': 'bool'}, > 'returns': 'string' } > close-fds: > { 'command': 'close-fds', > 'data': {'fdsetname': 'str' } > where: > @fdsetname - the file descriptor set name > @close - *if true QEMU closes the monitor fds when they expire. > if false, fds remain on the monitor until close-fds > command. > PRO: Supports reopen > PRO: All other commands work without impact by using qemu_open() > PRO: No fd leakage if close==true specified > CON: Determining when to close fds when close==true may be tricky > USAGE: > 1. pass-fd FDSET={M} -> returns a string "/dev/fdset/1") that refers to > the passed set of fds. > 2. drive_add file=/dev/fdset/1 -- qemu_open() uses the first fd from the > list that has access flags matching the qemu_open() action flags. > 3. block-commit -- qemu_open() reopens "/dev/fdset/1" by using the first > fd from the list that has access flags matching the qemu_open() action > flags. > 4. The monitor fds are closed: > A) *If @close == true, qemu closes the set of fds when the timer > expires > B) If @close == false, libvirt must issue close-fds command to close > the set of fds > > *How to solve this has yet to be determined. Kevin mentioned > potentially using a bottom-half or a timer in qemu_close(). > Thanks again for taking time to discuss this at today's QEMU community call. Here's the proposal we discussed at the call. Please let me know if I missed anything or if there are any issues with this design. Proposal Five: New monitor commands enable adding/removing an fd to/from a set. The set of fds should all refer to the same file, but may have different access flags (ie. O_RDWR, O_RDONLY). qemu_open can then dup the fd that has the matching access mode flags. PRO: Supports reopen PRO: All other commands work without impact by using qemu_open() PRO: No fd leakage (fds are associated with monitor connection and, if not in use, closed when monitor disconnects) PRO: Security-wise this is ok since libvirt can manage the set of fd's (ie. it can add/remove an O_RDWR fd to/from the set as needed). CON: Not atomic (e.g. doesn't add an fd with single drive_add command). USAGE: 1. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named "/dev/fdset/1" - command returns qemu fd (e.g fd=4) to caller. libvirt in-use flag turned on for fd. 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the set that has access flags matching the qemu_open action flags. qemu_open increments refcount for this fd. 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5). libvirt in-use flag turned on for fd. 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first fd from the set that has access flags matching the qemu_open action flags. qemu_open increments refcount for this fd. 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the set. turns libvirt in-use flag off marking the fd ready to be closed when qemu is done with it. 5. qemu_close decrements refcount for fd, and closes fd when refcount is zero and libvirt in use flag is off. More functional details: -If libvirt crashes it can call "query-fd /dev/fdset/1" to determine which fds are open in the set. -If monitor connection closes, qemu will close fds that have a refcount of zero. Do we also need a qemu in-use flag in case refcount is zero and fd is still in use? -This support requires introduction of qemu_close function that will be called everywhere in block layer that close is currently called. Notes: -Patch series 1 will include support for all of the above. This will be my initial focus. -Patch series 2 will include command line support that enables association of command line fd with a monitor set. This will be my secondary focus, most likely after patch series 1 is applied. -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-03 15:40 ` Corey Bryant @ 2012-07-03 15:59 ` Kevin Wolf 2012-07-03 16:25 ` Corey Bryant 0 siblings, 1 reply; 56+ messages in thread From: Kevin Wolf @ 2012-07-03 15:59 UTC (permalink / raw) To: Corey Bryant Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake Am 03.07.2012 17:40, schrieb Corey Bryant: > Thanks again for taking time to discuss this at today's QEMU community call. > > Here's the proposal we discussed at the call. Please let me know if I > missed anything or if there are any issues with this design. > > Proposal Five: New monitor commands enable adding/removing an fd > to/from a set. The set of fds should all refer to the same file, but > may have different access flags (ie. O_RDWR, O_RDONLY). qemu_open can > then dup the fd that has the matching access mode flags. > PRO: Supports reopen > PRO: All other commands work without impact by using qemu_open() > PRO: No fd leakage (fds are associated with monitor connection and, if > not in use, closed when monitor disconnects) > PRO: Security-wise this is ok since libvirt can manage the set of fd's > (ie. it can add/remove an O_RDWR fd to/from the set as needed). > CON: Not atomic (e.g. doesn't add an fd with single drive_add command). > USAGE: > 1. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named > "/dev/fdset/1" - command returns qemu fd (e.g fd=4) to caller. libvirt > in-use flag turned on for fd. I thought qemu would rather return the number of the fdset (which it also assigns if none it passed, i.e. for fdset creation). Does libvirt need the number of an individual fd? If libvirt prefers to assign fdset numbers itself, I'm not against it, it's just something that wasn't clear to me yet. > 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the > set that has access flags matching the qemu_open action flags. > qemu_open increments refcount for this fd. > 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named > "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5). libvirt > in-use flag turned on for fd. > 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first > fd from the set that has access flags matching the qemu_open action > flags. qemu_open increments refcount for this fd. > 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the > set. turns libvirt in-use flag off marking the fd ready to be closed > when qemu is done with it. If we decided to not return the individual fd numbers to libvirt, file descriptors would be uniquely identified by an fdset/flags pair here. > 5. qemu_close decrements refcount for fd, and closes fd when refcount is > zero and libvirt in use flag is off. The monitor could just hold another reference, then we save the additional flag. But that's a qemu implementation detail. > More functional details: > -If libvirt crashes it can call "query-fd /dev/fdset/1" to determine > which fds are open in the set. We also need a query-fdsets command that lists all fdsets that exist. If we add information about single fds to the return value of it, we probably don't need a separate query-fd that operates on a single fdset. > -If monitor connection closes, qemu will close fds that have a refcount > of zero. Do we also need a qemu in-use flag in case refcount is zero > and fd is still in use? In use by whom? If it's still in use in qemu (as in "in-use flag would be set") and we have a refcount of zero, then that's a bug. > -This support requires introduction of qemu_close function that will be > called everywhere in block layer that close is currently called. > > Notes: > -Patch series 1 will include support for all of the above. This will be > my initial focus. > -Patch series 2 will include command line support that enables > association of command line fd with a monitor set. This will be my > secondary focus, most likely after patch series 1 is applied. Thanks, this is a good and as far as I can tell complete summary of what we discussed. Kevin ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-03 15:59 ` Kevin Wolf @ 2012-07-03 16:25 ` Corey Bryant 2012-07-03 17:03 ` Eric Blake 0 siblings, 1 reply; 56+ messages in thread From: Corey Bryant @ 2012-07-03 16:25 UTC (permalink / raw) To: Kevin Wolf Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake On 07/03/2012 11:59 AM, Kevin Wolf wrote: > Am 03.07.2012 17:40, schrieb Corey Bryant: >> Thanks again for taking time to discuss this at today's QEMU community call. >> >> Here's the proposal we discussed at the call. Please let me know if I >> missed anything or if there are any issues with this design. >> >> Proposal Five: New monitor commands enable adding/removing an fd >> to/from a set. The set of fds should all refer to the same file, but >> may have different access flags (ie. O_RDWR, O_RDONLY). qemu_open can >> then dup the fd that has the matching access mode flags. >> PRO: Supports reopen >> PRO: All other commands work without impact by using qemu_open() >> PRO: No fd leakage (fds are associated with monitor connection and, if >> not in use, closed when monitor disconnects) >> PRO: Security-wise this is ok since libvirt can manage the set of fd's >> (ie. it can add/remove an O_RDWR fd to/from the set as needed). >> CON: Not atomic (e.g. doesn't add an fd with single drive_add command). >> USAGE: >> 1. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named >> "/dev/fdset/1" - command returns qemu fd (e.g fd=4) to caller. libvirt >> in-use flag turned on for fd. > > I thought qemu would rather return the number of the fdset (which it > also assigns if none it passed, i.e. for fdset creation). Does libvirt > need the number of an individual fd? > > If libvirt prefers to assign fdset numbers itself, I'm not against it, > it's just something that wasn't clear to me yet. > That's fine. QEMU can return the fdset number or a string (/dev/fdset/1) if none is specified. And an fdset will need to be specified if adding to an existing set. I think libvirt will need the fd returned by add-fd so that it can evaluate fds returned by query-fd. It's also useful for remove-fd. >> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the >> set that has access flags matching the qemu_open action flags. >> qemu_open increments refcount for this fd. >> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named >> "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5). libvirt >> in-use flag turned on for fd. >> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first >> fd from the set that has access flags matching the qemu_open action >> flags. qemu_open increments refcount for this fd. >> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the >> set. turns libvirt in-use flag off marking the fd ready to be closed >> when qemu is done with it. > > If we decided to not return the individual fd numbers to libvirt, file > descriptors would be uniquely identified by an fdset/flags pair here. > Are you saying we'd pass the fdset name and flags parameters on remove-fd to somehow identify the fds to remove? >> 5. qemu_close decrements refcount for fd, and closes fd when refcount is >> zero and libvirt in use flag is off. > > The monitor could just hold another reference, then we save the > additional flag. But that's a qemu implementation detail. > I'm not sure I understand what you mean. >> More functional details: >> -If libvirt crashes it can call "query-fd /dev/fdset/1" to determine >> which fds are open in the set. > > We also need a query-fdsets command that lists all fdsets that exist. If > we add information about single fds to the return value of it, we > probably don't need a separate query-fd that operates on a single fdset. > Yes, good point. And maybe we don't need 2 commands. query-fdsets could return all the sets and all the fds that are in those sets. >> -If monitor connection closes, qemu will close fds that have a refcount >> of zero. Do we also need a qemu in-use flag in case refcount is zero >> and fd is still in use? > > In use by whom? If it's still in use in qemu (as in "in-use flag would > be set") and we have a refcount of zero, then that's a bug. > In use by qemu. I don't think it's a bug. I think there are situations where refcount gets to zero but qemu is still using the fd. >> -This support requires introduction of qemu_close function that will be >> called everywhere in block layer that close is currently called. >> >> Notes: >> -Patch series 1 will include support for all of the above. This will be >> my initial focus. >> -Patch series 2 will include command line support that enables >> association of command line fd with a monitor set. This will be my >> secondary focus, most likely after patch series 1 is applied. > > Thanks, this is a good and as far as I can tell complete summary of what > we discussed. > > Kevin > Definitely! Thank you for all the input. -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-03 16:25 ` Corey Bryant @ 2012-07-03 17:03 ` Eric Blake 2012-07-03 17:46 ` Corey Bryant 2012-07-04 8:00 ` Kevin Wolf 0 siblings, 2 replies; 56+ messages in thread From: Eric Blake @ 2012-07-03 17:03 UTC (permalink / raw) To: Corey Bryant Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini [-- Attachment #1: Type: text/plain, Size: 6833 bytes --] On 07/03/2012 10:25 AM, Corey Bryant wrote: >> I thought qemu would rather return the number of the fdset (which it >> also assigns if none it passed, i.e. for fdset creation). Does libvirt >> need the number of an individual fd? >> >> If libvirt prefers to assign fdset numbers itself, I'm not against it, >> it's just something that wasn't clear to me yet. >> > > That's fine. QEMU can return the fdset number or a string > (/dev/fdset/1) if none is specified. And an fdset will need to be > specified if adding to an existing set. > > I think libvirt will need the fd returned by add-fd so that it can > evaluate fds returned by query-fd. It's also useful for remove-fd. Correct - since we will be adding a remove-fd, then that command needs to know both the fdset name and the individual fd within the set to be removed. > >>> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the >>> set that has access flags matching the qemu_open action flags. >>> qemu_open increments refcount for this fd. >>> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named >>> "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5). libvirt >>> in-use flag turned on for fd. >>> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first >>> fd from the set that has access flags matching the qemu_open action >>> flags. qemu_open increments refcount for this fd. >>> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the >>> set. turns libvirt in-use flag off marking the fd ready to be closed >>> when qemu is done with it. >> >> If we decided to not return the individual fd numbers to libvirt, file >> descriptors would be uniquely identified by an fdset/flags pair here. >> > > Are you saying we'd pass the fdset name and flags parameters on > remove-fd to somehow identify the fds to remove? Passing the flag parameters is not trivial, as that would mean the QMP code would have to define constants mapping to all of the O_* flags that qemu_open supports. It's easier to support closing by fd number. > >>> 5. qemu_close decrements refcount for fd, and closes fd when refcount is >>> zero and libvirt in use flag is off. >> >> The monitor could just hold another reference, then we save the >> additional flag. But that's a qemu implementation detail. >> > > I'm not sure I understand what you mean. pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset, with initial use count of 1 (the use is the monitor). qemu_open() increments the use count. A new qemu_close() wrapper would decrement the use count. And both calling 'remove-fd', or closing the QMP monitor of an fd that has not yet been passed through 'remove-fd', serves as a way to decrement the use count. You'd still have to track whether the monitor is using an fd (to avoid over-decrementing on QMP monitor close), but by having the monitor's use also tracked under the refcount, then refcount reaching 0 is sufficient to auto-close an fd. I think that also means that re-establishing the client QMP connection would increment For some examples: 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in use by monitor, as member of fdset1 2. client crashes, so all tracked fds are visited; fd=4 had not yet been passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is now 0 so qemu closes it 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in use by monitor, as member of fdset1 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, so qemu_open() increments the refcount to 2 3. client crashes, so all tracked fds are visited; fd=4 had not yet been passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4 open because it is still in use by the block device 4. client re-establishes QMP connection, and 'query-fds' lets client learn about fd=4 still being open as part of fdset1, but also informs client that fd is not in use by the monitor 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in use by monitor, as member of fdset1 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, so qemu_open() increments the refcount to 2 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no longer in use by the monitor, refcount decremented to 1 but still left open because it is in use by the block device 4. client crashes, so all tracked fds are visited; but fd=4 is already marked as not in use by the monitor, so its refcount is unchanged 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in use by monitor, as member of fdset1 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, but the command fails for some other reason, so the refcount is still 1 at the end of the command (although it may have been temporarily incremented then decremented during the command) 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or QMP connection is closed), so qemu marks fd=4 as no longer in use by the monitor, refcount is now decremented to 0 and fd=4 is closed I think that covers the idea; you need a bool in_use for tracking monitor state (the monitor is in use until either a remove-fd or a monitor connection closes), as well as a ref-count. >> We also need a query-fdsets command that lists all fdsets that exist. If >> we add information about single fds to the return value of it, we >> probably don't need a separate query-fd that operates on a single fdset. >> > > Yes, good point. And maybe we don't need 2 commands. query-fdsets > could return all the sets and all the fds that are in those sets. Yes, I think a single query command is good enough here, something like: { "execute":"query-fdsets" } => { "return" : { "sets": [ { "name": "fdset1", "fds": [ { "fd": 4, "monitor": true, "refcount": 1 } ] }, { "name": "fdset2", "fds": [ { "fd": 5, "monitor": false, "refcount": 1 }, { "fd": 6, "monitor": true, "refcount": 2 } ] } ] } } >> In use by whom? If it's still in use in qemu (as in "in-use flag would >> be set") and we have a refcount of zero, then that's a bug. >> > > In use by qemu. I don't think it's a bug. I think there are situations > where refcount gets to zero but qemu is still using the fd. I think the refcount being non-zero _is_ what defines an fd as being in use by qemu (including use by the monitor). Any place you have to close an fd before reopening it is dangerous; the safe way is always to open with the new permissions before closing the old permissions. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 620 bytes --] ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-03 17:03 ` Eric Blake @ 2012-07-03 17:46 ` Corey Bryant 2012-07-03 18:00 ` Eric Blake 2012-07-04 8:00 ` Kevin Wolf 1 sibling, 1 reply; 56+ messages in thread From: Corey Bryant @ 2012-07-03 17:46 UTC (permalink / raw) To: Eric Blake Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini On 07/03/2012 01:03 PM, Eric Blake wrote: > On 07/03/2012 10:25 AM, Corey Bryant wrote: > >>> I thought qemu would rather return the number of the fdset (which it >>> also assigns if none it passed, i.e. for fdset creation). Does libvirt >>> need the number of an individual fd? >>> >>> If libvirt prefers to assign fdset numbers itself, I'm not against it, >>> it's just something that wasn't clear to me yet. >>> >> >> That's fine. QEMU can return the fdset number or a string >> (/dev/fdset/1) if none is specified. And an fdset will need to be >> specified if adding to an existing set. >> >> I think libvirt will need the fd returned by add-fd so that it can >> evaluate fds returned by query-fd. It's also useful for remove-fd. > > Correct - since we will be adding a remove-fd, then that command needs > to know both the fdset name and the individual fd within the set to be > removed. > Ok >> >>>> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the >>>> set that has access flags matching the qemu_open action flags. >>>> qemu_open increments refcount for this fd. >>>> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named >>>> "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5). libvirt >>>> in-use flag turned on for fd. >>>> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first >>>> fd from the set that has access flags matching the qemu_open action >>>> flags. qemu_open increments refcount for this fd. >>>> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the >>>> set. turns libvirt in-use flag off marking the fd ready to be closed >>>> when qemu is done with it. >>> >>> If we decided to not return the individual fd numbers to libvirt, file >>> descriptors would be uniquely identified by an fdset/flags pair here. >>> >> >> Are you saying we'd pass the fdset name and flags parameters on >> remove-fd to somehow identify the fds to remove? > > Passing the flag parameters is not trivial, as that would mean the QMP > code would have to define constants mapping to all of the O_* flags that > qemu_open supports. It's easier to support closing by fd number. > I understand what you were saying now, although I guess it's not applicable at this point. I'll plan on returning the fd from add-fd and passing the fd on the remove-fd command. >> >>>> 5. qemu_close decrements refcount for fd, and closes fd when refcount is >>>> zero and libvirt in use flag is off. >>> >>> The monitor could just hold another reference, then we save the >>> additional flag. But that's a qemu implementation detail. >>> >> >> I'm not sure I understand what you mean. > > pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset, > with initial use count of 1 (the use is the monitor). qemu_open() > increments the use count. A new qemu_close() wrapper would decrement > the use count. And both calling 'remove-fd', or closing the QMP monitor > of an fd that has not yet been passed through 'remove-fd', serves as a > way to decrement the use count. You'd still have to track whether the > monitor is using an fd (to avoid over-decrementing on QMP monitor > close), but by having the monitor's use also tracked under the refcount, > then refcount reaching 0 is sufficient to auto-close an fd. I think > that also means that re-establishing the client QMP connection would > increment For some examples: Yes, I think adding a +1 to the refcount for the monitor makes sense. I'm a bit unsure how to increment the refcount when a monitor reconnects though. Maybe it is as simple as adding a +1 to each fd's refcount when the next QMP monitor connects. > > 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in > use by monitor, as member of fdset1 > 2. client crashes, so all tracked fds are visited; fd=4 had not yet been > passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is > now 0 so qemu closes it Just a note that the fd above also hasn't yet been referenced by a drive-add/device-add, so it will be closed in step 2. > > 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in > use by monitor, as member of fdset1 > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > so qemu_open() increments the refcount to 2 > 3. client crashes, so all tracked fds are visited; fd=4 had not yet been > passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4 > open because it is still in use by the block device > 4. client re-establishes QMP connection, and 'query-fds' lets client > learn about fd=4 still being open as part of fdset1, but also informs > client that fd is not in use by the monitor And in step 4 the QMP connection will increment the refcount +1 for all fds that persisted through the QMP disconnect. (?) > > 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in > use by monitor, as member of fdset1 > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > so qemu_open() increments the refcount to 2 > 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no > longer in use by the monitor, refcount decremented to 1 but still left > open because it is in use by the block device > 4. client crashes, so all tracked fds are visited; but fd=4 is already > marked as not in use by the monitor, so its refcount is unchanged > > 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in > use by monitor, as member of fdset1 > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > but the command fails for some other reason, so the refcount is still 1 > at the end of the command (although it may have been temporarily > incremented then decremented during the command) > 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or > QMP connection is closed), so qemu marks fd=4 as no longer in use by the > monitor, refcount is now decremented to 0 and fd=4 is closed > > I think that covers the idea; you need a bool in_use for tracking > monitor state (the monitor is in use until either a remove-fd or a > monitor connection closes), as well as a ref-count. > Yes, it all makes sense to me. Thanks for the scenarios. >>> We also need a query-fdsets command that lists all fdsets that exist. If >>> we add information about single fds to the return value of it, we >>> probably don't need a separate query-fd that operates on a single fdset. >>> >> >> Yes, good point. And maybe we don't need 2 commands. query-fdsets >> could return all the sets and all the fds that are in those sets. > > Yes, I think a single query command is good enough here, something like: > > { "execute":"query-fdsets" } => > { "return" : { "sets": [ > { "name": "fdset1", > "fds": [ { "fd": 4, "monitor": true, "refcount": 1 } ] }, > { "name": "fdset2", > "fds": [ { "fd": 5, "monitor": false, "refcount": 1 }, > { "fd": 6, "monitor": true, "refcount": 2 } ] } ] } } > > Ok, thanks! >>> In use by whom? If it's still in use in qemu (as in "in-use flag would >>> be set") and we have a refcount of zero, then that's a bug. >>> >> >> In use by qemu. I don't think it's a bug. I think there are situations >> where refcount gets to zero but qemu is still using the fd. > > I think the refcount being non-zero _is_ what defines an fd as being in > use by qemu (including use by the monitor). Any place you have to close > an fd before reopening it is dangerous; the safe way is always to open > with the new permissions before closing the old permissions. > Maybe Kevin wants to weigh in on this. Perhaps it's an issue that can be separated from my patch series. -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-03 17:46 ` Corey Bryant @ 2012-07-03 18:00 ` Eric Blake 2012-07-03 18:21 ` Corey Bryant 0 siblings, 1 reply; 56+ messages in thread From: Eric Blake @ 2012-07-03 18:00 UTC (permalink / raw) To: Corey Bryant Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini [-- Attachment #1: Type: text/plain, Size: 2393 bytes --] On 07/03/2012 11:46 AM, Corey Bryant wrote: > > Yes, I think adding a +1 to the refcount for the monitor makes sense. > > I'm a bit unsure how to increment the refcount when a monitor reconnects > though. Maybe it is as simple as adding a +1 to each fd's refcount when > the next QMP monitor connects. Or maybe delay a +1 until after a 'query-fds' - it is not until the monitor has reconnected and learned what fds it should be aware of that incrementing the refcount again makes sense. But that would mean making 'query-fds' track whether this is the first call since the monitor reconnected, as it shouldn't normally increase refcounts. The other alternative is that the monitor never re-increments a refcount. Once a monitor disconnects, that fd is lost to the monitor, and a reconnected monitor must pass in a new fd to be re-associated with the fdset. In other words, the monitor's use of an fd is a one-way operation, starting life in use but ending at the first disconnect or remove-fd. >> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in >> use by monitor, as member of fdset1 >> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >> so qemu_open() increments the refcount to 2 >> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been >> passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4 >> open because it is still in use by the block device >> 4. client re-establishes QMP connection, and 'query-fds' lets client >> learn about fd=4 still being open as part of fdset1, but also informs >> client that fd is not in use by the monitor > > And in step 4 the QMP connection will increment the refcount +1 for all > fds that persisted through the QMP disconnect. (?) I'm not sure we need the refcount increment on reconnect. 'query-fds' should provide enough information for the new monitor to know what fdsets are still in use by qemu, even though they are no longer available to 'remove-fd' from the monitor, and if the monitor is worried about keeping the fd set alive it can call 'add-fd' to again associate a new fd with the set. The lifetime of a set is thus as long as any of its associated fds have a non-zero refcount. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 620 bytes --] ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-03 18:00 ` Eric Blake @ 2012-07-03 18:21 ` Corey Bryant 2012-07-04 8:09 ` Kevin Wolf 0 siblings, 1 reply; 56+ messages in thread From: Corey Bryant @ 2012-07-03 18:21 UTC (permalink / raw) To: Eric Blake Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini On 07/03/2012 02:00 PM, Eric Blake wrote: > On 07/03/2012 11:46 AM, Corey Bryant wrote: > >> >> Yes, I think adding a +1 to the refcount for the monitor makes sense. >> >> I'm a bit unsure how to increment the refcount when a monitor reconnects >> though. Maybe it is as simple as adding a +1 to each fd's refcount when >> the next QMP monitor connects. > > Or maybe delay a +1 until after a 'query-fds' - it is not until the > monitor has reconnected and learned what fds it should be aware of that > incrementing the refcount again makes sense. But that would mean making > 'query-fds' track whether this is the first call since the monitor > reconnected, as it shouldn't normally increase refcounts. This doesn't sound ideal. > > The other alternative is that the monitor never re-increments a > refcount. Once a monitor disconnects, that fd is lost to the monitor, > and a reconnected monitor must pass in a new fd to be re-associated with > the fdset. In other words, the monitor's use of an fd is a one-way > operation, starting life in use but ending at the first disconnect or > remove-fd. > > I would vote for this 2nd alternative. As long as we're not introducing an fd leak. And I don't think we are if we decrement the refcount on remove-fd or on QMP disconnect. >>> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in >>> use by monitor, as member of fdset1 >>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >>> so qemu_open() increments the refcount to 2 >>> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been >>> passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4 >>> open because it is still in use by the block device >>> 4. client re-establishes QMP connection, and 'query-fds' lets client >>> learn about fd=4 still being open as part of fdset1, but also informs >>> client that fd is not in use by the monitor >> >> And in step 4 the QMP connection will increment the refcount +1 for all >> fds that persisted through the QMP disconnect. (?) > > I'm not sure we need the refcount increment on reconnect. 'query-fds' > should provide enough information for the new monitor to know what > fdsets are still in use by qemu, even though they are no longer > available to 'remove-fd' from the monitor, and if the monitor is worried > about keeping the fd set alive it can call 'add-fd' to again associate a > new fd with the set. The lifetime of a set is thus as long as any of > its associated fds have a non-zero refcount. > This sounds good to me. And qemu_open will need to make sure the monitor in_use flag is true before dup'ing an fd. -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-03 18:21 ` Corey Bryant @ 2012-07-04 8:09 ` Kevin Wolf 2012-07-05 15:06 ` Corey Bryant 0 siblings, 1 reply; 56+ messages in thread From: Kevin Wolf @ 2012-07-04 8:09 UTC (permalink / raw) To: Corey Bryant Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake Am 03.07.2012 20:21, schrieb Corey Bryant: > On 07/03/2012 02:00 PM, Eric Blake wrote: >> On 07/03/2012 11:46 AM, Corey Bryant wrote: >> >>> >>> Yes, I think adding a +1 to the refcount for the monitor makes sense. >>> >>> I'm a bit unsure how to increment the refcount when a monitor reconnects >>> though. Maybe it is as simple as adding a +1 to each fd's refcount when >>> the next QMP monitor connects. >> >> Or maybe delay a +1 until after a 'query-fds' - it is not until the >> monitor has reconnected and learned what fds it should be aware of that >> incrementing the refcount again makes sense. But that would mean making >> 'query-fds' track whether this is the first call since the monitor >> reconnected, as it shouldn't normally increase refcounts. > > This doesn't sound ideal. Yes, it's less than ideal. >> The other alternative is that the monitor never re-increments a >> refcount. Once a monitor disconnects, that fd is lost to the monitor, >> and a reconnected monitor must pass in a new fd to be re-associated with >> the fdset. In other words, the monitor's use of an fd is a one-way >> operation, starting life in use but ending at the first disconnect or >> remove-fd. > > I would vote for this 2nd alternative. As long as we're not introducing > an fd leak. And I don't think we are if we decrement the refcount on > remove-fd or on QMP disconnect. In fact, I believe this one is even worse. I can already see hacks like adding a dummy FD with invalid flags and removing it again just to regain control over the fdset... You earlier suggestion made a lot of sense to me: Whenever a new QMP monitor is connected, increase the refcount. That is, as long as any monitor is there, don't drop any fdsets unless explicitly requested via QMP. Kevin ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-04 8:09 ` Kevin Wolf @ 2012-07-05 15:06 ` Corey Bryant 2012-07-09 14:05 ` Luiz Capitulino 0 siblings, 1 reply; 56+ messages in thread From: Corey Bryant @ 2012-07-05 15:06 UTC (permalink / raw) To: Kevin Wolf Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake On 07/04/2012 04:09 AM, Kevin Wolf wrote: > Am 03.07.2012 20:21, schrieb Corey Bryant: >> On 07/03/2012 02:00 PM, Eric Blake wrote: >>> On 07/03/2012 11:46 AM, Corey Bryant wrote: >>> >>>> >>>> Yes, I think adding a +1 to the refcount for the monitor makes sense. >>>> >>>> I'm a bit unsure how to increment the refcount when a monitor reconnects >>>> though. Maybe it is as simple as adding a +1 to each fd's refcount when >>>> the next QMP monitor connects. >>> >>> Or maybe delay a +1 until after a 'query-fds' - it is not until the >>> monitor has reconnected and learned what fds it should be aware of that >>> incrementing the refcount again makes sense. But that would mean making >>> 'query-fds' track whether this is the first call since the monitor >>> reconnected, as it shouldn't normally increase refcounts. >> >> This doesn't sound ideal. > > Yes, it's less than ideal. > >>> The other alternative is that the monitor never re-increments a >>> refcount. Once a monitor disconnects, that fd is lost to the monitor, >>> and a reconnected monitor must pass in a new fd to be re-associated with >>> the fdset. In other words, the monitor's use of an fd is a one-way >>> operation, starting life in use but ending at the first disconnect or >>> remove-fd. >> >> I would vote for this 2nd alternative. As long as we're not introducing >> an fd leak. And I don't think we are if we decrement the refcount on >> remove-fd or on QMP disconnect. > > In fact, I believe this one is even worse. I can already see hacks like > adding a dummy FD with invalid flags and removing it again just to > regain control over the fdset... > > You earlier suggestion made a lot of sense to me: Whenever a new QMP > monitor is connected, increase the refcount. That is, as long as any > monitor is there, don't drop any fdsets unless explicitly requested via QMP. Ok. So refcount would be incremented (for the fd or fdset, whatever we decide on) when QMP reconnects. I'm assuming we wouldn't wait until after a query-fds call. I'll go with this approach until someone objects. -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-05 15:06 ` Corey Bryant @ 2012-07-09 14:05 ` Luiz Capitulino 2012-07-09 15:05 ` Corey Bryant 0 siblings, 1 reply; 56+ messages in thread From: Luiz Capitulino @ 2012-07-09 14:05 UTC (permalink / raw) To: Corey Bryant Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, pbonzini, Eric Blake On Thu, 05 Jul 2012 11:06:56 -0400 Corey Bryant <coreyb@linux.vnet.ibm.com> wrote: > > > On 07/04/2012 04:09 AM, Kevin Wolf wrote: > > Am 03.07.2012 20:21, schrieb Corey Bryant: > >> On 07/03/2012 02:00 PM, Eric Blake wrote: > >>> On 07/03/2012 11:46 AM, Corey Bryant wrote: > >>> > >>>> > >>>> Yes, I think adding a +1 to the refcount for the monitor makes sense. > >>>> > >>>> I'm a bit unsure how to increment the refcount when a monitor reconnects > >>>> though. Maybe it is as simple as adding a +1 to each fd's refcount when > >>>> the next QMP monitor connects. > >>> > >>> Or maybe delay a +1 until after a 'query-fds' - it is not until the > >>> monitor has reconnected and learned what fds it should be aware of that > >>> incrementing the refcount again makes sense. But that would mean making > >>> 'query-fds' track whether this is the first call since the monitor > >>> reconnected, as it shouldn't normally increase refcounts. > >> > >> This doesn't sound ideal. > > > > Yes, it's less than ideal. > > > >>> The other alternative is that the monitor never re-increments a > >>> refcount. Once a monitor disconnects, that fd is lost to the monitor, > >>> and a reconnected monitor must pass in a new fd to be re-associated with > >>> the fdset. In other words, the monitor's use of an fd is a one-way > >>> operation, starting life in use but ending at the first disconnect or > >>> remove-fd. > >> > >> I would vote for this 2nd alternative. As long as we're not introducing > >> an fd leak. And I don't think we are if we decrement the refcount on > >> remove-fd or on QMP disconnect. > > > > In fact, I believe this one is even worse. I can already see hacks like > > adding a dummy FD with invalid flags and removing it again just to > > regain control over the fdset... > > > > You earlier suggestion made a lot of sense to me: Whenever a new QMP > > monitor is connected, increase the refcount. That is, as long as any > > monitor is there, don't drop any fdsets unless explicitly requested via QMP. > > Ok. So refcount would be incremented (for the fd or fdset, whatever we > decide on) when QMP reconnects. I'm assuming we wouldn't wait until > after a query-fds call. I'm not sure this is a good idea because we will leak fds if the client forgets about the fds when re-connecting (ie. it was restarted) or if a different client connects to QMP. If we really want to do that, I think that the right way of doing this is to add a command for clients to re-again ownership of the fds on reconnection. But to be honest, I don't fully understand why this is needed. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 14:05 ` Luiz Capitulino @ 2012-07-09 15:05 ` Corey Bryant 2012-07-09 15:46 ` Kevin Wolf 2012-07-09 18:20 ` Corey Bryant 0 siblings, 2 replies; 56+ messages in thread From: Corey Bryant @ 2012-07-09 15:05 UTC (permalink / raw) To: Luiz Capitulino Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, pbonzini, Eric Blake On 07/09/2012 10:05 AM, Luiz Capitulino wrote: > On Thu, 05 Jul 2012 11:06:56 -0400 > Corey Bryant <coreyb@linux.vnet.ibm.com> wrote: > >> >> >> On 07/04/2012 04:09 AM, Kevin Wolf wrote: >>> Am 03.07.2012 20:21, schrieb Corey Bryant: >>>> On 07/03/2012 02:00 PM, Eric Blake wrote: >>>>> On 07/03/2012 11:46 AM, Corey Bryant wrote: >>>>> >>>>>> >>>>>> Yes, I think adding a +1 to the refcount for the monitor makes sense. >>>>>> >>>>>> I'm a bit unsure how to increment the refcount when a monitor reconnects >>>>>> though. Maybe it is as simple as adding a +1 to each fd's refcount when >>>>>> the next QMP monitor connects. >>>>> >>>>> Or maybe delay a +1 until after a 'query-fds' - it is not until the >>>>> monitor has reconnected and learned what fds it should be aware of that >>>>> incrementing the refcount again makes sense. But that would mean making >>>>> 'query-fds' track whether this is the first call since the monitor >>>>> reconnected, as it shouldn't normally increase refcounts. >>>> >>>> This doesn't sound ideal. >>> >>> Yes, it's less than ideal. >>> >>>>> The other alternative is that the monitor never re-increments a >>>>> refcount. Once a monitor disconnects, that fd is lost to the monitor, >>>>> and a reconnected monitor must pass in a new fd to be re-associated with >>>>> the fdset. In other words, the monitor's use of an fd is a one-way >>>>> operation, starting life in use but ending at the first disconnect or >>>>> remove-fd. >>>> >>>> I would vote for this 2nd alternative. As long as we're not introducing >>>> an fd leak. And I don't think we are if we decrement the refcount on >>>> remove-fd or on QMP disconnect. >>> >>> In fact, I believe this one is even worse. I can already see hacks like >>> adding a dummy FD with invalid flags and removing it again just to >>> regain control over the fdset... >>> >>> You earlier suggestion made a lot of sense to me: Whenever a new QMP >>> monitor is connected, increase the refcount. That is, as long as any >>> monitor is there, don't drop any fdsets unless explicitly requested via QMP. >> >> Ok. So refcount would be incremented (for the fd or fdset, whatever we >> decide on) when QMP reconnects. I'm assuming we wouldn't wait until >> after a query-fds call. > > I'm not sure this is a good idea because we will leak fds if the client forgets > about the fds when re-connecting (ie. it was restarted) or if a different > client connects to QMP. > > If we really want to do that, I think that the right way of doing this is to > add a command for clients to re-again ownership of the fds on reconnection. > > But to be honest, I don't fully understand why this is needed. > I'm not sure this is an issue with current design. I know things have changed a bit as the email threads evolved, so I'll paste the current design that I am working from. Please let me know if you still see any issues. FD passing: ----------- New monitor commands enable adding/removing an fd to/from a set. New monitor command query-fdsets enables querying of current monitor fdsets. The set of fds should all refer to the same file, with each fd having different access flags (ie. O_RDWR, O_RDONLY). qemu_open can then dup the fd that has the matching access mode flags. Design points: -------------- 1. add-fd -> fd is passed via SCM rights and qemu adds fd to first unused fdset (e.g. /dev/fdset/1) -> add-fd monitor function initializes the monitor inuse flag for the fdset to true -> add-fd monitor function initializes the remove flag for the fd to false -> add-fd returns fdset number and received fd number (e.g fd=3) to caller 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd in fdset1 that has access flags matching the qemu_open action flags and has remove flag set to false -> qemu_open increments refcount for the fdset -> Need to make sure that if a command like 'device-add' fails that refcount is not incremented 3. add-fd fdset=1 -> fd is passed via SCM rights -> add-fd monitor function adds the received fd to the specified fdset (or fails if fdset doesn't exist) -> add-fd monitor function initializes the remove flag for the fd to false -> add-fd returns fdset number and received fd number (e.g fd=4) to caller 4. block-commit -> qemu_open performs "reopen" by using the first fd from the fdset that has access flags matching the qemu_open action flags and has remove flag set to false -> qemu_open increments refcount for the fdset -> Need to make sure that if a command like 'block-commit' fails that refcount is not incremented 5. remove-fd fdset=1 fd=4 -> remove-fd monitor function fails if fdset doesn't exist -> remove-fd monitor function turns on remove flag for fd=4 6. qemu_close (need to replace all close calls in block layer with qemu_close) -> qemu_close decrements refcount for fdset -> qemu_close closes all fds that have (refcount == 0 && (!inuse || remove)) -> qemu_close frees the fdset if no fds remain in it 7. disconnecting the QMP monitor -> monitor disconnect visits all fdsets on monitor and turns off monitor in-use flag for fdset 8. connecting the QMP monitor -> monitor connect visits all fdsets on monitor and turns on monitor in-use flag for fdset 9. query-fdsets -> returns all fdsets and fds that don't have remove flag on QMP command examples -------------------- -> { "execute": "add-fd", "arguments": { "fdset": 1 } } <- { "return": { "fdset": 1, "fd": 3 } } -> { "execute": "remove-fd", "arguments": { "fdset": 1, "fd": 3 } } <- { "return": {} } -> { "execute":"query-fdsets" } => <- { "return" : { "fdsets": [ { "name": "fdset1", "fds": [ { "fd": 4, "removed": false } ], "refcount": 1, "monitor": true }, { "name": "fdset2", "fds": [ { "fd": 5, "removed": false }, { "fd": 6, "removed": true } ], "refcount": 1, "monitor": true } } -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 15:05 ` Corey Bryant @ 2012-07-09 15:46 ` Kevin Wolf 2012-07-09 16:18 ` Luiz Capitulino 2012-07-09 17:35 ` Corey Bryant 2012-07-09 18:20 ` Corey Bryant 1 sibling, 2 replies; 56+ messages in thread From: Kevin Wolf @ 2012-07-09 15:46 UTC (permalink / raw) To: Corey Bryant Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake Am 09.07.2012 17:05, schrieb Corey Bryant: > I'm not sure this is an issue with current design. I know things have > changed a bit as the email threads evolved, so I'll paste the current > design that I am working from. Please let me know if you still see any > issues. > > FD passing: > ----------- > New monitor commands enable adding/removing an fd to/from a set. New > monitor command query-fdsets enables querying of current monitor fdsets. > The set of fds should all refer to the same file, with each fd having > different access flags (ie. O_RDWR, O_RDONLY). qemu_open can then dup > the fd that has the matching access mode flags. > > Design points: > -------------- > 1. add-fd > -> fd is passed via SCM rights and qemu adds fd to first unused fdset > (e.g. /dev/fdset/1) > -> add-fd monitor function initializes the monitor inuse flag for the > fdset to true > -> add-fd monitor function initializes the remove flag for the fd to false > -> add-fd returns fdset number and received fd number (e.g fd=3) to caller > > 2. drive_add file=/dev/fdset/1 > -> qemu_open uses the first fd in fdset1 that has access flags matching > the qemu_open action flags and has remove flag set to false > -> qemu_open increments refcount for the fdset > -> Need to make sure that if a command like 'device-add' fails that > refcount is not incremented > > 3. add-fd fdset=1 > -> fd is passed via SCM rights > -> add-fd monitor function adds the received fd to the specified fdset > (or fails if fdset doesn't exist) > -> add-fd monitor function initializes the remove flag for the fd to false > -> add-fd returns fdset number and received fd number (e.g fd=4) to caller > > 4. block-commit > -> qemu_open performs "reopen" by using the first fd from the fdset that > has access flags matching the qemu_open action flags and has remove flag > set to false > -> qemu_open increments refcount for the fdset > -> Need to make sure that if a command like 'block-commit' fails that > refcount is not incremented > > 5. remove-fd fdset=1 fd=4 > -> remove-fd monitor function fails if fdset doesn't exist > -> remove-fd monitor function turns on remove flag for fd=4 What was again the reason why we keep removed fds in the fdset at all? The removed flag would make sense for a fdset after a hypothetical close-fdset call because the fdset needs to be kept around until the last user closes it, but I think removed fds can be deleted immediately. I think I might have confused remove-fd and close-fdset in earlier emails in this thread, so I hope this isn't inconsistent with what I said before. > 6. qemu_close (need to replace all close calls in block layer with > qemu_close) > -> qemu_close decrements refcount for fdset > -> qemu_close closes all fds that have (refcount == 0 && (!inuse || remove)) > -> qemu_close frees the fdset if no fds remain in it > > 7. disconnecting the QMP monitor > -> monitor disconnect visits all fdsets on monitor and turns off monitor > in-use flag for fdset And close all fds with refcount == 0. > 8. connecting the QMP monitor > -> monitor connect visits all fdsets on monitor and turns on monitor > in-use flag for fdset > > 9. query-fdsets > -> returns all fdsets and fds that don't have remove flag on Kevin ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 15:46 ` Kevin Wolf @ 2012-07-09 16:18 ` Luiz Capitulino 2012-07-09 17:59 ` Corey Bryant 2012-07-09 17:35 ` Corey Bryant 1 sibling, 1 reply; 56+ messages in thread From: Luiz Capitulino @ 2012-07-09 16:18 UTC (permalink / raw) To: Kevin Wolf Cc: aliguori, stefanha, libvir-list, Corey Bryant, qemu-devel, pbonzini, Eric Blake On Mon, 09 Jul 2012 17:46:00 +0200 Kevin Wolf <kwolf@redhat.com> wrote: > Am 09.07.2012 17:05, schrieb Corey Bryant: > > I'm not sure this is an issue with current design. I know things have > > changed a bit as the email threads evolved, so I'll paste the current > > design that I am working from. Please let me know if you still see any > > issues. > > > > FD passing: > > ----------- > > New monitor commands enable adding/removing an fd to/from a set. New > > monitor command query-fdsets enables querying of current monitor fdsets. > > The set of fds should all refer to the same file, with each fd having > > different access flags (ie. O_RDWR, O_RDONLY). qemu_open can then dup > > the fd that has the matching access mode flags. > > > > Design points: > > -------------- > > 1. add-fd > > -> fd is passed via SCM rights and qemu adds fd to first unused fdset > > (e.g. /dev/fdset/1) The fdset should be specified by the client, like: { "execute": "add-fd-set", "arguments": { "set-name": "/dev/fdset/1" } } > > -> add-fd monitor function initializes the monitor inuse flag for the > > fdset to true Why do we need the inuse flag? > > -> add-fd monitor function initializes the remove flag for the fd to false > > -> add-fd returns fdset number and received fd number (e.g fd=3) to caller > > > > 2. drive_add file=/dev/fdset/1 > > -> qemu_open uses the first fd in fdset1 that has access flags matching > > the qemu_open action flags and has remove flag set to false > > -> qemu_open increments refcount for the fdset > > -> Need to make sure that if a command like 'device-add' fails that > > refcount is not incremented > > > > 3. add-fd fdset=1 > > -> fd is passed via SCM rights > > -> add-fd monitor function adds the received fd to the specified fdset > > (or fails if fdset doesn't exist) > > -> add-fd monitor function initializes the remove flag for the fd to false > > -> add-fd returns fdset number and received fd number (e.g fd=4) to caller > > > > 4. block-commit > > -> qemu_open performs "reopen" by using the first fd from the fdset that > > has access flags matching the qemu_open action flags and has remove flag > > set to false > > -> qemu_open increments refcount for the fdset > > -> Need to make sure that if a command like 'block-commit' fails that > > refcount is not incremented > > > > 5. remove-fd fdset=1 fd=4 > > -> remove-fd monitor function fails if fdset doesn't exist > > -> remove-fd monitor function turns on remove flag for fd=4 > > What was again the reason why we keep removed fds in the fdset at all? > > The removed flag would make sense for a fdset after a hypothetical > close-fdset call because the fdset needs to be kept around until the > last user closes it, but I think removed fds can be deleted immediately. Agreed. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 16:18 ` Luiz Capitulino @ 2012-07-09 17:59 ` Corey Bryant 0 siblings, 0 replies; 56+ messages in thread From: Corey Bryant @ 2012-07-09 17:59 UTC (permalink / raw) To: Luiz Capitulino Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, pbonzini, Eric Blake On 07/09/2012 12:18 PM, Luiz Capitulino wrote: > On Mon, 09 Jul 2012 17:46:00 +0200 > Kevin Wolf <kwolf@redhat.com> wrote: > >> Am 09.07.2012 17:05, schrieb Corey Bryant: >>> I'm not sure this is an issue with current design. I know things have >>> changed a bit as the email threads evolved, so I'll paste the current >>> design that I am working from. Please let me know if you still see any >>> issues. >>> >>> FD passing: >>> ----------- >>> New monitor commands enable adding/removing an fd to/from a set. New >>> monitor command query-fdsets enables querying of current monitor fdsets. >>> The set of fds should all refer to the same file, with each fd having >>> different access flags (ie. O_RDWR, O_RDONLY). qemu_open can then dup >>> the fd that has the matching access mode flags. >>> >>> Design points: >>> -------------- >>> 1. add-fd >>> -> fd is passed via SCM rights and qemu adds fd to first unused fdset >>> (e.g. /dev/fdset/1) > > The fdset should be specified by the client, like: > > { "execute": "add-fd-set", "arguments": { "set-name": "/dev/fdset/1" } } > We could make the fdset name configurable. Then we wouldn't force clients into using the file=/dev/fdset/1 alias on commands like device_add. The risk with this is that clients need to be careful and use a unique name that doesn't conflict with any existing file names. The way it is currently, if add-fd is not given an fdset name, it will generate a new fdset and add the fd to it. add-fd always returns the fdset (int) and received fd (int) on success. (e.g. fdset=1 corresponds to file=/dev/fdset/1). Then the 2nd time you want to add an fd to this set, you have to specify fdset=1 on add-fd. I'll do whatever you all prefer. I think there are advantages in both approaches, however I'm leaning toward the current approach and forcing use of /dev/fdset/1 to keep all fd set names consistent. >>> -> add-fd monitor function initializes the monitor inuse flag for the >>> fdset to true > > Why do we need the inuse flag? > This helps to prevent fd leakage. Let's say the client adds fd=3 to /dev/fdset/1 and then the QMP monitor disconnects. Since the following evaluates to true when the monitor disconnects, the fd will be closed: (refcount == 0 && (!inuse || remove)) Note: refcount is incremented for the fdset on qemu_open and decremented on qemu_close, and no commands caused it to be incremented from zero in this example. >>> -> add-fd monitor function initializes the remove flag for the fd to false >>> -> add-fd returns fdset number and received fd number (e.g fd=3) to caller >>> >>> 2. drive_add file=/dev/fdset/1 >>> -> qemu_open uses the first fd in fdset1 that has access flags matching >>> the qemu_open action flags and has remove flag set to false >>> -> qemu_open increments refcount for the fdset >>> -> Need to make sure that if a command like 'device-add' fails that >>> refcount is not incremented >>> >>> 3. add-fd fdset=1 >>> -> fd is passed via SCM rights >>> -> add-fd monitor function adds the received fd to the specified fdset >>> (or fails if fdset doesn't exist) >>> -> add-fd monitor function initializes the remove flag for the fd to false >>> -> add-fd returns fdset number and received fd number (e.g fd=4) to caller >>> >>> 4. block-commit >>> -> qemu_open performs "reopen" by using the first fd from the fdset that >>> has access flags matching the qemu_open action flags and has remove flag >>> set to false >>> -> qemu_open increments refcount for the fdset >>> -> Need to make sure that if a command like 'block-commit' fails that >>> refcount is not incremented >>> >>> 5. remove-fd fdset=1 fd=4 >>> -> remove-fd monitor function fails if fdset doesn't exist >>> -> remove-fd monitor function turns on remove flag for fd=4 >> >> What was again the reason why we keep removed fds in the fdset at all? >> >> The removed flag would make sense for a fdset after a hypothetical >> close-fdset call because the fdset needs to be kept around until the >> last user closes it, but I think removed fds can be deleted immediately. > > Agreed. > Please take a look at my recent reply to Kevin about this and let me know if it clears things up. -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 15:46 ` Kevin Wolf 2012-07-09 16:18 ` Luiz Capitulino @ 2012-07-09 17:35 ` Corey Bryant 2012-07-09 17:48 ` Luiz Capitulino 2012-07-10 7:53 ` Kevin Wolf 1 sibling, 2 replies; 56+ messages in thread From: Corey Bryant @ 2012-07-09 17:35 UTC (permalink / raw) To: Kevin Wolf Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake On 07/09/2012 11:46 AM, Kevin Wolf wrote: > Am 09.07.2012 17:05, schrieb Corey Bryant: >> I'm not sure this is an issue with current design. I know things have >> changed a bit as the email threads evolved, so I'll paste the current >> design that I am working from. Please let me know if you still see any >> issues. >> >> FD passing: >> ----------- >> New monitor commands enable adding/removing an fd to/from a set. New >> monitor command query-fdsets enables querying of current monitor fdsets. >> The set of fds should all refer to the same file, with each fd having >> different access flags (ie. O_RDWR, O_RDONLY). qemu_open can then dup >> the fd that has the matching access mode flags. >> >> Design points: >> -------------- >> 1. add-fd >> -> fd is passed via SCM rights and qemu adds fd to first unused fdset >> (e.g. /dev/fdset/1) >> -> add-fd monitor function initializes the monitor inuse flag for the >> fdset to true >> -> add-fd monitor function initializes the remove flag for the fd to false >> -> add-fd returns fdset number and received fd number (e.g fd=3) to caller >> >> 2. drive_add file=/dev/fdset/1 >> -> qemu_open uses the first fd in fdset1 that has access flags matching >> the qemu_open action flags and has remove flag set to false >> -> qemu_open increments refcount for the fdset >> -> Need to make sure that if a command like 'device-add' fails that >> refcount is not incremented >> >> 3. add-fd fdset=1 >> -> fd is passed via SCM rights >> -> add-fd monitor function adds the received fd to the specified fdset >> (or fails if fdset doesn't exist) >> -> add-fd monitor function initializes the remove flag for the fd to false >> -> add-fd returns fdset number and received fd number (e.g fd=4) to caller >> >> 4. block-commit >> -> qemu_open performs "reopen" by using the first fd from the fdset that >> has access flags matching the qemu_open action flags and has remove flag >> set to false >> -> qemu_open increments refcount for the fdset >> -> Need to make sure that if a command like 'block-commit' fails that >> refcount is not incremented >> >> 5. remove-fd fdset=1 fd=4 >> -> remove-fd monitor function fails if fdset doesn't exist >> -> remove-fd monitor function turns on remove flag for fd=4 > > What was again the reason why we keep removed fds in the fdset at all? Because if refcount is > 0 for the fd set, then the fd could be in use by a block device. So we keep it around until refcount is decremented to zero, at which point it is safe to close. > > The removed flag would make sense for a fdset after a hypothetical > close-fdset call because the fdset needs to be kept around until the > last user closes it, but I think removed fds can be deleted immediately. fds in an fd set really need to be kept around until zero block devices reference them. At that point, if '(refcount == 0 && (!inuse || remove))' is true, then we'll officially close the fd. > > I think I might have confused remove-fd and close-fdset in earlier > emails in this thread, so I hope this isn't inconsistent with what I > said before. > Ok no problem. >> 6. qemu_close (need to replace all close calls in block layer with >> qemu_close) >> -> qemu_close decrements refcount for fdset >> -> qemu_close closes all fds that have (refcount == 0 && (!inuse || remove)) >> -> qemu_close frees the fdset if no fds remain in it >> >> 7. disconnecting the QMP monitor >> -> monitor disconnect visits all fdsets on monitor and turns off monitor >> in-use flag for fdset > > And close all fds with refcount == 0. > Yes, this makes sense. It also makes sense to close removed fds with refcount == 0 in the remove-fd function. Basically this will be the same thing we do in qemu_close. We'll close any fds that evaulate the following as true: (refcount == 0 && (!inuse || remove)) >> 8. connecting the QMP monitor >> -> monitor connect visits all fdsets on monitor and turns on monitor >> in-use flag for fdset >> >> 9. query-fdsets >> -> returns all fdsets and fds that don't have remove flag on -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 17:35 ` Corey Bryant @ 2012-07-09 17:48 ` Luiz Capitulino 2012-07-09 18:02 ` Corey Bryant 2012-07-10 7:53 ` Kevin Wolf 1 sibling, 1 reply; 56+ messages in thread From: Luiz Capitulino @ 2012-07-09 17:48 UTC (permalink / raw) To: Corey Bryant Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, pbonzini, Eric Blake On Mon, 09 Jul 2012 13:35:19 -0400 Corey Bryant <coreyb@linux.vnet.ibm.com> wrote: > > > On 07/09/2012 11:46 AM, Kevin Wolf wrote: > > Am 09.07.2012 17:05, schrieb Corey Bryant: > >> I'm not sure this is an issue with current design. I know things have > >> changed a bit as the email threads evolved, so I'll paste the current > >> design that I am working from. Please let me know if you still see any > >> issues. > >> > >> FD passing: > >> ----------- > >> New monitor commands enable adding/removing an fd to/from a set. New > >> monitor command query-fdsets enables querying of current monitor fdsets. > >> The set of fds should all refer to the same file, with each fd having > >> different access flags (ie. O_RDWR, O_RDONLY). qemu_open can then dup > >> the fd that has the matching access mode flags. > >> > >> Design points: > >> -------------- > >> 1. add-fd > >> -> fd is passed via SCM rights and qemu adds fd to first unused fdset > >> (e.g. /dev/fdset/1) > >> -> add-fd monitor function initializes the monitor inuse flag for the > >> fdset to true > >> -> add-fd monitor function initializes the remove flag for the fd to false > >> -> add-fd returns fdset number and received fd number (e.g fd=3) to caller > >> > >> 2. drive_add file=/dev/fdset/1 > >> -> qemu_open uses the first fd in fdset1 that has access flags matching > >> the qemu_open action flags and has remove flag set to false > >> -> qemu_open increments refcount for the fdset > >> -> Need to make sure that if a command like 'device-add' fails that > >> refcount is not incremented > >> > >> 3. add-fd fdset=1 > >> -> fd is passed via SCM rights > >> -> add-fd monitor function adds the received fd to the specified fdset > >> (or fails if fdset doesn't exist) > >> -> add-fd monitor function initializes the remove flag for the fd to false > >> -> add-fd returns fdset number and received fd number (e.g fd=4) to caller > >> > >> 4. block-commit > >> -> qemu_open performs "reopen" by using the first fd from the fdset that > >> has access flags matching the qemu_open action flags and has remove flag > >> set to false > >> -> qemu_open increments refcount for the fdset > >> -> Need to make sure that if a command like 'block-commit' fails that > >> refcount is not incremented > >> > >> 5. remove-fd fdset=1 fd=4 > >> -> remove-fd monitor function fails if fdset doesn't exist > >> -> remove-fd monitor function turns on remove flag for fd=4 > > > > What was again the reason why we keep removed fds in the fdset at all? > > Because if refcount is > 0 for the fd set, then the fd could be in use > by a block device. So we keep it around until refcount is decremented > to zero, at which point it is safe to close. But then the refcount is associated with the set, not with any particular fd. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 17:48 ` Luiz Capitulino @ 2012-07-09 18:02 ` Corey Bryant 0 siblings, 0 replies; 56+ messages in thread From: Corey Bryant @ 2012-07-09 18:02 UTC (permalink / raw) To: Luiz Capitulino Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, pbonzini, Eric Blake On 07/09/2012 01:48 PM, Luiz Capitulino wrote: > On Mon, 09 Jul 2012 13:35:19 -0400 > Corey Bryant <coreyb@linux.vnet.ibm.com> wrote: > >> >> >> On 07/09/2012 11:46 AM, Kevin Wolf wrote: >>> Am 09.07.2012 17:05, schrieb Corey Bryant: >>>> I'm not sure this is an issue with current design. I know things have >>>> changed a bit as the email threads evolved, so I'll paste the current >>>> design that I am working from. Please let me know if you still see any >>>> issues. >>>> >>>> FD passing: >>>> ----------- >>>> New monitor commands enable adding/removing an fd to/from a set. New >>>> monitor command query-fdsets enables querying of current monitor fdsets. >>>> The set of fds should all refer to the same file, with each fd having >>>> different access flags (ie. O_RDWR, O_RDONLY). qemu_open can then dup >>>> the fd that has the matching access mode flags. >>>> >>>> Design points: >>>> -------------- >>>> 1. add-fd >>>> -> fd is passed via SCM rights and qemu adds fd to first unused fdset >>>> (e.g. /dev/fdset/1) >>>> -> add-fd monitor function initializes the monitor inuse flag for the >>>> fdset to true >>>> -> add-fd monitor function initializes the remove flag for the fd to false >>>> -> add-fd returns fdset number and received fd number (e.g fd=3) to caller >>>> >>>> 2. drive_add file=/dev/fdset/1 >>>> -> qemu_open uses the first fd in fdset1 that has access flags matching >>>> the qemu_open action flags and has remove flag set to false >>>> -> qemu_open increments refcount for the fdset >>>> -> Need to make sure that if a command like 'device-add' fails that >>>> refcount is not incremented >>>> >>>> 3. add-fd fdset=1 >>>> -> fd is passed via SCM rights >>>> -> add-fd monitor function adds the received fd to the specified fdset >>>> (or fails if fdset doesn't exist) >>>> -> add-fd monitor function initializes the remove flag for the fd to false >>>> -> add-fd returns fdset number and received fd number (e.g fd=4) to caller >>>> >>>> 4. block-commit >>>> -> qemu_open performs "reopen" by using the first fd from the fdset that >>>> has access flags matching the qemu_open action flags and has remove flag >>>> set to false >>>> -> qemu_open increments refcount for the fdset >>>> -> Need to make sure that if a command like 'block-commit' fails that >>>> refcount is not incremented >>>> >>>> 5. remove-fd fdset=1 fd=4 >>>> -> remove-fd monitor function fails if fdset doesn't exist >>>> -> remove-fd monitor function turns on remove flag for fd=4 >>> >>> What was again the reason why we keep removed fds in the fdset at all? >> >> Because if refcount is > 0 for the fd set, then the fd could be in use >> by a block device. So we keep it around until refcount is decremented >> to zero, at which point it is safe to close. > > But then the refcount is associated with the set, not with any particular fd. > Exactly, yes that's what we're doing. Sorry, I thought that was clear in the design overview I sent earlier today. -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 17:35 ` Corey Bryant 2012-07-09 17:48 ` Luiz Capitulino @ 2012-07-10 7:53 ` Kevin Wolf 1 sibling, 0 replies; 56+ messages in thread From: Kevin Wolf @ 2012-07-10 7:53 UTC (permalink / raw) To: Corey Bryant Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake Am 09.07.2012 19:35, schrieb Corey Bryant: > > > On 07/09/2012 11:46 AM, Kevin Wolf wrote: >> Am 09.07.2012 17:05, schrieb Corey Bryant: >>> I'm not sure this is an issue with current design. I know things have >>> changed a bit as the email threads evolved, so I'll paste the current >>> design that I am working from. Please let me know if you still see any >>> issues. >>> >>> FD passing: >>> ----------- >>> New monitor commands enable adding/removing an fd to/from a set. New >>> monitor command query-fdsets enables querying of current monitor fdsets. >>> The set of fds should all refer to the same file, with each fd having >>> different access flags (ie. O_RDWR, O_RDONLY). qemu_open can then dup >>> the fd that has the matching access mode flags. >>> >>> Design points: >>> -------------- >>> 1. add-fd >>> -> fd is passed via SCM rights and qemu adds fd to first unused fdset >>> (e.g. /dev/fdset/1) >>> -> add-fd monitor function initializes the monitor inuse flag for the >>> fdset to true >>> -> add-fd monitor function initializes the remove flag for the fd to false >>> -> add-fd returns fdset number and received fd number (e.g fd=3) to caller >>> >>> 2. drive_add file=/dev/fdset/1 >>> -> qemu_open uses the first fd in fdset1 that has access flags matching >>> the qemu_open action flags and has remove flag set to false >>> -> qemu_open increments refcount for the fdset >>> -> Need to make sure that if a command like 'device-add' fails that >>> refcount is not incremented >>> >>> 3. add-fd fdset=1 >>> -> fd is passed via SCM rights >>> -> add-fd monitor function adds the received fd to the specified fdset >>> (or fails if fdset doesn't exist) >>> -> add-fd monitor function initializes the remove flag for the fd to false >>> -> add-fd returns fdset number and received fd number (e.g fd=4) to caller >>> >>> 4. block-commit >>> -> qemu_open performs "reopen" by using the first fd from the fdset that >>> has access flags matching the qemu_open action flags and has remove flag >>> set to false >>> -> qemu_open increments refcount for the fdset >>> -> Need to make sure that if a command like 'block-commit' fails that >>> refcount is not incremented >>> >>> 5. remove-fd fdset=1 fd=4 >>> -> remove-fd monitor function fails if fdset doesn't exist >>> -> remove-fd monitor function turns on remove flag for fd=4 >> >> What was again the reason why we keep removed fds in the fdset at all? > > Because if refcount is > 0 for the fd set, then the fd could be in use > by a block device. So we keep it around until refcount is decremented > to zero, at which point it is safe to close. > >> >> The removed flag would make sense for a fdset after a hypothetical >> close-fdset call because the fdset needs to be kept around until the >> last user closes it, but I think removed fds can be deleted immediately. > > fds in an fd set really need to be kept around until zero block devices > reference them. At that point, if '(refcount == 0 && (!inuse || > remove))' is true, then we'll officially close the fd. Block devices don't reference an fd in the fdset. There are two references in a block device. The first one is obviously the file descriptor they are using; it is a fd dup()ed from an fd in the fdset, but it's now independent of it. The other reference is the file name that is kept in the BlockDriverState, and it always points to "/dev/fdset/X", that is, the whole fdset instead of a single fd. What happens if you remove a file descriptor from an fdset that is in use, is that you can't reopen the fdset with the flags of the removed file descriptor any more. Which I believe is exactly the expected behaviour. libvirt would use this to revoke r/w access, for example (and which behaviour you already provide by checking removed in qemu_open). Are there any other use cases where it makes a difference whether a file descriptor is kept in the fdset with removed=1 or whether it's actually removed from the fdset? >> I think I might have confused remove-fd and close-fdset in earlier >> emails in this thread, so I hope this isn't inconsistent with what I >> said before. >> > > Ok no problem. > >>> 6. qemu_close (need to replace all close calls in block layer with >>> qemu_close) >>> -> qemu_close decrements refcount for fdset >>> -> qemu_close closes all fds that have (refcount == 0 && (!inuse || remove)) >>> -> qemu_close frees the fdset if no fds remain in it >>> >>> 7. disconnecting the QMP monitor >>> -> monitor disconnect visits all fdsets on monitor and turns off monitor >>> in-use flag for fdset >> >> And close all fds with refcount == 0. >> > > Yes, this makes sense. > > It also makes sense to close removed fds with refcount == 0 in the > remove-fd function. Basically this will be the same thing we do in > qemu_close. We'll close any fds that evaulate the following as true: > > (refcount == 0 && (!inuse || remove)) Yes, whatever condition we'll come up with, but it should be the same and checked in all places where its value might change. Kevin ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 15:05 ` Corey Bryant 2012-07-09 15:46 ` Kevin Wolf @ 2012-07-09 18:20 ` Corey Bryant 1 sibling, 0 replies; 56+ messages in thread From: Corey Bryant @ 2012-07-09 18:20 UTC (permalink / raw) To: Luiz Capitulino Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, pbonzini, Eric Blake On 07/09/2012 11:05 AM, Corey Bryant wrote: > > > On 07/09/2012 10:05 AM, Luiz Capitulino wrote: >> On Thu, 05 Jul 2012 11:06:56 -0400 >> Corey Bryant <coreyb@linux.vnet.ibm.com> wrote: >> >>> >>> >>> On 07/04/2012 04:09 AM, Kevin Wolf wrote: >>>> Am 03.07.2012 20:21, schrieb Corey Bryant: >>>>> On 07/03/2012 02:00 PM, Eric Blake wrote: >>>>>> On 07/03/2012 11:46 AM, Corey Bryant wrote: >>>>>> >>>>>>> >>>>>>> Yes, I think adding a +1 to the refcount for the monitor makes >>>>>>> sense. >>>>>>> >>>>>>> I'm a bit unsure how to increment the refcount when a monitor >>>>>>> reconnects >>>>>>> though. Maybe it is as simple as adding a +1 to each fd's >>>>>>> refcount when >>>>>>> the next QMP monitor connects. >>>>>> >>>>>> Or maybe delay a +1 until after a 'query-fds' - it is not until the >>>>>> monitor has reconnected and learned what fds it should be aware of >>>>>> that >>>>>> incrementing the refcount again makes sense. But that would mean >>>>>> making >>>>>> 'query-fds' track whether this is the first call since the monitor >>>>>> reconnected, as it shouldn't normally increase refcounts. >>>>> >>>>> This doesn't sound ideal. >>>> >>>> Yes, it's less than ideal. >>>> >>>>>> The other alternative is that the monitor never re-increments a >>>>>> refcount. Once a monitor disconnects, that fd is lost to the >>>>>> monitor, >>>>>> and a reconnected monitor must pass in a new fd to be >>>>>> re-associated with >>>>>> the fdset. In other words, the monitor's use of an fd is a one-way >>>>>> operation, starting life in use but ending at the first disconnect or >>>>>> remove-fd. >>>>> >>>>> I would vote for this 2nd alternative. As long as we're not >>>>> introducing >>>>> an fd leak. And I don't think we are if we decrement the refcount on >>>>> remove-fd or on QMP disconnect. >>>> >>>> In fact, I believe this one is even worse. I can already see hacks like >>>> adding a dummy FD with invalid flags and removing it again just to >>>> regain control over the fdset... >>>> >>>> You earlier suggestion made a lot of sense to me: Whenever a new QMP >>>> monitor is connected, increase the refcount. That is, as long as any >>>> monitor is there, don't drop any fdsets unless explicitly requested >>>> via QMP. >>> >>> Ok. So refcount would be incremented (for the fd or fdset, whatever we >>> decide on) when QMP reconnects. I'm assuming we wouldn't wait until >>> after a query-fds call. >> >> I'm not sure this is a good idea because we will leak fds if the >> client forgets >> about the fds when re-connecting (ie. it was restarted) or if a different >> client connects to QMP. >> >> If we really want to do that, I think that the right way of doing this >> is to >> add a command for clients to re-again ownership of the fds on >> reconnection. >> >> But to be honest, I don't fully understand why this is needed. >> > > I'm not sure this is an issue with current design. I know things have > changed a bit as the email threads evolved, so I'll paste the current > design that I am working from. Please let me know if you still see any > issues. > > FD passing: > ----------- > New monitor commands enable adding/removing an fd to/from a set. New > monitor command query-fdsets enables querying of current monitor fdsets. > The set of fds should all refer to the same file, with each fd having > different access flags (ie. O_RDWR, O_RDONLY). qemu_open can then dup > the fd that has the matching access mode flags. > > Design points: > -------------- > 1. add-fd > -> fd is passed via SCM rights and qemu adds fd to first unused fdset > (e.g. /dev/fdset/1) > -> add-fd monitor function initializes the monitor inuse flag for the > fdset to true > -> add-fd monitor function initializes the remove flag for the fd to false > -> add-fd returns fdset number and received fd number (e.g fd=3) to caller > > 2. drive_add file=/dev/fdset/1 > -> qemu_open uses the first fd in fdset1 that has access flags matching > the qemu_open action flags and has remove flag set to false > -> qemu_open increments refcount for the fdset > -> Need to make sure that if a command like 'device-add' fails that > refcount is not incremented > > 3. add-fd fdset=1 > -> fd is passed via SCM rights > -> add-fd monitor function adds the received fd to the specified fdset > (or fails if fdset doesn't exist) > -> add-fd monitor function initializes the remove flag for the fd to false > -> add-fd returns fdset number and received fd number (e.g fd=4) to caller > > 4. block-commit > -> qemu_open performs "reopen" by using the first fd from the fdset that > has access flags matching the qemu_open action flags and has remove flag > set to false > -> qemu_open increments refcount for the fdset > -> Need to make sure that if a command like 'block-commit' fails that > refcount is not incremented > > 5. remove-fd fdset=1 fd=4 > -> remove-fd monitor function fails if fdset doesn't exist > -> remove-fd monitor function turns on remove flag for fd=4 > > 6. qemu_close (need to replace all close calls in block layer with > qemu_close) > -> qemu_close decrements refcount for fdset > -> qemu_close closes all fds that have (refcount == 0 && (!inuse || > remove)) > -> qemu_close frees the fdset if no fds remain in it > > 7. disconnecting the QMP monitor > -> monitor disconnect visits all fdsets on monitor and turns off monitor > in-use flag for fdset > > 8. connecting the QMP monitor > -> monitor connect visits all fdsets on monitor and turns on monitor > in-use flag for fdset > > 9. query-fdsets > -> returns all fdsets and fds that don't have remove flag on > > QMP command examples > -------------------- > -> { "execute": "add-fd", "arguments": { "fdset": 1 } } > <- { "return": { "fdset": 1, "fd": 3 } } > > -> { "execute": "remove-fd", "arguments": { "fdset": 1, "fd": 3 } } > <- { "return": {} } > > -> { "execute":"query-fdsets" } => > <- { "return" : { "fdsets": [ > { "name": "fdset1", > "fds": [ { "fd": 4, "removed": false } ], > "refcount": 1, > "monitor": true }, > { "name": "fdset2", > "fds": [ { "fd": 5, "removed": false }, > { "fd": 6, "removed": true } ], > "refcount": 1, > "monitor": true } } > Here's a minor correction to query-fdsets. s/name/fdset and I changed fdset to an int to stay in sync with add-fd and remove-fd. -> { "execute":"query-fdsets" } => <- { "return" : { "fdsets": [ { "fdset": 1, "fds": [ { "fd": 4, "removed": false } ], "refcount": 1, "monitor": true }, { "fdset": 2, "fds": [ { "fd": 5, "removed": false }, { "fd": 6, "removed": true } ], "refcount": 1, "monitor": true } } -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-03 17:03 ` Eric Blake 2012-07-03 17:46 ` Corey Bryant @ 2012-07-04 8:00 ` Kevin Wolf 2012-07-05 14:22 ` Corey Bryant 1 sibling, 1 reply; 56+ messages in thread From: Kevin Wolf @ 2012-07-04 8:00 UTC (permalink / raw) To: Eric Blake Cc: aliguori, stefanha, libvir-list, Corey Bryant, qemu-devel, Luiz Capitulino, pbonzini Am 03.07.2012 19:03, schrieb Eric Blake: >>>> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the >>>> set that has access flags matching the qemu_open action flags. >>>> qemu_open increments refcount for this fd. >>>> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named >>>> "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5). libvirt >>>> in-use flag turned on for fd. >>>> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first >>>> fd from the set that has access flags matching the qemu_open action >>>> flags. qemu_open increments refcount for this fd. >>>> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the >>>> set. turns libvirt in-use flag off marking the fd ready to be closed >>>> when qemu is done with it. >>> >>> If we decided to not return the individual fd numbers to libvirt, file >>> descriptors would be uniquely identified by an fdset/flags pair here. >>> >> >> Are you saying we'd pass the fdset name and flags parameters on >> remove-fd to somehow identify the fds to remove? > > Passing the flag parameters is not trivial, as that would mean the QMP > code would have to define constants mapping to all of the O_* flags that > qemu_open supports. It's easier to support closing by fd number. Good point. So pass-fd or whatever it will be called needs to returnn both fdset number and the individual fd number. >>>> 5. qemu_close decrements refcount for fd, and closes fd when refcount is >>>> zero and libvirt in use flag is off. >>> >>> The monitor could just hold another reference, then we save the >>> additional flag. But that's a qemu implementation detail. >>> >> >> I'm not sure I understand what you mean. > > pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset, > with initial use count of 1 (the use is the monitor). qemu_open() > increments the use count. A new qemu_close() wrapper would decrement > the use count. And both calling 'remove-fd', or closing the QMP monitor > of an fd that has not yet been passed through 'remove-fd', serves as a > way to decrement the use count. You'd still have to track whether the > monitor is using an fd (to avoid over-decrementing on QMP monitor > close), but by having the monitor's use also tracked under the refcount, > then refcount reaching 0 is sufficient to auto-close an fd. > I think > that also means that re-establishing the client QMP connection would > increment This is an interesting idea. I've never thought about what to do with a reconnect. It felt a bit odd at first, but now your proposal totally makes sense to me. > For some examples: > > 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in > use by monitor, as member of fdset1 > 2. client crashes, so all tracked fds are visited; fd=4 had not yet been > passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is > now 0 so qemu closes it Doesn't the refcount belong to an fdset rather than a single fd? As long as the fdset has still a reference, it's possible to reach the other fds in the set through a reopen. For example: 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member of fdset1, in use by monitor, refcount 1 2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a member of fdset, in use by monitor, refcount is still 1 3. client calls 'device-add' with /dev/fdset/1 as the backing filename and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2 4. client crashes, so all tracked fdsets are visited; fdset1 gets its refcount decreased to 1, and both member fds 4 and 5 stay open. If you had refcounted a single fd, fd 5 would be closed now and qemu couldn't switch to O_RDONLY any more. > 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in > use by monitor, as member of fdset1 > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > so qemu_open() increments the refcount to 2 > 3. client crashes, so all tracked fds are visited; fd=4 had not yet been > passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4 > open because it is still in use by the block device > 4. client re-establishes QMP connection, and 'query-fds' lets client > learn about fd=4 still being open as part of fdset1, but also informs > client that fd is not in use by the monitor > > 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in > use by monitor, as member of fdset1 > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > so qemu_open() increments the refcount to 2 > 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no > longer in use by the monitor, refcount decremented to 1 but still left > open because it is in use by the block device > 4. client crashes, so all tracked fds are visited; but fd=4 is already > marked as not in use by the monitor, so its refcount is unchanged > > 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in > use by monitor, as member of fdset1 > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > but the command fails for some other reason, so the refcount is still 1 > at the end of the command (although it may have been temporarily > incremented then decremented during the command) > 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or > QMP connection is closed), so qemu marks fd=4 as no longer in use by the > monitor, refcount is now decremented to 0 and fd=4 is closed Yup, apart from the comment above the examples look good to me. > I think that covers the idea; you need a bool in_use for tracking > monitor state (the monitor is in use until either a remove-fd or a > monitor connection closes), as well as a ref-count. Yes, makes sense to me. >> We also need a query-fdsets command that lists all fdsets that exist. If >>> we add information about single fds to the return value of it, we >>> probably don't need a separate query-fd that operates on a single fdset. >>> >> >> Yes, good point. And maybe we don't need 2 commands. query-fdsets >> could return all the sets and all the fds that are in those sets. > > Yes, I think a single query command is good enough here, something like: > > { "execute":"query-fdsets" } => > { "return" : { "sets": [ > { "name": "fdset1", > "fds": [ { "fd": 4, "monitor": true, "refcount": 1 } ] }, > { "name": "fdset2", > "fds": [ { "fd": 5, "monitor": false, "refcount": 1 }, > { "fd": 6, "monitor": true, "refcount": 2 } ] } ] } } Looks good, this is exactly what I was thinking of. >>> In use by whom? If it's still in use in qemu (as in "in-use flag would >>> be set") and we have a refcount of zero, then that's a bug. >>> >> >> In use by qemu. I don't think it's a bug. I think there are situations >> where refcount gets to zero but qemu is still using the fd. > > I think the refcount being non-zero _is_ what defines an fd as being in > use by qemu (including use by the monitor). Any place you have to close > an fd before reopening it is dangerous; the safe way is always to open > with the new permissions before closing the old permissions. There is one case I'm aware of where we need to be careful: Before opening an image, qemu may probe the format. In this case, the image gets opened twice, and the first close comes before the second open. I'm not entirely sure how hard it would be to get rid of that behaviour. If we can't get rid of it, we have a small window that the refcount doesn't really cover, and if we weren't careful it would become racy. This is why I mentioned earlier that maybe we need to defer the refcount decrease a bit. However, I can't see how the in-use flag would make a difference there. If the refcount can't cover it, the in-use flag can't either. Kevin ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-04 8:00 ` Kevin Wolf @ 2012-07-05 14:22 ` Corey Bryant 2012-07-05 14:51 ` Kevin Wolf 0 siblings, 1 reply; 56+ messages in thread From: Corey Bryant @ 2012-07-05 14:22 UTC (permalink / raw) To: Kevin Wolf Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake On 07/04/2012 04:00 AM, Kevin Wolf wrote: > Am 03.07.2012 19:03, schrieb Eric Blake: >>>>> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the >>>>> set that has access flags matching the qemu_open action flags. >>>>> qemu_open increments refcount for this fd. >>>>> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named >>>>> "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5). libvirt >>>>> in-use flag turned on for fd. >>>>> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first >>>>> fd from the set that has access flags matching the qemu_open action >>>>> flags. qemu_open increments refcount for this fd. >>>>> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the >>>>> set. turns libvirt in-use flag off marking the fd ready to be closed >>>>> when qemu is done with it. >>>> >>>> If we decided to not return the individual fd numbers to libvirt, file >>>> descriptors would be uniquely identified by an fdset/flags pair here. >>>> >>> >>> Are you saying we'd pass the fdset name and flags parameters on >>> remove-fd to somehow identify the fds to remove? >> >> Passing the flag parameters is not trivial, as that would mean the QMP >> code would have to define constants mapping to all of the O_* flags that >> qemu_open supports. It's easier to support closing by fd number. > > Good point. So pass-fd or whatever it will be called needs to returnn > both fdset number and the individual fd number. > >>>>> 5. qemu_close decrements refcount for fd, and closes fd when refcount is >>>>> zero and libvirt in use flag is off. >>>> >>>> The monitor could just hold another reference, then we save the >>>> additional flag. But that's a qemu implementation detail. >>>> >>> >>> I'm not sure I understand what you mean. >> >> pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset, >> with initial use count of 1 (the use is the monitor). qemu_open() >> increments the use count. A new qemu_close() wrapper would decrement >> the use count. And both calling 'remove-fd', or closing the QMP monitor >> of an fd that has not yet been passed through 'remove-fd', serves as a >> way to decrement the use count. You'd still have to track whether the >> monitor is using an fd (to avoid over-decrementing on QMP monitor >> close), but by having the monitor's use also tracked under the refcount, >> then refcount reaching 0 is sufficient to auto-close an fd. > >> I think >> that also means that re-establishing the client QMP connection would >> increment > > This is an interesting idea. I've never thought about what to do with a > reconnect. It felt a bit odd at first, but now your proposal totally > makes sense to me. > > >> For some examples: >> >> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in >> use by monitor, as member of fdset1 >> 2. client crashes, so all tracked fds are visited; fd=4 had not yet been >> passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is >> now 0 so qemu closes it > > Doesn't the refcount belong to an fdset rather than a single fd? As long > as the fdset has still a reference, it's possible to reach the other fds > in the set through a reopen. For example: > > 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member > of fdset1, in use by monitor, refcount 1 > 2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a > member of fdset, in use by monitor, refcount is still 1 > 3. client calls 'device-add' with /dev/fdset/1 as the backing filename > and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2 > 4. client crashes, so all tracked fdsets are visited; fdset1 gets its > refcount decreased to 1, and both member fds 4 and 5 stay open. > > If you had refcounted a single fd, fd 5 would be closed now and qemu > couldn't switch to O_RDONLY any more. > If the O_RDONLY is for a future command, it would make more sense to add that fd before that future command. Or are you passing it at step 2 because it is needed for the probe re-open issue? >> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in >> use by monitor, as member of fdset1 >> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >> so qemu_open() increments the refcount to 2 >> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been >> passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4 >> open because it is still in use by the block device >> 4. client re-establishes QMP connection, and 'query-fds' lets client >> learn about fd=4 still being open as part of fdset1, but also informs >> client that fd is not in use by the monitor >> >> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in >> use by monitor, as member of fdset1 >> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >> so qemu_open() increments the refcount to 2 >> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no >> longer in use by the monitor, refcount decremented to 1 but still left >> open because it is in use by the block device >> 4. client crashes, so all tracked fds are visited; but fd=4 is already >> marked as not in use by the monitor, so its refcount is unchanged >> >> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in >> use by monitor, as member of fdset1 >> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >> but the command fails for some other reason, so the refcount is still 1 >> at the end of the command (although it may have been temporarily >> incremented then decremented during the command) >> 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or >> QMP connection is closed), so qemu marks fd=4 as no longer in use by the >> monitor, refcount is now decremented to 0 and fd=4 is closed > > Yup, apart from the comment above the examples look good to me. > >> I think that covers the idea; you need a bool in_use for tracking >> monitor state (the monitor is in use until either a remove-fd or a >> monitor connection closes), as well as a ref-count. > > Yes, makes sense to me. > >>> We also need a query-fdsets command that lists all fdsets that exist. If >>>> we add information about single fds to the return value of it, we >>>> probably don't need a separate query-fd that operates on a single fdset. >>>> >>> >>> Yes, good point. And maybe we don't need 2 commands. query-fdsets >>> could return all the sets and all the fds that are in those sets. >> >> Yes, I think a single query command is good enough here, something like: >> >> { "execute":"query-fdsets" } => >> { "return" : { "sets": [ >> { "name": "fdset1", >> "fds": [ { "fd": 4, "monitor": true, "refcount": 1 } ] }, >> { "name": "fdset2", >> "fds": [ { "fd": 5, "monitor": false, "refcount": 1 }, >> { "fd": 6, "monitor": true, "refcount": 2 } ] } ] } } > > Looks good, this is exactly what I was thinking of. > >>>> In use by whom? If it's still in use in qemu (as in "in-use flag would >>>> be set") and we have a refcount of zero, then that's a bug. >>>> >>> >>> In use by qemu. I don't think it's a bug. I think there are situations >>> where refcount gets to zero but qemu is still using the fd. >> >> I think the refcount being non-zero _is_ what defines an fd as being in >> use by qemu (including use by the monitor). Any place you have to close >> an fd before reopening it is dangerous; the safe way is always to open >> with the new permissions before closing the old permissions. > > There is one case I'm aware of where we need to be careful: Before > opening an image, qemu may probe the format. In this case, the image > gets opened twice, and the first close comes before the second open. I'm > not entirely sure how hard it would be to get rid of that behaviour. > > If we can't get rid of it, we have a small window that the refcount > doesn't really cover, and if we weren't careful it would become racy. > This is why I mentioned earlier that maybe we need to defer the refcount > decrease a bit. However, I can't see how the in-use flag would make a > difference there. If the refcount can't cover it, the in-use flag can't > either. Yeah this is a problem. Could we introduce another flag to cover this? -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-05 14:22 ` Corey Bryant @ 2012-07-05 14:51 ` Kevin Wolf 2012-07-05 16:35 ` Corey Bryant 0 siblings, 1 reply; 56+ messages in thread From: Kevin Wolf @ 2012-07-05 14:51 UTC (permalink / raw) To: Corey Bryant Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake Am 05.07.2012 16:22, schrieb Corey Bryant: >>> For some examples: >>> >>> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in >>> use by monitor, as member of fdset1 >>> 2. client crashes, so all tracked fds are visited; fd=4 had not yet been >>> passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is >>> now 0 so qemu closes it >> >> Doesn't the refcount belong to an fdset rather than a single fd? As long >> as the fdset has still a reference, it's possible to reach the other fds >> in the set through a reopen. For example: >> >> 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member >> of fdset1, in use by monitor, refcount 1 >> 2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a >> member of fdset, in use by monitor, refcount is still 1 >> 3. client calls 'device-add' with /dev/fdset/1 as the backing filename >> and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2 >> 4. client crashes, so all tracked fdsets are visited; fdset1 gets its >> refcount decreased to 1, and both member fds 4 and 5 stay open. >> >> If you had refcounted a single fd, fd 5 would be closed now and qemu >> couldn't switch to O_RDONLY any more. >> > > If the O_RDONLY is for a future command, it would make more sense to add > that fd before that future command. Or are you passing it at step 2 > because it is needed for the probe re-open issue? Come on, requiring realistic examples isn't fair. ;-) Swap steps 2 and 3 in the example, then step 3 is just the preparation for a command that uses the new fd. The problem stays the same. Or a live commit operation could be in flight so that qemu must be able to switch on completion without libvirt interaction. There are enough reasons that an fd in an fdset exists and is not opened, but I can't think of one why it should be dropped. >>>>> In use by whom? If it's still in use in qemu (as in "in-use flag would >>>>> be set") and we have a refcount of zero, then that's a bug. >>>>> >>>> >>>> In use by qemu. I don't think it's a bug. I think there are situations >>>> where refcount gets to zero but qemu is still using the fd. >>> >>> I think the refcount being non-zero _is_ what defines an fd as being in >>> use by qemu (including use by the monitor). Any place you have to close >>> an fd before reopening it is dangerous; the safe way is always to open >>> with the new permissions before closing the old permissions. >> >> There is one case I'm aware of where we need to be careful: Before >> opening an image, qemu may probe the format. In this case, the image >> gets opened twice, and the first close comes before the second open. I'm >> not entirely sure how hard it would be to get rid of that behaviour. >> >> If we can't get rid of it, we have a small window that the refcount >> doesn't really cover, and if we weren't careful it would become racy. >> This is why I mentioned earlier that maybe we need to defer the refcount >> decrease a bit. However, I can't see how the in-use flag would make a >> difference there. If the refcount can't cover it, the in-use flag can't >> either. > > Yeah this is a problem. Could we introduce another flag to cover this? Adding more refcounts or flags is not a problem, but it doesn't solve it either. The hard question is when to set that flag. I believe it may be easier to just change the block layer so that it opens files only once during bdrv_open(). Kevin ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-05 14:51 ` Kevin Wolf @ 2012-07-05 16:35 ` Corey Bryant 2012-07-05 16:37 ` Corey Bryant 2012-07-05 17:00 ` Eric Blake 0 siblings, 2 replies; 56+ messages in thread From: Corey Bryant @ 2012-07-05 16:35 UTC (permalink / raw) To: Kevin Wolf Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake On 07/05/2012 10:51 AM, Kevin Wolf wrote: > Am 05.07.2012 16:22, schrieb Corey Bryant: >>>> For some examples: >>>> >>>> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in >>>> use by monitor, as member of fdset1 >>>> 2. client crashes, so all tracked fds are visited; fd=4 had not yet been >>>> passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is >>>> now 0 so qemu closes it >>> >>> Doesn't the refcount belong to an fdset rather than a single fd? As long >>> as the fdset has still a reference, it's possible to reach the other fds >>> in the set through a reopen. For example: >>> >>> 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member >>> of fdset1, in use by monitor, refcount 1 >>> 2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a >>> member of fdset, in use by monitor, refcount is still 1 >>> 3. client calls 'device-add' with /dev/fdset/1 as the backing filename >>> and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2 >>> 4. client crashes, so all tracked fdsets are visited; fdset1 gets its >>> refcount decreased to 1, and both member fds 4 and 5 stay open. >>> >>> If you had refcounted a single fd, fd 5 would be closed now and qemu >>> couldn't switch to O_RDONLY any more. >>> >> >> If the O_RDONLY is for a future command, it would make more sense to add >> that fd before that future command. Or are you passing it at step 2 >> because it is needed for the probe re-open issue? > > Come on, requiring realistic examples isn't fair. ;-) Heh :) > > Swap steps 2 and 3 in the example, then step 3 is just the preparation > for a command that uses the new fd. The problem stays the same. Or a > live commit operation could be in flight so that qemu must be able to > switch on completion without libvirt interaction. Good point. I thought it would be useful to run through the examples again with each fdset having a refcount, and each fd having an in-use flag. One difference though is that refcount is only incremented/decremented when qemu_open/qemu_close are called. I think this works and covers Kevin's concerns. Actually it might be a bit simpler too. 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, so qemu_open() increments the refcount of fdset1 to 1 3. client crashes, so all tracked fds are visited; fd=4 had not yet been passed to 'remove-fd', so it's in-use flag is on; in-use flag is turned off and fd=4 is left open because it is still in use by the block device (refcount is 1) 4. client re-establishes QMP connection, so all tracked fds are visited, and in-use flags are turned back on; 'query-fds' lets client learn about fd=4 still being open as part of fdset1 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, so qemu_open() increments the refcount of fdset1 to 1 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no longer in-use by the monitor, and is left open because it is in use by the block device (refcount is 1) 4. client crashes, so all tracked fds are visited; fd=4 is not in-use but refcount is 1 so it is not closed 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, but the command fails for some other reason, so the refcount is still 0 at the end of the command (although it may have been temporarily incremented then decremented during the command) 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or QMP connection is closed), so qemu marks fd=4 as no longer in-use by the monitor; refcount is 0 so all fds in fdset1 are closed 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with refcount still 0; fd=5's in-use flag is turned on 3. client crashes, so all tracked fds are visited; fd=4 and fd=5 had not yet been passed to 'remove-fd', so their in-use flags are on; in-use flags are turned off; refcount of fdset1 is 0 so qemu closes all fds in fdset1 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with refcount still 0; fd=5's in-use flag is turned on 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no longer in-use by the monitor, and fd=4 is closed since the refcount is 0; fd=5 remains open 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1 with refcount still 0; fd=5's in-use flag is turned on 3. client calls 'device-add' with /dev/fdset/1 as the backing filename and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1 4. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no longer in-use by the monitor, but fd=4 remains open because refcount of fdset1 is 1 5. client calls 'remove-fd fdset=1 fd=5', so qemu marks fd=5 as no longer in-use by the monitor, and fd=5 remains open because refcount of fdset1 is 1 6. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both fds are closed since refcount is 0 and their in-use flags are off 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1 with refcount still 0; fd=5's in-use flag is turned on 3. client calls 'device-add' with /dev/fdset/1 as the backing filename and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1 4. client crashes, so all tracked fds are visited; fd=4 and fd=5 have not yet been passed to 'remove-fd', so their in-use flags are on; in-use flags are turned off and both fds are left open because the set is still in use by the block device (refcount is 1) 5. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both fds are closed since refcount is 0 and their in-use flags are off > > There are enough reasons that an fd in an fdset exists and is not > opened, but I can't think of one why it should be dropped. > >>>>>> In use by whom? If it's still in use in qemu (as in "in-use flag would >>>>>> be set") and we have a refcount of zero, then that's a bug. >>>>>> >>>>> >>>>> In use by qemu. I don't think it's a bug. I think there are situations >>>>> where refcount gets to zero but qemu is still using the fd. >>>> >>>> I think the refcount being non-zero _is_ what defines an fd as being in >>>> use by qemu (including use by the monitor). Any place you have to close >>>> an fd before reopening it is dangerous; the safe way is always to open >>>> with the new permissions before closing the old permissions. >>> >>> There is one case I'm aware of where we need to be careful: Before >>> opening an image, qemu may probe the format. In this case, the image >>> gets opened twice, and the first close comes before the second open. I'm >>> not entirely sure how hard it would be to get rid of that behaviour. >>> >>> If we can't get rid of it, we have a small window that the refcount >>> doesn't really cover, and if we weren't careful it would become racy. >>> This is why I mentioned earlier that maybe we need to defer the refcount >>> decrease a bit. However, I can't see how the in-use flag would make a >>> difference there. If the refcount can't cover it, the in-use flag can't >>> either. >> >> Yeah this is a problem. Could we introduce another flag to cover this? > > Adding more refcounts or flags is not a problem, but it doesn't solve it > either. The hard question is when to set that flag. > > I believe it may be easier to just change the block layer so that it > opens files only once during bdrv_open(). > > Kevin > -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-05 16:35 ` Corey Bryant @ 2012-07-05 16:37 ` Corey Bryant 2012-07-06 9:06 ` Kevin Wolf 2012-07-05 17:00 ` Eric Blake 1 sibling, 1 reply; 56+ messages in thread From: Corey Bryant @ 2012-07-05 16:37 UTC (permalink / raw) To: Kevin Wolf Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake On 07/05/2012 12:35 PM, Corey Bryant wrote: > > > On 07/05/2012 10:51 AM, Kevin Wolf wrote: >> Am 05.07.2012 16:22, schrieb Corey Bryant: >>>>> For some examples: >>>>> >>>>> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount >>>>> 1, in >>>>> use by monitor, as member of fdset1 >>>>> 2. client crashes, so all tracked fds are visited; fd=4 had not yet >>>>> been >>>>> passed to 'remove-fd', so qemu decrements refcount; refcount of >>>>> fd=4 is >>>>> now 0 so qemu closes it >>>> >>>> Doesn't the refcount belong to an fdset rather than a single fd? As >>>> long >>>> as the fdset has still a reference, it's possible to reach the other >>>> fds >>>> in the set through a reopen. For example: >>>> >>>> 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a >>>> member >>>> of fdset1, in use by monitor, refcount 1 >>>> 2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a >>>> member of fdset, in use by monitor, refcount is still 1 >>>> 3. client calls 'device-add' with /dev/fdset/1 as the backing filename >>>> and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2 >>>> 4. client crashes, so all tracked fdsets are visited; fdset1 gets its >>>> refcount decreased to 1, and both member fds 4 and 5 stay open. >>>> >>>> If you had refcounted a single fd, fd 5 would be closed now and qemu >>>> couldn't switch to O_RDONLY any more. >>>> >>> >>> If the O_RDONLY is for a future command, it would make more sense to add >>> that fd before that future command. Or are you passing it at step 2 >>> because it is needed for the probe re-open issue? >> >> Come on, requiring realistic examples isn't fair. ;-) > > Heh :) > >> >> Swap steps 2 and 3 in the example, then step 3 is just the preparation >> for a command that uses the new fd. The problem stays the same. Or a >> live commit operation could be in flight so that qemu must be able to >> switch on completion without libvirt interaction. > > Good point. > > I thought it would be useful to run through the examples again with each > fdset having a refcount, and each fd having an in-use flag. One > difference though is that refcount is only incremented/decremented when > qemu_open/qemu_close are called. I think this works and covers Kevin's > concerns. Actually it might be a bit simpler too. > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > so qemu_open() increments the refcount of fdset1 to 1 > 3. client crashes, so all tracked fds are visited; fd=4 had not yet been > passed to 'remove-fd', so it's in-use flag is on; in-use flag is turned > off and fd=4 is left open because it is still in use by the block device > (refcount is 1) > 4. client re-establishes QMP connection, so all tracked fds are visited, > and in-use flags are turned back on; 'query-fds' lets client learn about > fd=4 still being open as part of fdset1 > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > so qemu_open() increments the refcount of fdset1 to 1 > 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no > longer in-use by the monitor, and is left open because it is in use by > the block device (refcount is 1) > 4. client crashes, so all tracked fds are visited; fd=4 is not in-use > but refcount is 1 so it is not closed > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > but the command fails for some other reason, so the refcount is still 0 > at the end of the command (although it may have been temporarily > incremented then decremented during the command) > 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or > QMP connection is closed), so qemu marks fd=4 as no longer in-use by the > monitor; refcount is 0 so all fds in fdset1 are closed > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with > refcount still 0; fd=5's in-use flag is turned on > 3. client crashes, so all tracked fds are visited; fd=4 and fd=5 had not > yet been passed to 'remove-fd', so their in-use flags are on; in-use > flags are turned off; refcount of fdset1 is 0 so qemu closes all fds in > fdset1 > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with > refcount still 0; fd=5's in-use flag is turned on > 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no > longer in-use by the monitor, and fd=4 is closed since the refcount is > 0; fd=5 remains open > > 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1 > with refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1 > with refcount still 0; fd=5's in-use flag is turned on > 3. client calls 'device-add' with /dev/fdset/1 as the backing filename > and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1 > 4. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no > longer in-use by the monitor, but fd=4 remains open because refcount of > fdset1 is 1 > 5. client calls 'remove-fd fdset=1 fd=5', so qemu marks fd=5 as no > longer in-use by the monitor, and fd=5 remains open because refcount of > fdset1 is 1 > 6. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both > fds are closed since refcount is 0 and their in-use flags are off > > 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1 > with refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1 > with refcount still 0; fd=5's in-use flag is turned on > 3. client calls 'device-add' with /dev/fdset/1 as the backing filename > and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1 > 4. client crashes, so all tracked fds are visited; fd=4 and fd=5 have > not yet been passed to 'remove-fd', so their in-use flags are on; in-use > flags are turned off and both fds are left open because the set is still > in use by the block device (refcount is 1) > 5. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both > fds are closed since refcount is 0 and their in-use flags are off > >> >> There are enough reasons that an fd in an fdset exists and is not >> opened, but I can't think of one why it should be dropped. >> >>>>>>> In use by whom? If it's still in use in qemu (as in "in-use flag >>>>>>> would >>>>>>> be set") and we have a refcount of zero, then that's a bug. >>>>>>> >>>>>> >>>>>> In use by qemu. I don't think it's a bug. I think there are >>>>>> situations >>>>>> where refcount gets to zero but qemu is still using the fd. >>>>> >>>>> I think the refcount being non-zero _is_ what defines an fd as >>>>> being in >>>>> use by qemu (including use by the monitor). Any place you have to >>>>> close >>>>> an fd before reopening it is dangerous; the safe way is always to open >>>>> with the new permissions before closing the old permissions. >>>> >>>> There is one case I'm aware of where we need to be careful: Before >>>> opening an image, qemu may probe the format. In this case, the image >>>> gets opened twice, and the first close comes before the second open. >>>> I'm >>>> not entirely sure how hard it would be to get rid of that behaviour. >>>> >>>> If we can't get rid of it, we have a small window that the refcount >>>> doesn't really cover, and if we weren't careful it would become racy. >>>> This is why I mentioned earlier that maybe we need to defer the >>>> refcount >>>> decrease a bit. However, I can't see how the in-use flag would make a >>>> difference there. If the refcount can't cover it, the in-use flag can't >>>> either. >>> >>> Yeah this is a problem. Could we introduce another flag to cover this? >> >> Adding more refcounts or flags is not a problem, but it doesn't solve it >> either. The hard question is when to set that flag. >> >> I believe it may be easier to just change the block layer so that it >> opens files only once during bdrv_open(). >> Can this fix be delivered after the fd passing patch series? -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-05 16:37 ` Corey Bryant @ 2012-07-06 9:06 ` Kevin Wolf 0 siblings, 0 replies; 56+ messages in thread From: Kevin Wolf @ 2012-07-06 9:06 UTC (permalink / raw) To: Corey Bryant Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake Am 05.07.2012 18:37, schrieb Corey Bryant: >>>>> There is one case I'm aware of where we need to be careful: Before >>>>> opening an image, qemu may probe the format. In this case, the image >>>>> gets opened twice, and the first close comes before the second open. >>>>> I'm >>>>> not entirely sure how hard it would be to get rid of that behaviour. >>>>> >>>>> If we can't get rid of it, we have a small window that the refcount >>>>> doesn't really cover, and if we weren't careful it would become racy. >>>>> This is why I mentioned earlier that maybe we need to defer the >>>>> refcount >>>>> decrease a bit. However, I can't see how the in-use flag would make a >>>>> difference there. If the refcount can't cover it, the in-use flag can't >>>>> either. >>>> >>>> Yeah this is a problem. Could we introduce another flag to cover this? >>> >>> Adding more refcounts or flags is not a problem, but it doesn't solve it >>> either. The hard question is when to set that flag. >>> >>> I believe it may be easier to just change the block layer so that it >>> opens files only once during bdrv_open(). > > Can this fix be delivered after the fd passing patch series? Sure, we can't fix everything at once. Kevin ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-05 16:35 ` Corey Bryant 2012-07-05 16:37 ` Corey Bryant @ 2012-07-05 17:00 ` Eric Blake 2012-07-05 17:36 ` Corey Bryant 2012-07-06 9:11 ` Kevin Wolf 1 sibling, 2 replies; 56+ messages in thread From: Eric Blake @ 2012-07-05 17:00 UTC (permalink / raw) To: Corey Bryant Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini [-- Attachment #1: Type: text/plain, Size: 2946 bytes --] On 07/05/2012 10:35 AM, Corey Bryant wrote: > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > so qemu_open() increments the refcount of fdset1 to 1 > 3. client crashes, so all tracked fds are visited; fd=4 had not yet been > passed to 'remove-fd', so it's in-use flag is on; in-use flag is turned > off and fd=4 is left open because it is still in use by the block device > (refcount is 1) > 4. client re-establishes QMP connection, so all tracked fds are visited, > and in-use flags are turned back on; 'query-fds' lets client learn about > fd=4 still being open as part of fdset1 This says that an in-use fd comes back into use after a crash... > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > so qemu_open() increments the refcount of fdset1 to 1 > 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no > longer in-use by the monitor, and is left open because it is in use by > the block device (refcount is 1) > 4. client crashes, so all tracked fds are visited; fd=4 is not in-use > but refcount is 1 so it is not closed 5. client re-establishes QMP connection, so all tracked fds are visited. What happens to the fd=4 in-use flag? ...but what are the semantics here? If we _always_ turn the in-use flag back on, then that says that even though libvirt successfully asked to close the fd, the reconnection means that libvirt once again has to ask to close things. If we _never_ turn the in-use flag back on, then we've broken the first case above where we want an in-use fd to come back into use after a crash. Maybe that argues for two flags per fd: an in-use flag (there is currently a monitor connection that can manipulate the fd, either because it passed in the fd or because it reconnected) and a removed flag (a monitor called remove-fd, and no longer wants to know about the fd, even if it crashes and reconnects). An fd is then closed when 'refcount == 0 && (!inuse || removed)'. On monitor disconnect, the inuse flag is cleared, and on reconnect, it is set; but that does not impact the removed flag. And the 'query-fds' command would omit any fd with the 'removed' flag, even when the fd is still kept open internally thanks to the refcount being nonzero. But I'm definitely agreeing that tying the refcount to the set rather than to individual fds within the set makes sense; you still avoid the fd leak in that all fds in the set are closed when both the monitor has disavowed the set and when qemu_close() has finished using any of the fds. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 620 bytes --] ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-05 17:00 ` Eric Blake @ 2012-07-05 17:36 ` Corey Bryant 2012-07-06 9:11 ` Kevin Wolf 1 sibling, 0 replies; 56+ messages in thread From: Corey Bryant @ 2012-07-05 17:36 UTC (permalink / raw) To: Eric Blake Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini On 07/05/2012 01:00 PM, Eric Blake wrote: > On 07/05/2012 10:35 AM, Corey Bryant wrote: > >> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with >> refcount of 0; fd=4's in-use flag is turned on >> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >> so qemu_open() increments the refcount of fdset1 to 1 >> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been >> passed to 'remove-fd', so it's in-use flag is on; in-use flag is turned >> off and fd=4 is left open because it is still in use by the block device >> (refcount is 1) >> 4. client re-establishes QMP connection, so all tracked fds are visited, >> and in-use flags are turned back on; 'query-fds' lets client learn about >> fd=4 still being open as part of fdset1 > > This says that an in-use fd comes back into use after a crash... > >> >> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with >> refcount of 0; fd=4's in-use flag is turned on >> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >> so qemu_open() increments the refcount of fdset1 to 1 >> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no >> longer in-use by the monitor, and is left open because it is in use by >> the block device (refcount is 1) >> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use >> but refcount is 1 so it is not closed > 5. client re-establishes QMP connection, so all tracked fds are visited. > What happens to the fd=4 in-use flag? > > ...but what are the semantics here? > > If we _always_ turn the in-use flag back on, then that says that even > though libvirt successfully asked to close the fd, the reconnection > means that libvirt once again has to ask to close things. > Ah, yeah I missed that. > If we _never_ turn the in-use flag back on, then we've broken the first > case above where we want an in-use fd to come back into use after a crash. > > Maybe that argues for two flags per fd: an in-use flag (there is > currently a monitor connection that can manipulate the fd, either > because it passed in the fd or because it reconnected) and a removed > flag (a monitor called remove-fd, and no longer wants to know about the > fd, even if it crashes and reconnects). An fd is then closed when > 'refcount == 0 && (!inuse || removed)'. On monitor disconnect, the > inuse flag is cleared, and on reconnect, it is set; but that does not > impact the removed flag. And the 'query-fds' command would omit any fd > with the 'removed' flag, even when the fd is still kept open internally > thanks to the refcount being nonzero. > I agree. Having the 2 flags makes sense and solves the issues you mentioned above. -- Regards, Corey > But I'm definitely agreeing that tying the refcount to the set rather > than to individual fds within the set makes sense; you still avoid the > fd leak in that all fds in the set are closed when both the monitor has > disavowed the set and when qemu_close() has finished using any of the fds. > ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-05 17:00 ` Eric Blake 2012-07-05 17:36 ` Corey Bryant @ 2012-07-06 9:11 ` Kevin Wolf 2012-07-06 17:14 ` Corey Bryant 2012-07-06 17:40 ` Corey Bryant 1 sibling, 2 replies; 56+ messages in thread From: Kevin Wolf @ 2012-07-06 9:11 UTC (permalink / raw) To: Eric Blake Cc: aliguori, stefanha, libvir-list, Corey Bryant, qemu-devel, Luiz Capitulino, pbonzini Am 05.07.2012 19:00, schrieb Eric Blake: > On 07/05/2012 10:35 AM, Corey Bryant wrote: >> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with >> refcount of 0; fd=4's in-use flag is turned on >> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >> so qemu_open() increments the refcount of fdset1 to 1 >> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no >> longer in-use by the monitor, and is left open because it is in use by >> the block device (refcount is 1) >> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use >> but refcount is 1 so it is not closed > 5. client re-establishes QMP connection, so all tracked fds are visited. > What happens to the fd=4 in-use flag? > > ...but what are the semantics here? > > If we _always_ turn the in-use flag back on, then that says that even > though libvirt successfully asked to close the fd, the reconnection > means that libvirt once again has to ask to close things. > > If we _never_ turn the in-use flag back on, then we've broken the first > case above where we want an in-use fd to come back into use after a crash. > > Maybe that argues for two flags per fd: an in-use flag (there is > currently a monitor connection that can manipulate the fd, either > because it passed in the fd or because it reconnected) and a removed > flag (a monitor called remove-fd, and no longer wants to know about the > fd, even if it crashes and reconnects). I was in fact just going to suggest a removed flag as well, however combined with including the monitor connections in the refcount instead of an additional flag. This would also allow to have (the currently mostly hypothetical case of) multiple QMP monitors without interesting things happening. Maybe I'm missing some point that the inuse flag would allow and including monitors in the refcount doesn't. Is there one? Kevin ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-06 9:11 ` Kevin Wolf @ 2012-07-06 17:14 ` Corey Bryant 2012-07-06 17:15 ` Corey Bryant 2012-07-06 17:40 ` Corey Bryant 1 sibling, 1 reply; 56+ messages in thread From: Corey Bryant @ 2012-07-06 17:14 UTC (permalink / raw) To: Kevin Wolf Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake On 07/06/2012 05:11 AM, Kevin Wolf wrote: > Am 05.07.2012 19:00, schrieb Eric Blake: >> On 07/05/2012 10:35 AM, Corey Bryant wrote: >>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with >>> refcount of 0; fd=4's in-use flag is turned on >>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >>> so qemu_open() increments the refcount of fdset1 to 1 >>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no >>> longer in-use by the monitor, and is left open because it is in use by >>> the block device (refcount is 1) >>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use >>> but refcount is 1 so it is not closed >> 5. client re-establishes QMP connection, so all tracked fds are visited. >> What happens to the fd=4 in-use flag? >> >> ...but what are the semantics here? >> >> If we _always_ turn the in-use flag back on, then that says that even >> though libvirt successfully asked to close the fd, the reconnection >> means that libvirt once again has to ask to close things. >> >> If we _never_ turn the in-use flag back on, then we've broken the first >> case above where we want an in-use fd to come back into use after a crash. >> >> Maybe that argues for two flags per fd: an in-use flag (there is >> currently a monitor connection that can manipulate the fd, either >> because it passed in the fd or because it reconnected) and a removed >> flag (a monitor called remove-fd, and no longer wants to know about the >> fd, even if it crashes and reconnects). > > I was in fact just going to suggest a removed flag as well, however > combined with including the monitor connections in the refcount instead > of an additional flag. This would also allow to have (the currently > mostly hypothetical case of) multiple QMP monitors without interesting > things happening. > > Maybe I'm missing some point that the inuse flag would allow and > including monitors in the refcount doesn't. Is there one? > > Kevin > I think we need the granularity of having an in-use flag per fd. Here's an example of why: 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 1; fd=4's remove flag is initialized to off 2. client calls 'device-add' with /dev/fdset/1 as the backing filename; qemu_open() increments the refcount of fdset1 to 2 3. client crashes, so all fdsets are visited; fd=4 had not yet been passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 is decremented to 1; fd=4 is left open because it is still in use by the block device (refcount is 1) 4. client re-establishes QMP connection, refcount for fdset1 is incremented to 2; 'query-fds' lets client learn about fd=4 still being open as part of fdset1 5. qemu_close is called for fd=4; refcount for fdset1 is decremented to 1; fd=4 remains open as part of fdset1 6. QMP disconnects; refcount for fdset is decremented to zero; fd=4 is closed and fdset1 is freed 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 1 (because of monitor reference); fd=4's remove flag is initialized to off 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, but the command fails for some other reason, so the refcount is still 1 at the end of the command (although it may have been temporarily incremented then decremented during the command) 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or QMP connection is closed), so qemu turns on the remove flag for fd=4; refcount is 0 so all fds in fdset1 are closed I think we need the granularity of having an in-use flag per fd. If we track monitor in-use in the reference count, then we are tracking it for the fdset and I think this could cause leaks. In the following example, we have a refcount for the fdset, an in-use flag per fd, and a remove flag per fd. We're only incrementing/decrementing refcount in qemu_open/qemu_close. In the following example, we have a refcount for the fdset, and a remove flag per fd. We're incrementing refcount when the fdset is first created, when QMP re-connects, and in qemu_open. We're decrementing refcount when QMP disconnects, and in qemu_close. 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 1; fd=4's remove flag is initialized to off 2. client crashes, so all tracked fdsets are visited; fd=4 has not yet been passed to 'remove-fd', so its remove flag is off; in-use flags are turned off and both fds are left open because the set is still in use by the block device (refcount is 1) 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 1; fd=4's remove flag is initialized to off 2. client calls 'device-add' with /dev/fdset/1 as the backing filename; qemu_open() increments the refcount of fdset1 to 2 3. client crashes, so all fdsets are visited; fd=4 had not yet been passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 is decremented to 1; fd=4 is left open because it is still in use by the block device (refcount is 1) 4. client re-establishes QMP connection, refcount for fdset1 is incremented to 2; 'query-fds' lets client learn about fd=4 still being open as part of fdset1 5. qemu_close is called for fd=4; refcount for fdset1 is decremented to 1; fd=4 remains open as part of fdset1 6. QMP disconnects; refcount for fdset is decremented to zero; fd=4 is closed and fdset1 is freed 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, so qemu_open() increments the refcount of fdset1 to 1 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no longer in-use by the monitor, and is left open because it is in use by the block device (refcount is 1) 4. client crashes, so all tracked fds are visited; fd=4 is not in-use but refcount is 1 so it is not closed 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, but the command fails for some other reason, so the refcount is still 0 at the end of the command (although it may have been temporarily incremented then decremented during the command) 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or QMP connection is closed), so qemu marks fd=4 as no longer in-use by the monitor; refcount is 0 so all fds in fdset1 are closed 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with refcount still 0; fd=5's in-use flag is turned on 3. client crashes, so all tracked fds are visited; fd=4 and fd=5 had not yet been passed to 'remove-fd', so their in-use flags are on; in-use flags are turned off; refcount of fdset1 is 0 so qemu closes all fds in fdset1 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with refcount still 0; fd=5's in-use flag is turned on 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no longer in-use by the monitor, and fd=4 is closed since the refcount is 0; fd=5 remains open 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1 with refcount still 0; fd=5's in-use flag is turned on 3. client calls 'device-add' with /dev/fdset/1 as the backing filename and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1 4. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no longer in-use by the monitor, but fd=4 remains open because refcount of fdset1 is 1 5. client calls 'remove-fd fdset=1 fd=5', so qemu marks fd=5 as no longer in-use by the monitor, and fd=5 remains open because refcount of fdset1 is 1 6. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both fds are closed since refcount is 0 and their in-use flags are off 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1 with refcount still 0; fd=5's in-use flag is turned on 3. client calls 'device-add' with /dev/fdset/1 as the backing filename and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1 4. client crashes, so all tracked fds are visited; fd=4 and fd=5 have not yet been passed to 'remove-fd', so their in-use flags are on; in-use flags are turned off and both fds are left open because the set is still in use by the block device (refcount is 1) 5. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both fds are closed since refcount is 0 and their in-use flags are off -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-06 17:14 ` Corey Bryant @ 2012-07-06 17:15 ` Corey Bryant 0 siblings, 0 replies; 56+ messages in thread From: Corey Bryant @ 2012-07-06 17:15 UTC (permalink / raw) To: Kevin Wolf Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake Ugh... please disregard this. I hit send accidentally. On 07/06/2012 01:14 PM, Corey Bryant wrote: > > > On 07/06/2012 05:11 AM, Kevin Wolf wrote: >> Am 05.07.2012 19:00, schrieb Eric Blake: >>> On 07/05/2012 10:35 AM, Corey Bryant wrote: >>>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with >>>> refcount of 0; fd=4's in-use flag is turned on >>>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >>>> so qemu_open() increments the refcount of fdset1 to 1 >>>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no >>>> longer in-use by the monitor, and is left open because it is in use by >>>> the block device (refcount is 1) >>>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use >>>> but refcount is 1 so it is not closed >>> 5. client re-establishes QMP connection, so all tracked fds are visited. >>> What happens to the fd=4 in-use flag? >>> >>> ...but what are the semantics here? >>> >>> If we _always_ turn the in-use flag back on, then that says that even >>> though libvirt successfully asked to close the fd, the reconnection >>> means that libvirt once again has to ask to close things. >>> >>> If we _never_ turn the in-use flag back on, then we've broken the first >>> case above where we want an in-use fd to come back into use after a >>> crash. >>> >>> Maybe that argues for two flags per fd: an in-use flag (there is >>> currently a monitor connection that can manipulate the fd, either >>> because it passed in the fd or because it reconnected) and a removed >>> flag (a monitor called remove-fd, and no longer wants to know about the >>> fd, even if it crashes and reconnects). >> >> I was in fact just going to suggest a removed flag as well, however >> combined with including the monitor connections in the refcount instead >> of an additional flag. This would also allow to have (the currently >> mostly hypothetical case of) multiple QMP monitors without interesting >> things happening. >> >> Maybe I'm missing some point that the inuse flag would allow and >> including monitors in the refcount doesn't. Is there one? >> >> Kevin >> > > I think we need the granularity of having an in-use flag per fd. Here's > an example of why: > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 1; fd=4's remove flag is initialized to off > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename; > qemu_open() increments the refcount of fdset1 to 2 > 3. client crashes, so all fdsets are visited; fd=4 had not yet been > passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 > is decremented to 1; fd=4 is left open because it is still in use by the > block device (refcount is 1) > 4. client re-establishes QMP connection, refcount for fdset1 is > incremented to 2; 'query-fds' lets client learn about fd=4 still being > open as part of fdset1 > 5. qemu_close is called for fd=4; refcount for fdset1 is decremented to > 1; fd=4 remains open as part of fdset1 > 6. QMP disconnects; refcount for fdset is decremented to zero; fd=4 is > closed and fdset1 is freed > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 1 (because of monitor reference); fd=4's remove flag is > initialized to off > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > but the command fails for some other reason, so the refcount is still 1 > at the end of the command (although it may have been temporarily > incremented then decremented during the command) > 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or > QMP connection is closed), so qemu turns on the remove flag for fd=4; > refcount is 0 so all fds in fdset1 are closed > > > I think we need the granularity of having an in-use flag per fd. If we > track monitor in-use in the reference count, then we are tracking it for > the fdset and I think this could cause leaks. > > In the following example, we have a refcount for the fdset, an in-use > flag per fd, and a remove flag per fd. We're only > incrementing/decrementing refcount in qemu_open/qemu_close. > > > > > > In the following example, we have a refcount for the fdset, and a remove > flag per fd. We're incrementing refcount when the fdset is first > created, when QMP re-connects, and in qemu_open. We're decrementing > refcount when QMP disconnects, and in qemu_close. > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 1; fd=4's remove flag is initialized to off > 2. client crashes, so all tracked fdsets are visited; fd=4 has > not yet been passed to 'remove-fd', so its remove flag is off; in-use > flags are turned off and both fds are left open because the set is still > in use by the block device (refcount is 1) > > > > > > > > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 1; fd=4's remove flag is initialized to off > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename; > qemu_open() increments the refcount of fdset1 to 2 > 3. client crashes, so all fdsets are visited; fd=4 had not yet been > passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 > is decremented to 1; fd=4 is left open because it is still in use by the > block device (refcount is 1) > 4. client re-establishes QMP connection, refcount for fdset1 is > incremented to 2; 'query-fds' lets client learn about fd=4 still being > open as part of fdset1 > 5. qemu_close is called for fd=4; refcount for fdset1 is decremented to > 1; fd=4 remains open as part of fdset1 > 6. QMP disconnects; refcount for fdset is decremented to zero; fd=4 is > closed and fdset1 is freed > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > so qemu_open() increments the refcount of fdset1 to 1 > 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no > longer in-use by the monitor, and is left open because it is in use by > the block device (refcount is 1) > 4. client crashes, so all tracked fds are visited; fd=4 is not in-use > but refcount is 1 so it is not closed > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, > but the command fails for some other reason, so the refcount is still 0 > at the end of the command (although it may have been temporarily > incremented then decremented during the command) > 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or > QMP connection is closed), so qemu marks fd=4 as no longer in-use by the > monitor; refcount is 0 so all fds in fdset1 are closed > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with > refcount still 0; fd=5's in-use flag is turned on > 3. client crashes, so all tracked fds are visited; fd=4 and fd=5 had not > yet been passed to 'remove-fd', so their in-use flags are on; in-use > flags are turned off; refcount of fdset1 is 0 so qemu closes all fds in > fdset1 > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with > refcount still 0; fd=5's in-use flag is turned on > 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no > longer in-use by the monitor, and fd=4 is closed since the refcount is > 0; fd=5 remains open > > 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1 > with refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1 > with refcount still 0; fd=5's in-use flag is turned on > 3. client calls 'device-add' with /dev/fdset/1 as the backing filename > and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1 > 4. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no > longer in-use by the monitor, but fd=4 remains open because refcount of > fdset1 is 1 > 5. client calls 'remove-fd fdset=1 fd=5', so qemu marks fd=5 as no > longer in-use by the monitor, and fd=5 remains open because refcount of > fdset1 is 1 > 6. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both > fds are closed since refcount is 0 and their in-use flags are off > > 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1 > with refcount of 0; fd=4's in-use flag is turned on > 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1 > with refcount still 0; fd=5's in-use flag is turned on > 3. client calls 'device-add' with /dev/fdset/1 as the backing filename > and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1 > 4. client crashes, so all tracked fds are visited; fd=4 and fd=5 have > not yet been passed to 'remove-fd', so their in-use flags are on; in-use > flags are turned off and both fds are left open because the set is still > in use by the block device (refcount is 1) > 5. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both > fds are closed since refcount is 0 and their in-use flags are off > > -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-06 9:11 ` Kevin Wolf 2012-07-06 17:14 ` Corey Bryant @ 2012-07-06 17:40 ` Corey Bryant 2012-07-06 18:19 ` [Qemu-devel] [libvirt] " Corey Bryant 2012-07-09 14:04 ` [Qemu-devel] " Kevin Wolf 1 sibling, 2 replies; 56+ messages in thread From: Corey Bryant @ 2012-07-06 17:40 UTC (permalink / raw) To: Kevin Wolf Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake On 07/06/2012 05:11 AM, Kevin Wolf wrote: > Am 05.07.2012 19:00, schrieb Eric Blake: >> On 07/05/2012 10:35 AM, Corey Bryant wrote: >>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with >>> refcount of 0; fd=4's in-use flag is turned on >>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >>> so qemu_open() increments the refcount of fdset1 to 1 >>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no >>> longer in-use by the monitor, and is left open because it is in use by >>> the block device (refcount is 1) >>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use >>> but refcount is 1 so it is not closed >> 5. client re-establishes QMP connection, so all tracked fds are visited. >> What happens to the fd=4 in-use flag? >> >> ...but what are the semantics here? >> >> If we _always_ turn the in-use flag back on, then that says that even >> though libvirt successfully asked to close the fd, the reconnection >> means that libvirt once again has to ask to close things. >> >> If we _never_ turn the in-use flag back on, then we've broken the first >> case above where we want an in-use fd to come back into use after a crash. >> >> Maybe that argues for two flags per fd: an in-use flag (there is >> currently a monitor connection that can manipulate the fd, either >> because it passed in the fd or because it reconnected) and a removed >> flag (a monitor called remove-fd, and no longer wants to know about the >> fd, even if it crashes and reconnects). > > I was in fact just going to suggest a removed flag as well, however > combined with including the monitor connections in the refcount instead > of an additional flag. This would also allow to have (the currently > mostly hypothetical case of) multiple QMP monitors without interesting > things happening. > > Maybe I'm missing some point that the inuse flag would allow and > including monitors in the refcount doesn't. Is there one? > > Kevin > Ok let me try this again. I was going through some of the examples and I think we need a separate in-use flag. Otherwise, when refcount gets to 1, we don't know if it is because of a monitor reference or a block device reference. I think it would cause fds to sit on the monitor until refcount gets to zero (monitor disconnects). Here's an example without the in-use flag: 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 1 (incremented because of monitor reference); fd=4's remove flag is initialized to off 2. client calls 'device-add' with /dev/fdset/1 as the backing filename; qemu_open() increments the refcount of fdset1 to 2 3. client crashes, so all fdsets are visited; fd=4 had not yet been passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 is decremented to 1; fd=4 is left open because it is still in use by the block device (refcount is 1) 4. client re-establishes QMP connection, refcount for fdset1 is incremented to 2; 'query-fds' lets client learn about fd=4 still being open as part of fdset1 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for fd=4; but fd=4 remains open because refcount of fdset1 is 2 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to 1; fd=4 remains open because monitor still references fdset1 (refcount of fdset1 is 1) 7. Sometime later.. QMP disconnects; refcount for fdset is decremented to zero; fd=4 is closed In the following case, we have an in-use and remove flag per fd and we only increment/decrement refcount on qemu_open/qemu_close: 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's remove flag is initialized to off and in-use flag is initialized to on 2. client calls 'device-add' with /dev/fdset/1 as the backing filename; qemu_open() increments the refcount of fdset1 to 1 3. client crashes, so all fdsets are visited; fd=4 had not yet been passed to 'remove-fd', so it's remove flag is off; fd=4's in-use flag is turned off; fd=4 is left open because it is still in-use by the block device (refcount is still 1) 4. client re-establishes QMP connection, refcount for fdset1 is still 1; fd=4's in-use flag is turned on; 'query-fds' lets client learn about fd=4 still being open as part of fdset1 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for fd=4; but fd=4 remains open because refcount of fdset1 is 1 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to 0; fd=4 is closed because (refcount == 0 && (!inuse || removed)) is true 7. Sometime later.. QMP disconnects -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-06 17:40 ` Corey Bryant @ 2012-07-06 18:19 ` Corey Bryant 2012-07-09 14:04 ` [Qemu-devel] " Kevin Wolf 1 sibling, 0 replies; 56+ messages in thread From: Corey Bryant @ 2012-07-06 18:19 UTC (permalink / raw) To: Kevin Wolf; +Cc: libvir-list, pbonzini, aliguori, stefanha, qemu-devel On 07/06/2012 01:40 PM, Corey Bryant wrote: > > > On 07/06/2012 05:11 AM, Kevin Wolf wrote: >> Am 05.07.2012 19:00, schrieb Eric Blake: >>> On 07/05/2012 10:35 AM, Corey Bryant wrote: >>>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with >>>> refcount of 0; fd=4's in-use flag is turned on >>>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >>>> so qemu_open() increments the refcount of fdset1 to 1 >>>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no >>>> longer in-use by the monitor, and is left open because it is in use by >>>> the block device (refcount is 1) >>>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use >>>> but refcount is 1 so it is not closed >>> 5. client re-establishes QMP connection, so all tracked fds are visited. >>> What happens to the fd=4 in-use flag? >>> >>> ...but what are the semantics here? >>> >>> If we _always_ turn the in-use flag back on, then that says that even >>> though libvirt successfully asked to close the fd, the reconnection >>> means that libvirt once again has to ask to close things. >>> >>> If we _never_ turn the in-use flag back on, then we've broken the first >>> case above where we want an in-use fd to come back into use after a >>> crash. >>> >>> Maybe that argues for two flags per fd: an in-use flag (there is >>> currently a monitor connection that can manipulate the fd, either >>> because it passed in the fd or because it reconnected) and a removed >>> flag (a monitor called remove-fd, and no longer wants to know about the >>> fd, even if it crashes and reconnects). >> >> I was in fact just going to suggest a removed flag as well, however >> combined with including the monitor connections in the refcount instead >> of an additional flag. This would also allow to have (the currently >> mostly hypothetical case of) multiple QMP monitors without interesting >> things happening. >> >> Maybe I'm missing some point that the inuse flag would allow and >> including monitors in the refcount doesn't. Is there one? >> >> Kevin >> > > Ok let me try this again. I was going through some of the examples and I > think we need a separate in-use flag. Otherwise, when refcount gets to > 1, we don't know if it is because of a monitor reference or a block > device reference. I think it would cause fds to sit on the monitor > until refcount gets to zero (monitor disconnects). Here's an example > without the in-use flag: > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 1 (incremented because of monitor reference); fd=4's remove > flag is initialized to off > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename; > qemu_open() increments the refcount of fdset1 to 2 > 3. client crashes, so all fdsets are visited; fd=4 had not yet been > passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 > is decremented to 1; fd=4 is left open because it is still in use by the > block device (refcount is 1) > 4. client re-establishes QMP connection, refcount for fdset1 is > incremented to 2; 'query-fds' lets client learn about fd=4 still being > open as part of fdset1 > 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for > fd=4; but fd=4 remains open because refcount of fdset1 is 2 > 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to > 1; fd=4 remains open because monitor still references fdset1 (refcount > of fdset1 is 1) > 7. Sometime later.. QMP disconnects; refcount for fdset is decremented > to zero; fd=4 is closed > > In the following case, we have an in-use and remove flag per fd and we > only increment/decrement refcount on qemu_open/qemu_close: > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 0; fd=4's remove flag is initialized to off and in-use flag > is initialized to on > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename; > qemu_open() increments the refcount of fdset1 to 1 > 3. client crashes, so all fdsets are visited; fd=4 had not yet been > passed to 'remove-fd', so it's remove flag is off; fd=4's in-use flag is > turned off; fd=4 is left open because it is still in-use by the block > device (refcount is still 1) > 4. client re-establishes QMP connection, refcount for fdset1 is still 1; > fd=4's in-use flag is turned on; 'query-fds' lets client learn about > fd=4 still being open as part of fdset1 > 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for > fd=4; but fd=4 remains open because refcount of fdset1 is 1 > 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to > 0; fd=4 is closed because (refcount == 0 && (!inuse || removed)) is true > 7. Sometime later.. QMP disconnects > I have one modification to this. It looks the inuse flag could be at the fdset level, rather than an inuse flag per fd. Although.. you did mention multiple monitors, so perhaps inuse should be a counter? -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-06 17:40 ` Corey Bryant 2012-07-06 18:19 ` [Qemu-devel] [libvirt] " Corey Bryant @ 2012-07-09 14:04 ` Kevin Wolf 2012-07-09 15:23 ` Corey Bryant 1 sibling, 1 reply; 56+ messages in thread From: Kevin Wolf @ 2012-07-09 14:04 UTC (permalink / raw) To: Corey Bryant Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake Am 06.07.2012 19:40, schrieb Corey Bryant: > > > On 07/06/2012 05:11 AM, Kevin Wolf wrote: >> Am 05.07.2012 19:00, schrieb Eric Blake: >>> On 07/05/2012 10:35 AM, Corey Bryant wrote: >>>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with >>>> refcount of 0; fd=4's in-use flag is turned on >>>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >>>> so qemu_open() increments the refcount of fdset1 to 1 >>>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no >>>> longer in-use by the monitor, and is left open because it is in use by >>>> the block device (refcount is 1) >>>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use >>>> but refcount is 1 so it is not closed >>> 5. client re-establishes QMP connection, so all tracked fds are visited. >>> What happens to the fd=4 in-use flag? >>> >>> ...but what are the semantics here? >>> >>> If we _always_ turn the in-use flag back on, then that says that even >>> though libvirt successfully asked to close the fd, the reconnection >>> means that libvirt once again has to ask to close things. >>> >>> If we _never_ turn the in-use flag back on, then we've broken the first >>> case above where we want an in-use fd to come back into use after a crash. >>> >>> Maybe that argues for two flags per fd: an in-use flag (there is >>> currently a monitor connection that can manipulate the fd, either >>> because it passed in the fd or because it reconnected) and a removed >>> flag (a monitor called remove-fd, and no longer wants to know about the >>> fd, even if it crashes and reconnects). >> >> I was in fact just going to suggest a removed flag as well, however >> combined with including the monitor connections in the refcount instead >> of an additional flag. This would also allow to have (the currently >> mostly hypothetical case of) multiple QMP monitors without interesting >> things happening. >> >> Maybe I'm missing some point that the inuse flag would allow and >> including monitors in the refcount doesn't. Is there one? >> >> Kevin >> > > Ok let me try this again. I was going through some of the examples and I > think we need a separate in-use flag. Otherwise, when refcount gets to > 1, we don't know if it is because of a monitor reference or a block > device reference. Does it matter? > I think it would cause fds to sit on the monitor > until refcount gets to zero (monitor disconnects). Here's an example > without the in-use flag: > > 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with > refcount of 1 (incremented because of monitor reference); fd=4's remove > flag is initialized to off > 2. client calls 'device-add' with /dev/fdset/1 as the backing filename; > qemu_open() increments the refcount of fdset1 to 2 > 3. client crashes, so all fdsets are visited; fd=4 had not yet been > passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 > is decremented to 1; fd=4 is left open because it is still in use by the > block device (refcount is 1) > 4. client re-establishes QMP connection, refcount for fdset1 is > incremented to 2; 'query-fds' lets client learn about fd=4 still being > open as part of fdset1 > 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for > fd=4; but fd=4 remains open because refcount of fdset1 is 2 It also decreases the reference count because the monitor doesn't use it any more. > 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to > 1; fd=4 remains open because monitor still references fdset1 (refcount > of fdset1 is 1) So here the refcount becomes 0 and the fdset is closed. > 7. Sometime later.. QMP disconnects; refcount for fdset is decremented > to zero; fd=4 is closed The only question that is a bit unclear to me is whether a remove-fd on one monitor drops the refcount only for this monitor or for all monitors. However, both options can be implemented without additional flags or counters. Kevin ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 14:04 ` [Qemu-devel] " Kevin Wolf @ 2012-07-09 15:23 ` Corey Bryant 2012-07-09 15:30 ` Kevin Wolf 0 siblings, 1 reply; 56+ messages in thread From: Corey Bryant @ 2012-07-09 15:23 UTC (permalink / raw) To: Kevin Wolf Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake On 07/09/2012 10:04 AM, Kevin Wolf wrote: > Am 06.07.2012 19:40, schrieb Corey Bryant: >> >> >> On 07/06/2012 05:11 AM, Kevin Wolf wrote: >>> Am 05.07.2012 19:00, schrieb Eric Blake: >>>> On 07/05/2012 10:35 AM, Corey Bryant wrote: >>>>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with >>>>> refcount of 0; fd=4's in-use flag is turned on >>>>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename, >>>>> so qemu_open() increments the refcount of fdset1 to 1 >>>>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no >>>>> longer in-use by the monitor, and is left open because it is in use by >>>>> the block device (refcount is 1) >>>>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use >>>>> but refcount is 1 so it is not closed >>>> 5. client re-establishes QMP connection, so all tracked fds are visited. >>>> What happens to the fd=4 in-use flag? >>>> >>>> ...but what are the semantics here? >>>> >>>> If we _always_ turn the in-use flag back on, then that says that even >>>> though libvirt successfully asked to close the fd, the reconnection >>>> means that libvirt once again has to ask to close things. >>>> >>>> If we _never_ turn the in-use flag back on, then we've broken the first >>>> case above where we want an in-use fd to come back into use after a crash. >>>> >>>> Maybe that argues for two flags per fd: an in-use flag (there is >>>> currently a monitor connection that can manipulate the fd, either >>>> because it passed in the fd or because it reconnected) and a removed >>>> flag (a monitor called remove-fd, and no longer wants to know about the >>>> fd, even if it crashes and reconnects). >>> >>> I was in fact just going to suggest a removed flag as well, however >>> combined with including the monitor connections in the refcount instead >>> of an additional flag. This would also allow to have (the currently >>> mostly hypothetical case of) multiple QMP monitors without interesting >>> things happening. >>> >>> Maybe I'm missing some point that the inuse flag would allow and >>> including monitors in the refcount doesn't. Is there one? >>> >>> Kevin >>> >> >> Ok let me try this again. I was going through some of the examples and I >> think we need a separate in-use flag. Otherwise, when refcount gets to >> 1, we don't know if it is because of a monitor reference or a block >> device reference. > > Does it matter? >> I think it would cause fds to sit on the monitor >> until refcount gets to zero (monitor disconnects). Here's an example >> without the in-use flag: >> >> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with >> refcount of 1 (incremented because of monitor reference); fd=4's remove >> flag is initialized to off >> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename; >> qemu_open() increments the refcount of fdset1 to 2 >> 3. client crashes, so all fdsets are visited; fd=4 had not yet been >> passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 >> is decremented to 1; fd=4 is left open because it is still in use by the >> block device (refcount is 1) >> 4. client re-establishes QMP connection, refcount for fdset1 is >> incremented to 2; 'query-fds' lets client learn about fd=4 still being >> open as part of fdset1 >> 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for >> fd=4; but fd=4 remains open because refcount of fdset1 is 2 > > It also decreases the reference count because the monitor doesn't use it > any more. I don't think that will work because refcount is for the entire fdset. So we can't decrement the refcount for every fd that is removed from the fdset. I think it is much simpler if we only increment refcount for an fdset on qemu_open, and only decrement refcount on qemu_close. > >> 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to >> 1; fd=4 remains open because monitor still references fdset1 (refcount >> of fdset1 is 1) > > So here the refcount becomes 0 and the fdset is closed. > >> 7. Sometime later.. QMP disconnects; refcount for fdset is decremented >> to zero; fd=4 is closed > > The only question that is a bit unclear to me is whether a remove-fd on > one monitor drops the refcount only for this monitor or for all > monitors. However, both options can be implemented without additional > flags or counters. Before we go back and forth on this thread, would you mind taking a look at the last email I sent to Luiz? It includes all the design points that I'm currently working from. I think it's a good level set and we can work off that thread if there are still any issues. -- Regards, Corey ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 15:23 ` Corey Bryant @ 2012-07-09 15:30 ` Kevin Wolf 0 siblings, 0 replies; 56+ messages in thread From: Kevin Wolf @ 2012-07-09 15:30 UTC (permalink / raw) To: Corey Bryant Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini, Eric Blake Am 09.07.2012 17:23, schrieb Corey Bryant: >>> I think it would cause fds to sit on the monitor >>> until refcount gets to zero (monitor disconnects). Here's an example >>> without the in-use flag: >>> >>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with >>> refcount of 1 (incremented because of monitor reference); fd=4's remove >>> flag is initialized to off >>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename; >>> qemu_open() increments the refcount of fdset1 to 2 >>> 3. client crashes, so all fdsets are visited; fd=4 had not yet been >>> passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 >>> is decremented to 1; fd=4 is left open because it is still in use by the >>> block device (refcount is 1) >>> 4. client re-establishes QMP connection, refcount for fdset1 is >>> incremented to 2; 'query-fds' lets client learn about fd=4 still being >>> open as part of fdset1 >>> 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for >>> fd=4; but fd=4 remains open because refcount of fdset1 is 2 >> >> It also decreases the reference count because the monitor doesn't use it >> any more. > > I don't think that will work because refcount is for the entire fdset. > So we can't decrement the refcount for every fd that is removed from the > fdset. > > I think it is much simpler if we only increment refcount for an fdset on > qemu_open, and only decrement refcount on qemu_close. Ah right... So this would only work if we had explicit fdset-create/close commands, where the former would increase the refcount and the latter decrease it (fdset-open would be optional but I like symmetry) Maybe we need (or want) that anyway, but I need to think more about it first. >>> 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to >>> 1; fd=4 remains open because monitor still references fdset1 (refcount >>> of fdset1 is 1) >> >> So here the refcount becomes 0 and the fdset is closed. >> > >>> 7. Sometime later.. QMP disconnects; refcount for fdset is decremented >>> to zero; fd=4 is closed >> >> The only question that is a bit unclear to me is whether a remove-fd on >> one monitor drops the refcount only for this monitor or for all >> monitors. However, both options can be implemented without additional >> flags or counters. > > Before we go back and forth on this thread, would you mind taking a look > at the last email I sent to Luiz? It includes all the design points > that I'm currently working from. I think it's a good level set and we > can work off that thread if there are still any issues. Ok, I'll have a look. Kevin ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd [not found] ` <20120626091004.GA14451@redhat.com> [not found] ` <4FE9A0F0.2050809@redhat.com> @ 2012-07-09 18:40 ` Anthony Liguori 2012-07-09 19:00 ` Luiz Capitulino 2012-07-10 7:58 ` Kevin Wolf 1 sibling, 2 replies; 56+ messages in thread From: Anthony Liguori @ 2012-07-09 18:40 UTC (permalink / raw) To: Daniel P. Berrange Cc: kwolf, stefanha, libvir-list, Corey Bryant, qemu-devel, lcapitulino, pbonzini, eblake On 06/26/2012 04:10 AM, Daniel P. Berrange wrote: > On Fri, Jun 22, 2012 at 02:36:07PM -0400, Corey Bryant wrote: >> libvirt's sVirt security driver provides SELinux MAC isolation for >> Qemu guest processes and their corresponding image files. In other >> words, sVirt uses SELinux to prevent a QEMU process from opening >> files that do not belong to it. >> >> sVirt provides this support by labeling guests and resources with >> security labels that are stored in file system extended attributes. >> Some file systems, such as NFS, do not support the extended >> attribute security namespace, and therefore cannot support sVirt >> isolation. >> >> A solution to this problem is to provide fd passing support, where >> libvirt opens files and passes file descriptors to QEMU. This, >> along with SELinux policy to prevent QEMU from opening files, can >> provide image file isolation for NFS files stored on the same NFS >> mount. >> >> This patch series adds the pass-fd QMP monitor command, which allows >> an fd to be passed via SCM_RIGHTS, and returns the received file >> descriptor. Support is also added to the block layer to allow QEMU >> to dup the fd when the filename is of the /dev/fd/X format. This >> is useful if MAC policy prevents QEMU from opening specific types >> of files. > > I was thinking about some of the sources complexity when using > FD passing from libvirt and wanted to raise one idea for discussion > before we continue. > > With this proposed series, we have usage akin to: > > 1. pass_fd FDSET={M} -> returns a string "/dev/fd/N" showing QEMU's > view of the FD > 2. drive_add file=/dev/fd/N > 3. if failure: > close_fd "/dev/fd/N" > > My problem is that none of this FD passing is "transactional". My original patch series did not suffer from this problem. QEMU owned the file descriptor once it received it from libvirt. I don't think the cited problem (QEMU failing an operation if libvirt was down) is really an actual problem since it would be libvirt that would be issuing the command in the first place (so the command would just fail which libvirt would have to assume anyway if it crashed). I really dislike where this thread has headed with /dev/fdset. This has become extremely complex and cumbersome. Perhaps we should reconsider using an RPC for QEMU to request an fd as this solves all the cited problems in a much simpler fashion. Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 18:40 ` Anthony Liguori @ 2012-07-09 19:00 ` Luiz Capitulino 2012-07-10 8:54 ` Daniel P. Berrange 2012-07-10 7:58 ` Kevin Wolf 1 sibling, 1 reply; 56+ messages in thread From: Luiz Capitulino @ 2012-07-09 19:00 UTC (permalink / raw) To: Anthony Liguori Cc: kwolf, stefanha, libvir-list, Corey Bryant, qemu-devel, pbonzini, eblake On Mon, 09 Jul 2012 13:40:34 -0500 Anthony Liguori <aliguori@us.ibm.com> wrote: > On 06/26/2012 04:10 AM, Daniel P. Berrange wrote: > > On Fri, Jun 22, 2012 at 02:36:07PM -0400, Corey Bryant wrote: > >> libvirt's sVirt security driver provides SELinux MAC isolation for > >> Qemu guest processes and their corresponding image files. In other > >> words, sVirt uses SELinux to prevent a QEMU process from opening > >> files that do not belong to it. > >> > >> sVirt provides this support by labeling guests and resources with > >> security labels that are stored in file system extended attributes. > >> Some file systems, such as NFS, do not support the extended > >> attribute security namespace, and therefore cannot support sVirt > >> isolation. > >> > >> A solution to this problem is to provide fd passing support, where > >> libvirt opens files and passes file descriptors to QEMU. This, > >> along with SELinux policy to prevent QEMU from opening files, can > >> provide image file isolation for NFS files stored on the same NFS > >> mount. > >> > >> This patch series adds the pass-fd QMP monitor command, which allows > >> an fd to be passed via SCM_RIGHTS, and returns the received file > >> descriptor. Support is also added to the block layer to allow QEMU > >> to dup the fd when the filename is of the /dev/fd/X format. This > >> is useful if MAC policy prevents QEMU from opening specific types > >> of files. > > > > I was thinking about some of the sources complexity when using > > FD passing from libvirt and wanted to raise one idea for discussion > > before we continue. > > > > With this proposed series, we have usage akin to: > > > > 1. pass_fd FDSET={M} -> returns a string "/dev/fd/N" showing QEMU's > > view of the FD > > 2. drive_add file=/dev/fd/N > > 3. if failure: > > close_fd "/dev/fd/N" > > > > My problem is that none of this FD passing is "transactional". > > My original patch series did not suffer from this problem. > > QEMU owned the file descriptor once it received it from libvirt. > > I don't think the cited problem (QEMU failing an operation if libvirt was down) > is really an actual problem since it would be libvirt that would be issuing the > command in the first place (so the command would just fail which libvirt would > have to assume anyway if it crashed). > > I really dislike where this thread has headed with /dev/fdset. This has become > extremely complex and cumbersome. I agree, maybe it's time to start over and discuss the original problem again. > > Perhaps we should reconsider using an RPC for QEMU to request an fd as this > solves all the cited problems in a much simpler fashion. > > Regards, > > Anthony Liguori > ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 19:00 ` Luiz Capitulino @ 2012-07-10 8:54 ` Daniel P. Berrange 0 siblings, 0 replies; 56+ messages in thread From: Daniel P. Berrange @ 2012-07-10 8:54 UTC (permalink / raw) To: Luiz Capitulino Cc: kwolf, Anthony Liguori, stefanha, libvir-list, Corey Bryant, qemu-devel, pbonzini, eblake On Mon, Jul 09, 2012 at 04:00:37PM -0300, Luiz Capitulino wrote: > On Mon, 09 Jul 2012 13:40:34 -0500 > Anthony Liguori <aliguori@us.ibm.com> wrote: > > > On 06/26/2012 04:10 AM, Daniel P. Berrange wrote: > > > On Fri, Jun 22, 2012 at 02:36:07PM -0400, Corey Bryant wrote: > > >> libvirt's sVirt security driver provides SELinux MAC isolation for > > >> Qemu guest processes and their corresponding image files. In other > > >> words, sVirt uses SELinux to prevent a QEMU process from opening > > >> files that do not belong to it. > > >> > > >> sVirt provides this support by labeling guests and resources with > > >> security labels that are stored in file system extended attributes. > > >> Some file systems, such as NFS, do not support the extended > > >> attribute security namespace, and therefore cannot support sVirt > > >> isolation. > > >> > > >> A solution to this problem is to provide fd passing support, where > > >> libvirt opens files and passes file descriptors to QEMU. This, > > >> along with SELinux policy to prevent QEMU from opening files, can > > >> provide image file isolation for NFS files stored on the same NFS > > >> mount. > > >> > > >> This patch series adds the pass-fd QMP monitor command, which allows > > >> an fd to be passed via SCM_RIGHTS, and returns the received file > > >> descriptor. Support is also added to the block layer to allow QEMU > > >> to dup the fd when the filename is of the /dev/fd/X format. This > > >> is useful if MAC policy prevents QEMU from opening specific types > > >> of files. > > > > > > I was thinking about some of the sources complexity when using > > > FD passing from libvirt and wanted to raise one idea for discussion > > > before we continue. > > > > > > With this proposed series, we have usage akin to: > > > > > > 1. pass_fd FDSET={M} -> returns a string "/dev/fd/N" showing QEMU's > > > view of the FD > > > 2. drive_add file=/dev/fd/N > > > 3. if failure: > > > close_fd "/dev/fd/N" > > > > > > My problem is that none of this FD passing is "transactional". > > > > My original patch series did not suffer from this problem. > > > > QEMU owned the file descriptor once it received it from libvirt. > > > > I don't think the cited problem (QEMU failing an operation if libvirt was down) > > is really an actual problem since it would be libvirt that would be issuing the > > command in the first place (so the command would just fail which libvirt would > > have to assume anyway if it crashed). > > > > I really dislike where this thread has headed with /dev/fdset. This has become > > extremely complex and cumbersome. > > I agree, maybe it's time to start over and discuss the original problem again. I must say, I'm not entirely sure of all the problems we're trying to solve anymore. I don't think we've ever clearly stated in this thread what all the requirements/problems are, so I'm finding it hard to see what the optimal solution is. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd 2012-07-09 18:40 ` Anthony Liguori 2012-07-09 19:00 ` Luiz Capitulino @ 2012-07-10 7:58 ` Kevin Wolf 1 sibling, 0 replies; 56+ messages in thread From: Kevin Wolf @ 2012-07-10 7:58 UTC (permalink / raw) To: Anthony Liguori Cc: stefanha, libvir-list, Corey Bryant, qemu-devel, lcapitulino, pbonzini, eblake Am 09.07.2012 20:40, schrieb Anthony Liguori: > On 06/26/2012 04:10 AM, Daniel P. Berrange wrote: >> On Fri, Jun 22, 2012 at 02:36:07PM -0400, Corey Bryant wrote: >>> libvirt's sVirt security driver provides SELinux MAC isolation for >>> Qemu guest processes and their corresponding image files. In other >>> words, sVirt uses SELinux to prevent a QEMU process from opening >>> files that do not belong to it. >>> >>> sVirt provides this support by labeling guests and resources with >>> security labels that are stored in file system extended attributes. >>> Some file systems, such as NFS, do not support the extended >>> attribute security namespace, and therefore cannot support sVirt >>> isolation. >>> >>> A solution to this problem is to provide fd passing support, where >>> libvirt opens files and passes file descriptors to QEMU. This, >>> along with SELinux policy to prevent QEMU from opening files, can >>> provide image file isolation for NFS files stored on the same NFS >>> mount. >>> >>> This patch series adds the pass-fd QMP monitor command, which allows >>> an fd to be passed via SCM_RIGHTS, and returns the received file >>> descriptor. Support is also added to the block layer to allow QEMU >>> to dup the fd when the filename is of the /dev/fd/X format. This >>> is useful if MAC policy prevents QEMU from opening specific types >>> of files. >> >> I was thinking about some of the sources complexity when using >> FD passing from libvirt and wanted to raise one idea for discussion >> before we continue. >> >> With this proposed series, we have usage akin to: >> >> 1. pass_fd FDSET={M} -> returns a string "/dev/fd/N" showing QEMU's >> view of the FD >> 2. drive_add file=/dev/fd/N >> 3. if failure: >> close_fd "/dev/fd/N" >> >> My problem is that none of this FD passing is "transactional". > > My original patch series did not suffer from this problem. > > QEMU owned the file descriptor once it received it from libvirt. > > I don't think the cited problem (QEMU failing an operation if libvirt was down) > is really an actual problem since it would be libvirt that would be issuing the > command in the first place (so the command would just fail which libvirt would > have to assume anyway if it crashed). > > I really dislike where this thread has headed with /dev/fdset. This has become > extremely complex and cumbersome. What exactly is complex about the interface we're going to provide? A long discussion about how to get the details implemented best doesn't mean at all that the result is complex. > Perhaps we should reconsider using an RPC for QEMU to request an fd as this > solves all the cited problems in a much simpler fashion. NACK. RPC is wrong and no way easier to handle for management. Kevin ^ permalink raw reply [flat|nested] 56+ messages in thread
end of thread, other threads:[~2012-07-11 18:52 UTC | newest] Thread overview: 56+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant 2012-06-22 19:31 ` Eric Blake 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd Corey Bryant 2012-07-11 18:51 ` Luiz Capitulino 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command Corey Bryant 2012-06-22 20:24 ` Eric Blake 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 4/7] qapi: Re-arrange monitor.c functions Corey Bryant 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 5/7] block: Prevent /dev/fd/X filename from being detected as floppy Corey Bryant 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 6/7] block: Convert open calls to qemu_open Corey Bryant 2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 7/7] osdep: Enable qemu_open to dup pre-opened fd Corey Bryant 2012-06-22 19:58 ` Eric Blake [not found] ` <20120626091004.GA14451@redhat.com> [not found] ` <4FE9A0F0.2050809@redhat.com> [not found] ` <20120626175045.2c7011b3@doriath.home> [not found] ` <4FEA37A9.10707@linux.vnet.ibm.com> [not found] ` <4FEA3D9C.8080205@redhat.com> 2012-07-02 22:02 ` [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant 2012-07-02 22:31 ` Eric Blake 2012-07-03 9:07 ` Daniel P. Berrange 2012-07-03 9:40 ` Kevin Wolf 2012-07-03 13:42 ` Corey Bryant 2012-07-03 15:40 ` Corey Bryant 2012-07-03 15:59 ` Kevin Wolf 2012-07-03 16:25 ` Corey Bryant 2012-07-03 17:03 ` Eric Blake 2012-07-03 17:46 ` Corey Bryant 2012-07-03 18:00 ` Eric Blake 2012-07-03 18:21 ` Corey Bryant 2012-07-04 8:09 ` Kevin Wolf 2012-07-05 15:06 ` Corey Bryant 2012-07-09 14:05 ` Luiz Capitulino 2012-07-09 15:05 ` Corey Bryant 2012-07-09 15:46 ` Kevin Wolf 2012-07-09 16:18 ` Luiz Capitulino 2012-07-09 17:59 ` Corey Bryant 2012-07-09 17:35 ` Corey Bryant 2012-07-09 17:48 ` Luiz Capitulino 2012-07-09 18:02 ` Corey Bryant 2012-07-10 7:53 ` Kevin Wolf 2012-07-09 18:20 ` Corey Bryant 2012-07-04 8:00 ` Kevin Wolf 2012-07-05 14:22 ` Corey Bryant 2012-07-05 14:51 ` Kevin Wolf 2012-07-05 16:35 ` Corey Bryant 2012-07-05 16:37 ` Corey Bryant 2012-07-06 9:06 ` Kevin Wolf 2012-07-05 17:00 ` Eric Blake 2012-07-05 17:36 ` Corey Bryant 2012-07-06 9:11 ` Kevin Wolf 2012-07-06 17:14 ` Corey Bryant 2012-07-06 17:15 ` Corey Bryant 2012-07-06 17:40 ` Corey Bryant 2012-07-06 18:19 ` [Qemu-devel] [libvirt] " Corey Bryant 2012-07-09 14:04 ` [Qemu-devel] " Kevin Wolf 2012-07-09 15:23 ` Corey Bryant 2012-07-09 15:30 ` Kevin Wolf 2012-07-09 18:40 ` Anthony Liguori 2012-07-09 19:00 ` Luiz Capitulino 2012-07-10 8:54 ` Daniel P. Berrange 2012-07-10 7:58 ` Kevin Wolf
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).