qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH v5 00/31] vhost-user reconnect fixes
@ 2016-07-21  8:57 marcandre.lureau
  2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 01/31] misc: indentation marcandre.lureau
                   ` (30 more replies)
  0 siblings, 31 replies; 39+ messages in thread
From: marcandre.lureau @ 2016-07-21  8:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: mukawa, yuanhan.liu, victork, jonshin, mst,
	Marc-André Lureau

From: Marc-André Lureau <marcandre.lureau@redhat.com>

Hi,

Since 'vhost-user: simple reconnection support' has been merged, it is
possible to disconnect and reconnect a vhost-user backend. However,
many code paths in qemu may trigger assert() when the backend is
disconnected.

There are also code paths that are wrong, see "don't assume opaque is
a fd" patch for an example. Once those patches are reviewed & merged,
they are good candidates for stable too.

Some assert() could simply be replaced by error_report() or silently
fail since they are recoverable cases. Some missing error checks can
also help prevent later issues. The errors are reported up to vhost.c,
as the vhost-user backend alone doesn't handle disconnected state
transparently so far. There are still problematic code paths left
after this series: for example, starting a migration with a
disconnected backend will abort(). It is likely that other problematic
code paths exist (a vhost_scsi_start failure is fatal, but there is no
vhost-user backend that I know of yet).
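
To illustrate the pattern described above (this is not code from the
series), here is a minimal sketch of a caller that reports a failed
backend operation and returns the error instead of asserting. The
names struct backend, backend_set_features() and device_start() are
hypothetical stand-ins, and fprintf(stderr, ...) stands in for
error_report().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a backend; not QEMU's struct vhost_dev. */
struct backend {
    bool connected;
};

/* Hypothetical backend operation: fails with -1 while disconnected. */
static int backend_set_features(struct backend *b, unsigned long features)
{
    (void)features;
    return b->connected ? 0 : -1;
}

/*
 * An assert(r >= 0) here would abort the whole process on a
 * disconnect. Since the case is recoverable, report it and let the
 * caller decide what to do with the error.
 */
static int device_start(struct backend *b)
{
    int r = backend_set_features(b, 0x1);

    if (r < 0) {
        fprintf(stderr, "set_features failed (backend disconnected?)\n");
        return r;
    }
    return 0;
}
```

With this shape, a disconnected backend makes device_start() return a
negative value to its caller rather than terminating the program.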

In many cases, the code assumes get_vhost_net() will be non-NULL after
a successful connection, so I changed it to stay non-NULL after a
disconnect until the new connection comes (as suggested by Michael).
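
A minimal sketch of that idea, under assumed names (struct client,
struct net_state and the client_*() helpers are hypothetical, not the
QEMU API): the state object is created on the first connection and is
deliberately left in place on disconnect, so that lookups keep
returning a valid pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; stands in for a vhost_net-like object. */
struct net_state {
    int placeholder;
};

struct client {
    struct net_state *net;     /* NULL until the first connection */
    struct net_state storage;  /* backing storage for this sketch */
    bool connected;
};

static void client_connect(struct client *c)
{
    if (!c->net) {
        c->net = &c->storage;  /* created once, on first connect */
    }
    c->connected = true;
}

static void client_disconnect(struct client *c)
{
    /* Only the link state changes; c->net is intentionally kept. */
    c->connected = false;
}

static struct net_state *client_get_net(struct client *c)
{
    return c->net;
}
```

Callers that only test the returned pointer keep working across a
disconnect; code that needs a live backend has to check the connected
flag separately.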

Since there are feature checks on reconnection, qemu should wait for
the initial connection feature negotiation to complete. The test added
demonstrates this. Additionally, a regression was found during v2,
which could have been spotted with a multiqueue test, so this series
adds a basic one that would have exhibited the issue.
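
As a rough illustration of why the first negotiation has to finish
before a reconnection can be checked, here is a sketch with
hypothetical names (struct session, session_connect()). The rule used,
that a reconnecting backend must still offer every feature agreed the
first time, is an assumption made for this example and not taken from
the patches.

```c
#include <stdbool.h>
#include <stdint.h>

struct session {
    bool negotiated;    /* true once the first negotiation finished */
    uint64_t features;  /* features agreed during that negotiation */
};

/*
 * Returns 0 on success, -1 if a reconnecting backend lacks a feature
 * that was agreed during the initial negotiation.
 */
static int session_connect(struct session *s, uint64_t backend_features,
                           uint64_t wanted)
{
    if (!s->negotiated) {
        /* Initial connection: record what both sides support. */
        s->features = backend_features & wanted;
        s->negotiated = true;
        return 0;
    }
    /* Reconnection: there is now something to compare against. */
    if ((backend_features & s->features) != s->features) {
        return -1;
    }
    return 0;
}
```

Until negotiated is set there is nothing to compare a reconnecting
backend against, which is the reason for waiting in this sketch.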

For convenience, the series is also available on:
https://github.com/elmarco/qemu, branch vhost-user-reconnect

v5:
- rebased
- use a VHOST_OPS_DEBUG macro to print vhost_ops errors
- replace assert(foo != NULL) with assert(foo)
- add "RFC: vhost: do not update last avail idx"
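
For readers unfamiliar with this kind of helper, a debug macro for
failed operations could be sketched roughly as below. OPS_DEBUG and
set_owner() are hypothetical names, fprintf() stands in for whatever
reporting function the real macro uses, and the actual
VHOST_OPS_DEBUG definition in the series may well differ.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Print a formatted message followed by the errno description.
 * errno is saved first because fprintf() itself may change it.
 */
#define OPS_DEBUG(...)                                            \
    do {                                                          \
        int saved_errno_ = errno;                                 \
        fprintf(stderr, __VA_ARGS__);                             \
        fprintf(stderr, ": %s (%d)\n", strerror(saved_errno_),    \
                saved_errno_);                                    \
    } while (0)

/* Hypothetical operation that fails while the backend is away. */
static int set_owner(int backend_ok)
{
    if (!backend_ok) {
        errno = ENOTCONN;
        OPS_DEBUG("set_owner failed");
        return -ENOTCONN;
    }
    return 0;
}
```

The point is only that the failure is logged with its errno text and
then returned to the caller, rather than being swallowed or asserted
on.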

v4:
- change notify_migration_done() patch to be VHOST_BACKEND_TYPE_USER
  specific, to avoid having to handle the case where the backend
  doesn't implement the callback
- change vhost_dev_cleanup() to assert on empty log, instead of
  adding a call to vhost_log_put()
- made "keep vhost_net after a disconnection" more robust, got rid of
  the acked_features field
- improve commit log, and some patch reorganization for clarity

v3:
- add vhost-user multiqueue test, which would have helped to find the
  following fix
- fix waiting on vhost-user connection with multiqueue (found by
  Yuanhan Liu)
- merge vhost_user_{read,write}() error checking patches
- add error reporting to vhost_user_read() (similar to
  vhost_user_write())
- add a vhost_net_set_backend() wrapper to help with potential crash
- some leak fixes

v2:
- some patch ordering: minor fix, close(fd) fix,
  assert/fprintf->error_report, check and return error,
  vhost_dev_cleanup() fixes, keep vhost_net after a disconnect, wait
  until connection is ready
- merge read/write error checks
- do not rely on link state to check vhost-user init completed

Marc-André Lureau (31):
  misc: indentation
  vhost-user: minor simplification
  vhost: don't assume opaque is a fd, use backend cleanup
  vhost: make vhost_log_put() idempotent
  vhost: assert the log was cleaned up
  vhost: fix cleanup on not fully initialized device
  vhost: make vhost_dev_cleanup() idempotent
  vhost-net: always call vhost_dev_cleanup() on failure
  vhost: fix calling vhost_dev_cleanup() after vhost_dev_init()
  vhost: do not assert() on vhost_ops failure
  vhost: add missing VHOST_OPS_DEBUG
  vhost: use error_report() instead of fprintf(stderr,...)
  qemu-char: fix qemu_chr_fe_set_msgfds() crash when disconnected
  vhost-user: call set_msgfds unconditionally
  vhost-user: check qemu_chr_fe_set_msgfds() return value
  vhost-user: check vhost_user_{read,write}() return value
  vhost-user: keep vhost_net after a disconnection
  vhost-user: add get_vhost_net() assertions
  Revert "vhost-net: do not crash if backend is not present"
  vhost-net: vhost_migration_done is vhost-user specific
  vhost: add assert() to check runtime behaviour
  char: add chr_wait_connected callback
  char: add and use tcp_chr_wait_connected
  vhost-user: wait until backend init is completed
  tests: plug some leaks in virtio-net-test
  tests: fix vhost-user-test leak
  tests: add /vhost-user/connect-fail test
  tests: add a simple /vhost-user/multiqueue test
  vhost-user: add error report in vhost_user_write()
  vhost: add vhost_net_set_backend()
  RFC: vhost: do not update last avail idx on get_vring_base() failure

 hw/net/vhost_net.c        |  34 ++++------
 hw/virtio/vhost-user.c    |  67 ++++++++++++++------
 hw/virtio/vhost.c         | 158 ++++++++++++++++++++++++++++++----------------
 include/hw/virtio/vhost.h |   4 ++
 include/sysemu/char.h     |   8 +++
 net/tap.c                 |   1 +
 net/vhost-user.c          |  59 ++++++++++-------
 qemu-char.c               |  82 +++++++++++++++++-------
 tests/Makefile.include    |   2 +-
 tests/vhost-user-test.c   | 147 +++++++++++++++++++++++++++++++++++++++++-
 tests/virtio-net-test.c   |  12 +++-
 11 files changed, 422 insertions(+), 152 deletions(-)

-- 
2.9.0

Thread overview: 39+ messages
2016-07-21  8:57 [Qemu-devel] [PATCH v5 00/31] vhost-user reconnect fixes marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 01/31] misc: indentation marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 02/31] vhost-user: minor simplification marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 03/31] vhost: don't assume opaque is a fd, use backend cleanup marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 04/31] vhost: make vhost_log_put() idempotent marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 05/31] vhost: assert the log was cleaned up marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 06/31] vhost: fix cleanup on not fully initialized device marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 07/31] vhost: make vhost_dev_cleanup() idempotent marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 08/31] vhost-net: always call vhost_dev_cleanup() on failure marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 09/31] vhost: fix calling vhost_dev_cleanup() after vhost_dev_init() marcandre.lureau
2016-07-25 12:33   ` [Qemu-devel] [v5, " Ilya Maximets
2016-07-25 12:45     ` Marc-André Lureau
2016-07-25 12:52       ` Ilya Maximets
2016-07-25 13:05         ` Marc-André Lureau
2016-07-25 13:14           ` Ilya Maximets
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 10/31] vhost: do not assert() on vhost_ops failure marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 11/31] vhost: add missing VHOST_OPS_DEBUG marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 12/31] vhost: use error_report() instead of fprintf(stderr, ...) marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 13/31] qemu-char: fix qemu_chr_fe_set_msgfds() crash when disconnected marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 14/31] vhost-user: call set_msgfds unconditionally marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 15/31] vhost-user: check qemu_chr_fe_set_msgfds() return value marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 16/31] vhost-user: check vhost_user_{read, write}() " marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 17/31] vhost-user: keep vhost_net after a disconnection marcandre.lureau
2016-07-25 12:48   ` [Qemu-devel] [v5, " Ilya Maximets
2016-07-25 13:09     ` Marc-André Lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 18/31] vhost-user: add get_vhost_net() assertions marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 19/31] Revert "vhost-net: do not crash if backend is not present" marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 20/31] vhost-net: vhost_migration_done is vhost-user specific marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 21/31] vhost: add assert() to check runtime behaviour marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 22/31] char: add chr_wait_connected callback marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 23/31] char: add and use tcp_chr_wait_connected marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 24/31] vhost-user: wait until backend init is completed marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 25/31] tests: plug some leaks in virtio-net-test marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 26/31] tests: fix vhost-user-test leak marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 27/31] tests: add /vhost-user/connect-fail test marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 28/31] tests: add a simple /vhost-user/multiqueue test marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 29/31] vhost-user: add error report in vhost_user_write() marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 30/31] vhost: add vhost_net_set_backend() marcandre.lureau
2016-07-21  8:57 ` [Qemu-devel] [PATCH v5 31/31] RFC: vhost: do not update last avail idx on get_vring_base() failure marcandre.lureau
