From: Dima Stepanov <dimastep@yandex-team.ru>
To: qemu-devel@nongnu.org
Cc: kwolf@redhat.com, qemu-block@nongnu.org, mst@redhat.com,
jasowang@redhat.com, dgilbert@redhat.com, mreitz@redhat.com,
fengli@smartx.com, yc-core@yandex-team.ru,
marcandre.lureau@redhat.com, pbonzini@redhat.com,
raphael.norwitz@nutanix.com
Subject: [PATCH v4 0/2] vhost-user reconnect issues during vhost initialization
Date: Thu, 28 May 2020 12:11:17 +0300
Message-ID: <cover.1590396396.git.dimastep@yandex-team.ru>
Changes in v4:
- Update the "[PATCH v4 2/2] vhost-user-blk: delay
vhost_user_blk_disconnect" patch based on Raphael's comment and Li
Feng's previous commit:
https://lists.gnu.org/archive/html/qemu-devel/2020-04/msg02255.html
Don't change the vhost_user_blk_device_realize() function. Update the
comment for the CHR_EVENT_CLOSED event.
Changes in v3:
- "[PATCH v3 1/2] char-socket: return -1 in case of disconnect during
tcp_chr_write": made a small cleanup suggested by Li Feng. Added
"Reviewed-by: Marc-André Lureau"
- Rework the vhost_user_blk_disconnect call logic to delay it.
- Remove the migration patch from the patch set, since it is still
under discussion. In general the current idea is good, but more
investigation is needed into how to handle reconnect during migration
properly.
Changes in v2:
- Add Li Feng <fengli@smartx.com> to the CC list, since it looks like we
are working on very similar issues
- Remove [RFC PATCH v1 1/7] contrib/vhost-user-blk: add option to simulate
disconnect on init. This functionality will be sent in a separate
patch, together with the LIBVHOST_USER_DEBUG rework. We first need to
think about how to reuse this option and silence the messages.
- Remove [RFC PATCH v1 3/7] char-socket: initialize reconnect timer only if
close is emitted. This will be handled in a separate patch set:
[PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection by Li
Feng
v1:
While testing the vhost-user reconnect functionality we hit several issues
that occur if the vhost-user-blk daemon "crashes" or disconnects during vhost
initialization. The general scenario is as follows:
- the vhost start routine is called
- a vhost write fails due to SIGPIPE
- this calls the disconnect routine and the vhost_dev_cleanup routine,
which sets all the fields of the vhost_dev structure to 0
- control returns to the vhost start routine with an error
- on the failure path the vhost start routine tries to roll back the changes
using vhost_dev struct fields which have already been reset
- sometimes this leads to SIGSEGV, sometimes to SIGABRT
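The write failure in the second step can be reproduced outside QEMU. The sketch below is not taken from the patches: write_to_dead_peer() is a hypothetical helper, the message payload is arbitrary, and it assumes a Linux host where MSG_NOSIGNAL turns the SIGPIPE into an EPIPE error that the caller can report as -1.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not QEMU code: write to a socket whose peer has
 * already gone away. Returns -1 with errno preserved if the write fails,
 * 0 if it unexpectedly succeeds, -2 if the socket pair cannot be created.
 */
int write_to_dead_peer(void)
{
    int sv[2];
    const char msg[] = "arbitrary payload";
    ssize_t ret;
    int saved_errno;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -2; /* setup failure, distinct from the write failure */
    }

    close(sv[1]); /* the "daemon" end disconnects */

    /* MSG_NOSIGNAL: report EPIPE instead of raising SIGPIPE. */
    ret = send(sv[0], msg, sizeof(msg), MSG_NOSIGNAL);
    saved_errno = errno;

    close(sv[0]);
    errno = saved_errno; /* close() may have clobbered errno */
    return ret < 0 ? -1 : 0;
}
```

A caller that sees -1 here has to assume that the connection is gone and that any disconnect handling may already have run.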
Before revising the vhost-user initialization code, we suggest adding
sanity checks so that the code is aware of a possible disconnect event and
of the fact that the vhost_dev structure can be in an "uninitialized" state.
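As a rough illustration of such a sanity check, the sketch below models the scenario with a stand-in structure. Everything in it is hypothetical (struct demo_dev, demo_cleanup(), demo_write(), demo_rollback(), demo_start() and their fields); it is not QEMU's struct vhost_dev or the code from these patches, only a minimal model of a rollback path that tolerates a device which was reset underneath it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device structure; not QEMU's vhost_dev. */
struct demo_dev {
    bool initialized;
    int nvqs;
    int *vq_state; /* one entry per virtqueue */
};

static int demo_storage[2];

/* Disconnect path: reset every field of the structure to 0. */
void demo_cleanup(struct demo_dev *dev)
{
    memset(dev, 0, sizeof(*dev));
}

/* Simulated backend write: a dead peer triggers the disconnect path. */
int demo_write(struct demo_dev *dev, bool peer_alive)
{
    if (!peer_alive) {
        demo_cleanup(dev);
        return -1;
    }
    return 0;
}

/*
 * Rollback with a sanity check: if the device was already reset there is
 * nothing to undo. Returns the number of queues rolled back.
 */
int demo_rollback(struct demo_dev *dev)
{
    int i;

    if (!dev->initialized || dev->vq_state == NULL) {
        return 0;
    }
    for (i = 0; i < dev->nvqs; i++) {
        dev->vq_state[i] = 0;
    }
    return dev->nvqs;
}

/* Start routine: set up state, talk to the backend, roll back on error. */
int demo_start(struct demo_dev *dev, bool peer_alive, int *rolled_back)
{
    dev->initialized = true;
    dev->nvqs = 2;
    dev->vq_state = demo_storage;
    dev->vq_state[0] = dev->vq_state[1] = 1;

    if (demo_write(dev, peer_alive) < 0) {
        *rolled_back = demo_rollback(dev);
        return -1;
    }
    *rolled_back = 0;
    return 0;
}
```

Without the check at the top of demo_rollback(), the loop would run over a structure that the disconnect path has already zeroed. In this toy model that is harmless, but with real pointers it is the kind of access that can end in SIGSEGV or SIGABRT.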
The vhost-user-blk daemon is updated with an additional
"--simulate-disconnect-stage=CASENUM" argument to simulate a disconnect during
vhost device initialization. For instance:
1. $ ./vhost-user-blk -s ./vhost.sock -b test-img.raw --simulate-disconnect-stage=1
This command will simulate a disconnect in the SET_VRING_CALL handler.
In this case the started field of the vhost device in QEMU is not set to
true.
2. $ ./vhost-user-blk -s ./vhost.sock -b test-img.raw --simulate-disconnect-stage=2
This command will simulate a disconnect in the SET_VRING_NUM handler.
In this case the started field is set to true.
These two cases test different parts of QEMU. Also, to trigger different code
paths, the disconnect should be simulated in two ways:
- before any successful initialization
- after one successful initialization, by simulating disconnects on later
attempts
We also catch a SIGABRT at migration start if the vhost-user daemon disconnects
while the vhost-user set log commands are being exchanged.
Dima Stepanov (2):
char-socket: return -1 in case of disconnect during tcp_chr_write
vhost-user-blk: delay vhost_user_blk_disconnect
chardev/char-socket.c | 7 ++++---
hw/block/vhost-user-blk.c | 38 +++++++++++++++++++++++++++++++++++++-
2 files changed, 41 insertions(+), 4 deletions(-)
--
2.7.4
Thread overview: 7+ messages
2020-05-28 9:11 Dima Stepanov [this message]
2020-05-28 9:11 ` [PATCH v4 1/2] char-socket: return -1 in case of disconnect during tcp_chr_write Dima Stepanov
2020-05-28 9:11 ` [PATCH v4 2/2] vhost-user-blk: delay vhost_user_blk_disconnect Dima Stepanov
2020-05-31 0:55 ` Raphael Norwitz
2020-06-02 3:16 ` Li Feng
2020-06-02 8:31 ` Dima Stepanov
2020-09-14 13:40 ` [PATCH v4 0/2] vhost-user reconnect issues during vhost initialization Stefan Hajnoczi