qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Yajun Wu <yajunw@nvidia.com>
Cc: qemu-devel@nongnu.org, parav@nvidia.com
Subject: Re: [PATCH v3 0/2] vhost-user: Support vhost_dev_start
Date: Fri, 4 Nov 2022 03:58:59 -0400	[thread overview]
Message-ID: <20221104035133-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20221104025833-mutt-send-email-mst@kernel.org>

On Fri, Nov 04, 2022 at 03:11:44AM -0400, Michael S. Tsirkin wrote:
> On Mon, Oct 17, 2022 at 02:44:50PM +0800, Yajun Wu wrote:
> > The motivation of adding vhost-user vhost_dev_start support is to
> > improve backend configuration speed and reduce live migration VM
> > downtime.
> > 
> > Today VQ configuration is issued one by one. For virtio net with
> > multi-queue support, backend needs to update RSS (Receive side
> > scaling) on every rx queue enable. Updating RSS is time-consuming
> > (typical time like 7ms).
> > 
> > Implement already defined vhost status and message in the vhost
> > specification [1].
> > (a) VHOST_USER_PROTOCOL_F_STATUS
> > (b) VHOST_USER_SET_STATUS
> > (c) VHOST_USER_GET_STATUS
> > 
> > Send message VHOST_USER_SET_STATUS with VIRTIO_CONFIG_S_DRIVER_OK for
> > device start and reset(0) for device stop.
> > 
> > On reception of the DRIVER_OK message, backend can apply the needed setting
> > only once (instead of incremental) and also utilize parallelism on enabling
> > queues.
> > 
> > This improves QEMU's live migration downtime with vhost user backend
> > implementation by great margin, specially for the large number of VQs of 64
> > from 800 msec to 250 msec.
> > 
> > Another change is to move the device start routines after finishing all the
> > necessary device and VQ configuration, further aligning to the virtio
> > specification for "device initialization sequence".
> > 
> > [1] https://qemu-project.gitlab.io/qemu/interop/vhost-user.html#introduction
> 
> 
> This patchset seems to trip up ubsan.
> https://gitlab.com/mitsirkin/qemu/-/pipelines/684763327/failures
> 
> 
> specifially:
> 
> passes before this patchset:
> https://gitlab.com/mitsirkin/qemu/-/jobs/3269302594
> 
> fails with this patchset:
> https://gitlab.com/mitsirkin/qemu/-/pipelines/684763327/failures
> 
> (there's one patch on top but it seems unrelated)
> 

And this is with just this patchset:
https://gitlab.com/mitsirkin/qemu/-/pipelines/685419082/failures
in fact it's just the 1st patch:
https://gitlab.com/mitsirkin/qemu/-/pipelines/685417959/failures

log here:
https://gitlab.com/mitsirkin/qemu/-/jobs/3273601845

emu-system-arm: Failed to write msg. Wrote -1 instead of 20.
qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22)
qemu-system-arm: Failed to set msg fds.
qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22)
qemu-system-arm: -chardev socket,id=chr-reconnect,path=/tmp/vhost-test-U5G0U1/reconnect.sock,server=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-U5G0U1/reconnect.sock,server=on
qemu-system-arm: Failed to write msg. Wrote -1 instead of 20.
qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22)
qemu-system-arm: Failed to set msg fds.
qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22)
qemu-system-arm: -chardev socket,id=chr-connect-fail,path=/tmp/vhost-test-CN7ZU1/connect-fail.sock,server=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-CN7ZU1/connect-fail.sock,server=on
qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: Failed to read msg header. Read 0 instead of 12. Original request 1.
qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: vhost_backend_init failed: Protocol error
qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: failed to init vhost_net for queue 0
qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-CN7ZU1/connect-fail.sock,server=on
qemu-system-arm: Failed to write msg. Wrote -1 instead of 20.
qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22)
qemu-system-arm: Failed to set msg fds.
qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22)
qemu-system-arm: -chardev socket,id=chr-flags-mismatch,path=/tmp/vhost-test-4KRYU1/flags-mismatch.sock,server=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-4KRYU1/flags-mismatch.sock,server=on
qemu-system-arm: Failed to write msg. Wrote -1 instead of 52.
qemu-system-arm: vhost_set_mem_table failed: Invalid argument (22)
qemu-system-arm: Failed to set msg fds.
qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22)
UndefinedBehaviorSanitizer:DEADLYSIGNAL
==8621==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000000 (pc 0x55c91aca0620 bp 0x000000000000 sp 0x7ffd150b8600 T8621)
==8621==The signal is caused by a READ memory access.
==8621==Hint: address points to the zero page.
    #0 0x55c91aca0620 in ldl_he_p /builds/mitsirkin/qemu/include/qemu/bswap.h:301:5
    #1 0x55c91aca0620 in ldn_he_p /builds/mitsirkin/qemu/include/qemu/bswap.h:440:1
    #2 0x55c91aca0620 in flatview_write_continue /builds/mitsirkin/qemu/build/../softmmu/physmem.c:2824:19
    #3 0x55c91ac9da91 in flatview_write /builds/mitsirkin/qemu/build/../softmmu/physmem.c:2867:12
    #4 0x55c91ac9da91 in address_space_write /builds/mitsirkin/qemu/build/../softmmu/physmem.c:2963:18
    #5 0x55c91ac9e857 in address_space_unmap /builds/mitsirkin/qemu/build/../softmmu/physmem.c:3306:9
    #6 0x55c91ac461ec in vhost_memory_unmap /builds/mitsirkin/qemu/build/../hw/virtio/vhost.c:342:9
    #7 0x55c91ac461ec in vhost_virtqueue_stop /builds/mitsirkin/qemu/build/../hw/virtio/vhost.c:1242:5
    #8 0x55c91ac46484 in vhost_dev_stop /builds/mitsirkin/qemu/build/../hw/virtio/vhost.c:1882:9
    #9 0x55c91a664cc4 in vhost_net_stop_one /builds/mitsirkin/qemu/build/../hw/net/vhost_net.c:329:5
    #10 0x55c91a6646a6 in vhost_net_start /builds/mitsirkin/qemu/build/../hw/net/vhost_net.c:402:13
    #11 0x55c91abdfda6 in virtio_net_vhost_status /builds/mitsirkin/qemu/build/../hw/net/virtio-net.c:297:13
    #12 0x55c91abdfda6 in virtio_net_set_status /builds/mitsirkin/qemu/build/../hw/net/virtio-net.c:378:5
    #13 0x55c91ac322a9 in virtio_set_status /builds/mitsirkin/qemu/build/../hw/virtio/virtio.c:2442:9
    #14 0x55c91a7f69a0 in virtio_mmio_write /builds/mitsirkin/qemu/build/../hw/virtio/virtio-mmio.c:428:9
    #15 0x55c91ac882b6 in memory_region_write_accessor /builds/mitsirkin/qemu/build/../softmmu/memory.c:492:5
    #16 0x55c91ac8809a in access_with_adjusted_size /builds/mitsirkin/qemu/build/../softmmu/memory.c:554:18
    #17 0x55c91ac87e3d in memory_region_dispatch_write /builds/mitsirkin/qemu/build/../softmmu/memory.c
    #18 0x55c91aca0661 in flatview_write_continue /builds/mitsirkin/qemu/build/../softmmu/physmem.c:2825:23
    #19 0x55c91ac9da91 in flatview_write /builds/mitsirkin/qemu/build/../softmmu/physmem.c:2867:12
    #20 0x55c91ac9da91 in address_space_write /builds/mitsirkin/qemu/build/../softmmu/physmem.c:2963:18
    #21 0x55c91aca4766 in qtest_process_command /builds/mitsirkin/qemu/build/../softmmu/qtest.c
    #22 0x55c91aca3bfd in qtest_process_inbuf /builds/mitsirkin/qemu/build/../softmmu/qtest.c:796:9
    #23 0x55c91aedd672 in tcp_chr_read /builds/mitsirkin/qemu/build/../chardev/char-socket.c:508:13
    #24 0x7f199db120ae in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x550ae)
    #25 0x55c91afcfd2c in glib_pollfds_poll /builds/mitsirkin/qemu/build/../util/main-loop.c:297:9
    #26 0x55c91afcfd2c in os_host_main_loop_wait /builds/mitsirkin/qemu/build/../util/main-loop.c:320:5
    #27 0x55c91afcfd2c in main_loop_wait /builds/mitsirkin/qemu/build/../util/main-loop.c:596:11
    #28 0x55c91a826d36 in qemu_main_loop /builds/mitsirkin/qemu/build/../softmmu/runstate.c:739:9
    #29 0x55c91a3df4e5 in qemu_default_main /builds/mitsirkin/qemu/build/../softmmu/main.c:37:14
    #30 0x7f199b85aeaf in __libc_start_call_main (/lib64/libc.so.6+0x3feaf)
    #31 0x7f199b85af5f in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3ff5f)
    #32 0x55c91a3b6084 in _start (/builds/mitsirkin/qemu/build/qemu-system-arm+0xc16084)
UndefinedBehaviorSanitizer can not provide additional info.
SUMMARY: UndefinedBehaviorSanitizer: SEGV /builds/mitsirkin/qemu/include/qemu/bswap.h:301:5 in ldl_he_p
==8621==ABORTING
Broken pipe
../tests/qtest/libqtest.c:179: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)
**
ERROR:../tests/qtest/qos-test.c:191:subprocess_run_one_test: child process (/arm/virt/virtio-mmio/virtio-bus/virtio-net-device/virtio-net/virtio-net-tests/vhost-user/flags-mismatch/subprocess [8612]) failed unexpectedly
(test program exited with status code -6)







> Seems hard to debug, only reproduced under gitlab though Stefan reports
> reproducing this locally.
> I'm thinking of dropping this patchset for now, deferring to next
> release - thoughts?
> 
> 
> 
> > v3:
> > - rebase
> > 
> > v2:
> > - add setting status bit VIRTIO_CONFIG_S_FEATURES_OK
> > - avoid adding status bits already set
> > 
> > Yajun Wu (2):
> >   vhost: Change the sequence of device start
> >   vhost-user: Support vhost_dev_start
> > 
> >  hw/block/vhost-user-blk.c | 18 ++++++----
> >  hw/net/vhost_net.c        | 12 +++----
> >  hw/virtio/vhost-user.c    | 74 ++++++++++++++++++++++++++++++++++++++-
> >  3 files changed, 90 insertions(+), 14 deletions(-)
> > 
> > -- 
> > 2.27.0



  reply	other threads:[~2022-11-04  7:59 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-29  2:25 [PATCH 0/2] vhost-user: Support vhost_dev_start Yajun Wu
2022-06-29  2:25 ` [PATCH 1/2] vhost: Change the sequence of device start Yajun Wu
2022-06-29  2:25 ` [PATCH 2/2] vhost-user: Support vhost_dev_start Yajun Wu
2022-07-12 10:54 ` [PATCH v2 0/2] " Yajun Wu
2022-07-12 10:55   ` [PATCH v2 1/2] vhost: Change the sequence of device start Yajun Wu
2022-07-12 10:55   ` [PATCH v2 2/2] vhost-user: Support vhost_dev_start Yajun Wu
2022-07-26 14:56   ` [PATCH v2 0/2] " Michael S. Tsirkin
2022-09-05  4:58     ` Yajun Wu
2022-10-12 10:10       ` Yajun Wu
2022-10-17  6:44 ` [PATCH v3 " Yajun Wu
2022-10-17  6:44   ` [PATCH v3 1/2] vhost: Change the sequence of device start Yajun Wu
2022-11-05 16:43     ` Michael S. Tsirkin
2022-10-17  6:44   ` [PATCH v3 2/2] vhost-user: Support vhost_dev_start Yajun Wu
2022-11-04  7:11   ` [PATCH v3 0/2] " Michael S. Tsirkin
2022-11-04  7:58     ` Michael S. Tsirkin [this message]
2022-11-07  4:08 ` [PATCH v4 " Yajun Wu
2022-11-07  4:08   ` [PATCH v4 1/2] vhost: Change the sequence of device start Yajun Wu
2022-11-07  4:08   ` [PATCH v4 2/2] vhost-user: Support vhost_dev_start Yajun Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221104035133-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=parav@nvidia.com \
    --cc=qemu-devel@nongnu.org \
    --cc=yajunw@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).