From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([209.51.188.92]:58688) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gjFt4-0002yp-AA for qemu-devel@nongnu.org; Mon, 14 Jan 2019 22:59:39 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gjFt3-0008EF-5M for qemu-devel@nongnu.org; Mon, 14 Jan 2019 22:59:38 -0500 Received: from mx1.redhat.com ([209.132.183.28]:52250) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gjFt2-0008Dq-Si for qemu-devel@nongnu.org; Mon, 14 Jan 2019 22:59:37 -0500 Date: Mon, 14 Jan 2019 22:59:29 -0500 From: "Michael S. Tsirkin" Message-ID: <20190114225732-mutt-send-email-mst@kernel.org> References: <20180927143338.GA56570@bogon> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH] vhost-user: fix qemu crash caused by failed backend List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: =?iso-8859-1?Q?Marc-Andr=E9?= Lureau Cc: maxime.coquelin@redhat.com, Jason Wang , QEMU On Tue, Oct 02, 2018 at 01:54:25PM +0400, Marc-Andr=E9 Lureau wrote: > Hi >=20 > On Thu, Sep 27, 2018 at 7:37 PM Liang Li wr= ote: > > > > During live migration, when stopping vhost-user device, 'vhost_dev_st= op' > > will be called, 'vhost_dev_stop' will call a batch of 'vhost_user_rea= d' > > and 'vhost_user_write'. If a previous 'vhost_user_read' or 'vhost_use= r_write' > > failed because the vhost user backend failed, the 'CHR_EVENT_CLOSED' = event > > will be triggerd, followed by the call chain chr_closed_bh()->vhost_u= ser_stop()-> > > vhost_net_cleanup()->vhost_dev_cleanup() > > > > vhost_dev_cleanup will clear vhost_dev struct, so the later 'vhost_us= er_read' > > or 'vhost_user_read' will reference null pointer and cause qemu crash >=20 > Do you have a backtrace to help understand the issue? > thanks Marc-Andr=E9, Maxime any input on this patch? I agree flags like break_down don't exactly look elegant ... > > > > Signed-off-by: Liang Li > > --- > > hw/net/vhost_net.c | 6 ++++++ > > hw/virtio/vhost-user.c | 15 +++++++++++++-- > > include/hw/virtio/vhost.h | 1 + > > include/net/vhost_net.h | 1 + > > net/vhost-user.c | 3 +++ > > 5 files changed, 24 insertions(+), 2 deletions(-) > > > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c > > index e037db6..77994e9 100644 > > --- a/hw/net/vhost_net.c > > +++ b/hw/net/vhost_net.c > > @@ -113,6 +113,11 @@ uint64_t vhost_net_get_features(struct vhost_net= *net, uint64_t features) > > features); > > } > > > > +void vhost_net_mark_break_down(struct vhost_net *net) > > +{ > > + net->dev.break_down =3D true; > > +} > > + > > void vhost_net_ack_features(struct vhost_net *net, uint64_t features= ) > > { > > net->dev.acked_features =3D net->dev.backend_features; > > @@ -156,6 +161,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions = *options) > > net->dev.max_queues =3D 1; > > net->dev.nvqs =3D 2; > > net->dev.vqs =3D net->vqs; > > + net->dev.break_down =3D false; > > > > if (backend_kernel) { > > r =3D vhost_net_get_fd(options->net_backend); > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c > > index b041343..1394719 100644 > > --- a/hw/virtio/vhost-user.c > > +++ b/hw/virtio/vhost-user.c > > @@ -213,14 +213,20 @@ static bool ioeventfd_enabled(void) > > static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg) > > { > > struct vhost_user *u =3D dev->opaque; > > - CharBackend *chr =3D u->user->chr; > > + CharBackend *chr; > > uint8_t *p =3D (uint8_t *) msg; > > int r, size =3D VHOST_USER_HDR_SIZE; > > > > + if (dev->break_down) { > > + goto fail; > > + } > > + > > + chr =3D u->user->chr; > > r =3D qemu_chr_fe_read_all(chr, p, size); > > if (r !=3D size) { > > error_report("Failed to read msg header. Read %d instead of = %d." > > " Original request %d.", r, size, msg->hdr.requ= est); > > + dev->break_down =3D true; > > goto fail; > > } > > > > @@ -299,9 +305,12 @@ static int vhost_user_write(struct vhost_dev *de= v, VhostUserMsg *msg, > > int *fds, int fd_num) > > { > > struct vhost_user *u =3D dev->opaque; > > - CharBackend *chr =3D u->user->chr; > > + CharBackend *chr; > > int ret, size =3D VHOST_USER_HDR_SIZE + msg->hdr.size; > > > > + if (dev->break_down) { > > + return -1; > > + } > > /* > > * For non-vring specific requests, like VHOST_USER_SET_MEM_TABL= E, > > * we just need send it once in the first time. For later such > > @@ -312,6 +321,7 @@ static int vhost_user_write(struct vhost_dev *dev= , VhostUserMsg *msg, > > return 0; > > } > > > > + chr =3D u->user->chr; > > if (qemu_chr_fe_set_msgfds(chr, fds, fd_num) < 0) { > > error_report("Failed to set msg fds."); > > return -1; > > @@ -319,6 +329,7 @@ static int vhost_user_write(struct vhost_dev *dev= , VhostUserMsg *msg, > > > > ret =3D qemu_chr_fe_write_all(chr, (const uint8_t *) msg, size); > > if (ret !=3D size) { > > + dev->break_down =3D true; > > error_report("Failed to write msg." > > " Wrote %d instead of %d.", ret, size); > > return -1; > > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h > > index a7f449f..86d0dc5 100644 > > --- a/include/hw/virtio/vhost.h > > +++ b/include/hw/virtio/vhost.h > > @@ -74,6 +74,7 @@ struct vhost_dev { > > bool started; > > bool log_enabled; > > uint64_t log_size; > > + bool break_down; > > Error *migration_blocker; > > const VhostOps *vhost_ops; > > void *opaque; > > diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h > > index 77e4739..06f2c08 100644 > > --- a/include/net/vhost_net.h > > +++ b/include/net/vhost_net.h > > @@ -27,6 +27,7 @@ void vhost_net_cleanup(VHostNetState *net); > > > > uint64_t vhost_net_get_features(VHostNetState *net, uint64_t feature= s); > > void vhost_net_ack_features(VHostNetState *net, uint64_t features); > > +void vhost_net_mark_break_down(VHostNetState *net); > > > > bool vhost_net_virtqueue_pending(VHostNetState *net, int n); > > void vhost_net_virtqueue_mask(VHostNetState *net, VirtIODevice *dev, > > diff --git a/net/vhost-user.c b/net/vhost-user.c > > index a39f9c9..b99e20b 100644 > > --- a/net/vhost-user.c > > +++ b/net/vhost-user.c > > @@ -270,6 +270,9 @@ static void net_vhost_user_event(void *opaque, in= t event) > > if (s->watch) { > > AioContext *ctx =3D qemu_get_current_aio_context(); > > > > + if (s->vhost_net) { > > + vhost_net_mark_break_down(s->vhost_net); > > + } > > g_source_remove(s->watch); > > s->watch =3D 0; > > qemu_chr_fe_set_handlers(&s->chr, NULL, NULL, NULL, NULL= , > > -- > > 1.8.3.1 > > > > >=20 >=20 > --=20 > Marc-Andr=E9 Lureau