All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Halil Pasic <pasic@linux.vnet.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>,
	kvm@vger.kernel.org, Greg Kurz <groug@kaod.org>,
	"virtualization@lists.linux-foundation.org"
	<virtualization@lists.linux-foundation.org>,
	netdev@vger.kernel.org, Cornelia Huck <cornelia.huck@de.ibm.com>
Subject: Re: [BUG/RFC] vhost: net: big endian viring access despite virtio 1
Date: Thu, 26 Jan 2017 21:20:03 +0200	[thread overview]
Message-ID: <20170126211511-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <a08f491a-1320-607e-f450-d32ce2c9e2f5@linux.vnet.ibm.com>

On Thu, Jan 26, 2017 at 06:39:14PM +0100, Halil Pasic wrote:
> 
> Hi!
> 
> Recently I have been investigating some strange migration problems on
> s390x.
> 
> It turned out under certain circumstances vhost_net corrupts avail.idx by
> using wrong endianness.
> 
> I managed to track the problem down (I'm pretty sure). It boils down to
> the following.
> 
> When stopping vhost userspace (QEMU) calls vhost_net_set_backend with
> the fd argument set to -1, this leads to is_le being reset in
> vhost_vq_init_access.  On a BE system resetting to legacy means resetting
> to BE. Usually this is not a problem, but in the case when oldubufs is not
> zero (which is not likely if no network stress applied) it is a problem.
> That code path needs to write avail.idx, and ends up using wrong
> endianness when doing that (but only on a BE system).
> 
> That is the story in prose, now let's see the corresponding code annotated
> with some comments.
> 
> from drivers/vhost/net.c:
> static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
> {
> /* [..] some not too interesting stuff  */
>         sock = get_socket(fd);
> /* fd == -1 --> sock == NULL */
>         if (IS_ERR(sock)) {
>                 r = PTR_ERR(sock);
>                 goto err_vq;
>         }
> 
>         /* start polling new socket */
>         oldsock = vq->private_data;
>         if (sock != oldsock) {
>                 ubufs = vhost_net_ubuf_alloc(vq,
>                                              sock && vhost_sock_zcopy(sock));
>                 if (IS_ERR(ubufs)) {
>                         r = PTR_ERR(ubufs);
>                         goto err_ubufs;
>                 }
> 
>                 vhost_net_disable_vq(n, vq);
> ==>             vq->private_data = sock; 
> /* now vq->private_data is NULL */
> ==>             r = vhost_vq_init_access(vq);
>                 if (r)
>                         goto err_used;
> /* vq endianness has been reset to BE on s390 */
>                 r = vhost_net_enable_vq(n, vq);
>                 if (r)
>                         goto err_used;
> 
> ==>             oldubufs = nvq->ubufs;
> /* here oldubufs might become != 0 */               
>                 nvq->ubufs = ubufs;
> 
>                 n->tx_packets = 0;
>                 n->tx_zcopy_err = 0;
>                 n->tx_flush = false;
>         }
>         mutex_unlock(&vq->mutex);
> 
>         if (oldubufs) {
>                 vhost_net_ubuf_put_wait_and_free(oldubufs);
>                 mutex_lock(&vq->mutex);
> ==>             vhost_zerocopy_signal_used(n, vq);
> /* tries to update virtqueue structures; endianness is BE on s390
>  * now (but should be LE for virtio-1) */
>                 mutex_unlock(&vq->mutex);
>         }
> /*[..] rest of the function */
> }

> 
> from drivers/vhost/vhost.c:
> 
> int vhost_vq_init_access(struct vhost_virtqueue *vq)
> {
>         __virtio16 last_used_idx;
>         int r;
>         bool is_le = vq->is_le;
> 
>         if (!vq->private_data) {
> ==>             vhost_reset_is_le(vq);
> /* resets to native endianness and returns */
>                 return 0;
>         }
> 
> ==>      vhost_init_is_le(vq);
> /* here we init is_le */
> 
>         r = vhost_update_used_flags(vq);
>         if (r)
>                 goto err;
>         vq->signalled_used_valid = false;
>         if (!vq->iotlb &&
>             !access_ok(VERIFY_READ, &vq->used->idx, sizeof vq->used->idx)) {
>                 r = -EFAULT;
>                 goto err;
>         }
>         r = vhost_get_user(vq, last_used_idx, &vq->used->idx);
>         if (r) {
>                 vq_err(vq, "Can't access used idx at %p\n",
>                        &vq->used->idx);
>                 goto err;
>         }
>         vq->last_used_idx = vhost16_to_cpu(vq, last_used_idx);
>         return 0;
> 
> err:
>         vq->is_le = is_le;
>         return r;
> }
> 
> AFAIU this can be fixed very simply by omitting the reset. Below the
> patch. I'm not sure though, the endianness handling ain't simple in
> vhost. Am I going in the right direction?
> 
> We have a test (on s390x only) running for a couple of hours now and so
> far so good but it's a bit early to announce it is tested  for s390x.
> If the patch is reasonable, I'm intend to do a version with proper
> attribution and stuff.
> 
> By the way the reset was first introduced by 
> https://lkml.org/lkml/2015/4/10/226 (dug it up in the hope that reasons
> for the reset were discussed -- but no enlightenment for me).
> 
> Finally I would like to credit Dave Gilbert for hinting that the
> unreasonable avail.idx may be due to an endianness problem and Michael A.
> Tebolt for reporting the bug and testing.
> 
> -------------------------8<--------------
> >From b26e2bbdc03832a0204ee2b42967a1b49a277dc8 Mon Sep 17 00:00:00 2001
> From: Halil Pasic <pasic@linux.vnet.ibm.com>
> Date: Thu, 26 Jan 2017 00:06:15 +0100
> Subject: [PATCH] vhost: remove useless/dangerous reset of is_le
> 
> The reset of is_le does no good, but it contributes its fair share to a
> bug in vhost_net, which occurs if we have some oldubufs when stopping and
> setting a fd = -1 as a backend. Instead of doing something convoluted in
> vhost_net, let's just get rid of the reset.
> 
> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> Fixes: commit 2751c9882b94 
> ---
>  drivers/vhost/vhost.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index d643260..08072a2 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -1714,10 +1714,8 @@ int vhost_vq_init_access(struct vhost_virtqueue *vq)
>         int r;
>         bool is_le = vq->is_le;
> 
> -       if (!vq->private_data) {
> -               vhost_reset_is_le(vq);
> +       if (!vq->private_data)
>                 return 0;
> -       }
> 
>         vhost_init_is_le(vq);


I think you do need to reset it, just maybe within vhost_init_is_le.

        if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
                vq->is_le = true;
	else
		vhost_reset_is_le(vq);



> -- 
> 2.8.4

  reply	other threads:[~2017-01-26 19:20 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-26 17:39 [BUG/RFC] vhost: net: big endian viring access despite virtio 1 Halil Pasic
2017-01-26 17:39 ` Halil Pasic
2017-01-26 19:20 ` Michael S. Tsirkin [this message]
2017-01-27 12:24   ` Halil Pasic
2017-01-27 13:37     ` Greg Kurz
2017-01-29  0:35       ` Michael S. Tsirkin
2017-01-30 11:25         ` Halil Pasic
2017-01-27 13:37     ` Greg Kurz
2017-01-26 19:20 ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170126211511-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=cornelia.huck@de.ibm.com \
    --cc=groug@kaod.org \
    --cc=jasowang@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pasic@linux.vnet.ibm.com \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.