From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:48922)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <mst@redhat.com>) id 1YVxEM-0000xe-Bk
	for qemu-devel@nongnu.org; Thu, 12 Mar 2015 03:08:31 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <mst@redhat.com>) id 1YVxEJ-0002Ug-3I
	for qemu-devel@nongnu.org; Thu, 12 Mar 2015 03:08:30 -0400
Date: Thu, 12 Mar 2015 08:08:15 +0100
From: "Michael S. Tsirkin" <mst@redhat.com>
Message-ID: <20150312075922-mutt-send-email-mst@redhat.com>
References: <20150311180212.8119.14734.stgit@bahia.local>
	<20150311210322-mutt-send-email-mst@redhat.com>
	<20150311230314.08e51843@bahia.local>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
In-Reply-To: <20150311230314.08e51843@bahia.local>
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH] virtio-pci: fix host notifiers on
	bi-endian architectures
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Greg Kurz <gkurz@linux.vnet.ibm.com>
Cc: pbonzini@redhat.com, Cedric Le Goater <clg@fr.ibm.com>, qemu-ppc@nongnu.org, qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org

On Wed, Mar 11, 2015 at 11:03:14PM +0100, Greg Kurz wrote:
> On Wed, 11 Mar 2015 21:06:05 +0100
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>=20
> > On Wed, Mar 11, 2015 at 07:04:38PM +0100, Greg Kurz wrote:
> > > vhost is seriously broken with ppc64le guests, even in the supposed=
ly
> > > supported case where the host is ppc64le and we don't need cross-en=
dian
> > > support.
> > >=20
> > > The TX virtqueue fails to be handled by vhost and falls back to QEM=
U.
> > > Despite this unexpected scenario where RX is vhost and TX is QEMU, =
the
> > > guest runs well with reduced upload performances... until you reboo=
t,
> > > migrate, managed save or in fact any operation that causes vhost_ne=
t
> > > to be re-started. Network connectivity is then permanantly lost for
> > > the guest.
> > >=20
> > > TX falling back to QEMU is the result of a failed MMIO store emulat=
ion
> > > in KVM. Debugging shows that:
> > >=20
> > > kvmppc_emulate_mmio()
> > >  |
> > >  +-> kvmppc_handle_store()
> > >       |
> > >       +-> kvm_io_bus_write()
> > >            |
> > >            +-> __kvm_io_bus_write() returns -EOPNOTSUPP
> > >=20
> > > This happens because no matching device was found:
> > >=20
> > > __kvm_io_bus_write()
> > >  |
> > >  +->kvm_iodevice_write()
> > >      |
> > >      +->ioeventfd_write()
> > >          |
> > >          +->ioeventfd_in_range() returns false for all registered v=
rings
> > >=20
> > > Extra debugging shows that the TX vring number (16-bit) is supposed=
 to
> > > be 0x0100 but QEMU passes 0x0001 to KVM... This happens *again* bec=
ause
> > > QEMU still assumes powerpc is big endian (TARGET_WORDS_BIGENDIAN) b=
y
> > > default.
> > >=20
> > > This patch adds an extra swap in virtio_pci_set_host_notifier_inter=
nal()
> > > to negate the one that is done in adjust_endianness(). Since this i=
s not
> > > a hot path and we want to keep virtio-pci.o in common-obj, we don't=
 care
> > > whether the guest is bi-endian or not.
> > >=20
> > > Reported-by: C=E9dric Le Goater <clg@fr.ibm.com>
> > > Suggested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> >=20
> > I am confused.
> > The value that notifications use is always LE.
>=20
> True but adjust_endianness() does swap unconditionally for ppc64
> because of TARGET_WORDS_BIGENDIAN.
>=20
> > Can't we avoid multiple swaps?
>=20
> That would mean adding an extra endianness argument down to
> memory_region_wrong_endianness()... not sure we want to do that.
>=20
> > They make my head spin.
> >=20
>=20
> I understand that the current fixed target endianness paradigm
> is best suited for most architectures. Extra swaps in specific
> non-critical locations allows to support odd beasts like ppc64le
> and arm64be without trashing more common paths. Maybe I can add a
> comment for better clarity (see below).

But common header format is simple, it's always LE.
It does not depend on target.
To me this looks like a bug in memory_region_add_eventfd,
it should do the right thing depending on device
endian-ness.


> > > ---
> > >=20
> > > I guess it is also a fix for virtio-1 but I didn't check.
> > >=20
> > >  hw/virtio/virtio-pci.c |   11 +++++++++--
> > >  1 file changed, 9 insertions(+), 2 deletions(-)
> > >=20
> > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > > index e7baf7b..62b04c9 100644
> > > --- a/hw/virtio/virtio-pci.c
> > > +++ b/hw/virtio/virtio-pci.c
> > > @@ -133,6 +133,11 @@ static int virtio_pci_load_queue(DeviceState *=
d, int n, QEMUFile *f)
> > >      return 0;
> > >  }
> > > =20
>=20
> /* The host notifier will be swapped in adjust_endianness() according t=
o the
>  * target default endianness. We need to negate this swap if the device=
 uses
>  * an endianness that is not the default (ppc64le for example).
>  */
>=20
> > > +static uint16_t cpu_to_host_notifier16(VirtIODevice *vdev, uint16_=
t val)
> > > +{
> > > +    return virtio_is_big_endian(vdev) ? val : bswap16(val);
> > > +}
> > > +
> > >  static int virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *p=
roxy,
> > >                                                   int n, bool assig=
n, bool set_handler)
> > >  {
> > > @@ -150,10 +155,12 @@ static int virtio_pci_set_host_notifier_inter=
nal(VirtIOPCIProxy *proxy,
> > >          }
> > >          virtio_queue_set_host_notifier_fd_handler(vq, true, set_ha=
ndler);
> > >          memory_region_add_eventfd(&proxy->bar, VIRTIO_PCI_QUEUE_NO=
TIFY, 2,
> > > -                                  true, n, notifier);
> > > +                                  true, cpu_to_host_notifier16(vde=
v, n),
> > > +                                  notifier);
> > >      } else {
> > >          memory_region_del_eventfd(&proxy->bar, VIRTIO_PCI_QUEUE_NO=
TIFY, 2,
> > > -                                  true, n, notifier);
> > > +                                  true, cpu_to_host_notifier16(vde=
v, n),
> > > +                                  notifier);
> > >          virtio_queue_set_host_notifier_fd_handler(vq, false, false=
);
> > >          event_notifier_cleanup(notifier);
> > >      }
> >=20