From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:48922) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YVxEM-0000xe-Bk for qemu-devel@nongnu.org; Thu, 12 Mar 2015 03:08:31 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YVxEJ-0002Ug-3I for qemu-devel@nongnu.org; Thu, 12 Mar 2015 03:08:30 -0400 Date: Thu, 12 Mar 2015 08:08:15 +0100 From: "Michael S. Tsirkin" Message-ID: <20150312075922-mutt-send-email-mst@redhat.com> References: <20150311180212.8119.14734.stgit@bahia.local> <20150311210322-mutt-send-email-mst@redhat.com> <20150311230314.08e51843@bahia.local> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20150311230314.08e51843@bahia.local> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH] virtio-pci: fix host notifiers on bi-endian architectures List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Greg Kurz Cc: pbonzini@redhat.com, Cedric Le Goater , qemu-ppc@nongnu.org, qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org On Wed, Mar 11, 2015 at 11:03:14PM +0100, Greg Kurz wrote: > On Wed, 11 Mar 2015 21:06:05 +0100 > "Michael S. Tsirkin" wrote: >=20 > > On Wed, Mar 11, 2015 at 07:04:38PM +0100, Greg Kurz wrote: > > > vhost is seriously broken with ppc64le guests, even in the supposed= ly > > > supported case where the host is ppc64le and we don't need cross-en= dian > > > support. > > >=20 > > > The TX virtqueue fails to be handled by vhost and falls back to QEM= U. > > > Despite this unexpected scenario where RX is vhost and TX is QEMU, = the > > > guest runs well with reduced upload performances... until you reboo= t, > > > migrate, managed save or in fact any operation that causes vhost_ne= t > > > to be re-started. Network connectivity is then permanantly lost for > > > the guest. > > >=20 > > > TX falling back to QEMU is the result of a failed MMIO store emulat= ion > > > in KVM. Debugging shows that: > > >=20 > > > kvmppc_emulate_mmio() > > > | > > > +-> kvmppc_handle_store() > > > | > > > +-> kvm_io_bus_write() > > > | > > > +-> __kvm_io_bus_write() returns -EOPNOTSUPP > > >=20 > > > This happens because no matching device was found: > > >=20 > > > __kvm_io_bus_write() > > > | > > > +->kvm_iodevice_write() > > > | > > > +->ioeventfd_write() > > > | > > > +->ioeventfd_in_range() returns false for all registered v= rings > > >=20 > > > Extra debugging shows that the TX vring number (16-bit) is supposed= to > > > be 0x0100 but QEMU passes 0x0001 to KVM... This happens *again* bec= ause > > > QEMU still assumes powerpc is big endian (TARGET_WORDS_BIGENDIAN) b= y > > > default. > > >=20 > > > This patch adds an extra swap in virtio_pci_set_host_notifier_inter= nal() > > > to negate the one that is done in adjust_endianness(). Since this i= s not > > > a hot path and we want to keep virtio-pci.o in common-obj, we don't= care > > > whether the guest is bi-endian or not. > > >=20 > > > Reported-by: C=E9dric Le Goater > > > Suggested-by: Michael Roth > > > Signed-off-by: Greg Kurz > >=20 > > I am confused. > > The value that notifications use is always LE. >=20 > True but adjust_endianness() does swap unconditionally for ppc64 > because of TARGET_WORDS_BIGENDIAN. >=20 > > Can't we avoid multiple swaps? >=20 > That would mean adding an extra endianness argument down to > memory_region_wrong_endianness()... not sure we want to do that. >=20 > > They make my head spin. > >=20 >=20 > I understand that the current fixed target endianness paradigm > is best suited for most architectures. Extra swaps in specific > non-critical locations allows to support odd beasts like ppc64le > and arm64be without trashing more common paths. Maybe I can add a > comment for better clarity (see below). But common header format is simple, it's always LE. It does not depend on target. To me this looks like a bug in memory_region_add_eventfd, it should do the right thing depending on device endian-ness. > > > --- > > >=20 > > > I guess it is also a fix for virtio-1 but I didn't check. > > >=20 > > > hw/virtio/virtio-pci.c | 11 +++++++++-- > > > 1 file changed, 9 insertions(+), 2 deletions(-) > > >=20 > > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c > > > index e7baf7b..62b04c9 100644 > > > --- a/hw/virtio/virtio-pci.c > > > +++ b/hw/virtio/virtio-pci.c > > > @@ -133,6 +133,11 @@ static int virtio_pci_load_queue(DeviceState *= d, int n, QEMUFile *f) > > > return 0; > > > } > > > =20 >=20 > /* The host notifier will be swapped in adjust_endianness() according t= o the > * target default endianness. We need to negate this swap if the device= uses > * an endianness that is not the default (ppc64le for example). > */ >=20 > > > +static uint16_t cpu_to_host_notifier16(VirtIODevice *vdev, uint16_= t val) > > > +{ > > > + return virtio_is_big_endian(vdev) ? val : bswap16(val); > > > +} > > > + > > > static int virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *p= roxy, > > > int n, bool assig= n, bool set_handler) > > > { > > > @@ -150,10 +155,12 @@ static int virtio_pci_set_host_notifier_inter= nal(VirtIOPCIProxy *proxy, > > > } > > > virtio_queue_set_host_notifier_fd_handler(vq, true, set_ha= ndler); > > > memory_region_add_eventfd(&proxy->bar, VIRTIO_PCI_QUEUE_NO= TIFY, 2, > > > - true, n, notifier); > > > + true, cpu_to_host_notifier16(vde= v, n), > > > + notifier); > > > } else { > > > memory_region_del_eventfd(&proxy->bar, VIRTIO_PCI_QUEUE_NO= TIFY, 2, > > > - true, n, notifier); > > > + true, cpu_to_host_notifier16(vde= v, n), > > > + notifier); > > > virtio_queue_set_host_notifier_fd_handler(vq, false, false= ); > > > event_notifier_cleanup(notifier); > > > } > >=20