Date: Fri, 1 Apr 2016 12:25:56 +1100
From: David Gibson
To: Thomas Huth
Cc: Jason Wang, qemu-ppc@nongnu.org, Alexander Graf, qemu-devel@nongnu.org
Message-ID: <20160401012556.GH416@voom.redhat.com>
In-Reply-To: <1459424825-15446-1-git-send-email-thuth@redhat.com>
Subject: Re: [Qemu-devel] [PATCH for-2.7] hw/net/spapr_llan: Delay flushing of the RX queue while adding new RX buffers

On Thu, Mar 31, 2016 at 01:47:05PM +0200, Thomas Huth wrote:
> Currently, the spapr-vlan device is trying to flush the RX queue
> after each RX buffer that has been added by the guest via the
> H_ADD_LOGICAL_LAN_BUFFER hypercall. In case the receive buffer pool
> was empty before, we only pass single packets to the guest this
> way. This can cause very bad performance if a sender is trying
> to stream fragmented UDP packets to the guest. For example when
> using the UDP_STREAM test from netperf with UDP packets that are
> much bigger than the MTU size, almost all UDP packets are dropped
> in the guest since the chances are quite high that at least one of
> the fragments got lost on the way.
>
> When flushing the receive queue, it's much better if we'd have
> a bunch of receive buffers available already, so that fragmented
> packets can be passed to the guest in one go. To do this, the
> spapr_vlan_receive() function should return 0 instead of -1 if there
> are no more receive buffers available, so that receive_disabled = 1
> gets temporarily set for the receive queue, and we have to delay
> the queue flushing at the end of h_add_logical_lan_buffer() a little
> bit by using a timer, so that the guest gets a chance to add multiple
> RX buffers before we flush the queue again.
>
> This improves the UDP_STREAM test with the spapr-vlan device a lot:
> Running
>     netserver -p 44444 -L -f -D -4
> in the guest, and
>     netperf -p 44444 -L -H -t UDP_STREAM -l 60 -- -m 16384
> in the host, I get the following values _without_ this patch:
>
> Socket  Message  Elapsed      Messages
> Size    Size     Time         Okay Errors   Throughput
> bytes   bytes    secs            #      #   10^6bits/sec
>
> 229376   16384   60.00     1738970      0    3798.83
> 229376           60.00          23              0.05
>
> That "0.05" means that almost all UDP packets got lost/discarded
> at the receiving side.
> With this patch applied, the values look much better:
>
> Socket  Message  Elapsed      Messages
> Size    Size     Time         Okay Errors   Throughput
> bytes   bytes    secs            #      #   10^6bits/sec
>
> 229376   16384   60.00     1789104      0    3908.35
> 229376           60.00       22818             49.85
>
> Signed-off-by: Thomas Huth
> ---
> As a reference: When running the same test with an "e1000" NIC
> instead of "spapr-vlan", I get a throughput of ~85 up to 100 MBits/s
> ... so the spapr-vlan is still not as good as other NICs here, but
> at least it's much better than before.

Nice work!  Applied to ppc-for-2.7.
>
>  hw/net/spapr_llan.c | 28 +++++++++++++++++++++++++---
>  1 file changed, 25 insertions(+), 3 deletions(-)
>
> diff --git a/hw/net/spapr_llan.c b/hw/net/spapr_llan.c
> index a647f25..d604d55 100644
> --- a/hw/net/spapr_llan.c
> +++ b/hw/net/spapr_llan.c
> @@ -109,6 +109,7 @@ typedef struct VIOsPAPRVLANDevice {
>      target_ulong buf_list;
>      uint32_t add_buf_ptr, use_buf_ptr, rx_bufs;
>      target_ulong rxq_ptr;
> +    QEMUTimer *rxp_timer;
>      uint32_t compat_flags;             /* Compatability flags for migration */
>      RxBufPool *rx_pool[RX_MAX_POOLS];  /* Receive buffer descriptor pools */
>  } VIOsPAPRVLANDevice;
> @@ -205,7 +206,7 @@ static ssize_t spapr_vlan_receive(NetClientState *nc, const uint8_t *buf,
>      }
>  
>      if (!dev->rx_bufs) {
> -        return -1;
> +        return 0;
>      }
>  
>      if (dev->compat_flags & SPAPRVLAN_FLAG_RX_BUF_POOLS) {
> @@ -214,7 +215,7 @@ static ssize_t spapr_vlan_receive(NetClientState *nc, const uint8_t *buf,
>          bd = spapr_vlan_get_rx_bd_from_page(dev, size);
>      }
>      if (!bd) {
> -        return -1;
> +        return 0;
>      }
>  
>      dev->rx_bufs--;
> @@ -265,6 +266,13 @@ static NetClientInfo net_spapr_vlan_info = {
>      .receive = spapr_vlan_receive,
>  };
>  
> +static void spapr_vlan_flush_rx_queue(void *opaque)
> +{
> +    VIOsPAPRVLANDevice *dev = opaque;
> +
> +    qemu_flush_queued_packets(qemu_get_queue(dev->nic));
> +}
> +
>  static void spapr_vlan_reset_rx_pool(RxBufPool *rxp)
>  {
>      /*
> @@ -301,6 +309,9 @@ static void spapr_vlan_realize(VIOsPAPRDevice *sdev, Error **errp)
>      dev->nic = qemu_new_nic(&net_spapr_vlan_info, &dev->nicconf,
>                              object_get_typename(OBJECT(sdev)), sdev->qdev.id, dev);
>      qemu_format_nic_info_str(qemu_get_queue(dev->nic), dev->nicconf.macaddr.a);
> +
> +    dev->rxp_timer = timer_new_us(QEMU_CLOCK_VIRTUAL, spapr_vlan_flush_rx_queue,
> +                                  dev);
>  }
>  
>  static void spapr_vlan_instance_init(Object *obj)
> @@ -331,6 +342,11 @@ static void spapr_vlan_instance_finalize(Object *obj)
>              dev->rx_pool[i] = NULL;
>          }
>      }
> +
> +    if (dev->rxp_timer) {
> +        timer_del(dev->rxp_timer);
> +        timer_free(dev->rxp_timer);
> +    }
>  }
>  
>  void spapr_vlan_create(VIOsPAPRBus *bus, NICInfo *nd)
> @@ -628,7 +644,13 @@ static target_ulong h_add_logical_lan_buffer(PowerPCCPU *cpu,
>  
>      dev->rx_bufs++;
>  
> -    qemu_flush_queued_packets(qemu_get_queue(dev->nic));
> +    /*
> +     * Give guest some more time to add additional RX buffers before we
> +     * flush the receive queue, so that e.g. fragmented IP packets can
> +     * be passed to the guest in one go later (instead of passing single
> +     * fragments if there is only one receive buffer available).
> +     */
> +    timer_mod(dev->rxp_timer, qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + 500);
>  
>      return H_SUCCESS;
>  }

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson