From mboxrd@z Thu Jan 1 00:00:00 1970 From: Stefan Bader Subject: Re: [Xen-devel] xen-netfront possibly rides the rocket too often Date: Thu, 15 May 2014 10:58:38 +0200 Message-ID: <537481BE.2010303@canonical.com> References: <537262AB.5010408@canonical.com> <5373C8D4.2010803@citrix.com> <1400143605.1006.1.camel@kazak.uk.xensource.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="nskFHjFCNpuITB6SV92xihOiJCAOgVvFM" Cc: xen-devel@lists.xenproject.org, Wei Liu , netdev To: Ian Campbell , Zoltan Kiss Return-path: Received: from youngberry.canonical.com ([91.189.89.112]:38088 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750997AbaEOI6v (ORCPT ); Thu, 15 May 2014 04:58:51 -0400 In-Reply-To: <1400143605.1006.1.camel@kazak.uk.xensource.com> Sender: netdev-owner@vger.kernel.org List-ID: This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --nskFHjFCNpuITB6SV92xihOiJCAOgVvFM Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable On 15.05.2014 10:46, Ian Campbell wrote: > On Wed, 2014-05-14 at 20:49 +0100, Zoltan Kiss wrote: >> On 13/05/14 19:21, Stefan Bader wrote: >>> We had reports about this message being seen on EC2 for a while but f= inally a >>> reporter did notice some details about the guests and was able to pro= vide a >>> simple way to reproduce[1]. >>> >>> For my local experiments I use a Xen-4.2.2 based host (though I would= say the >>> host versions are not important). The host has one NIC which is used = as the >>> outgoing port of a Linux based (not openvswitch) bridge. And the PV g= uests use >>> that bridge. I set the mtu to 9001 (which was seen on affected instan= ce types) >>> and also inside the guests. As described in the report one guests run= s >>> redis-server and the other nodejs through two scripts (for me I had t= o do the >>> two sub.js calls in separate shells). After a bit the error messages = appear on >>> the guest running the redis-server. >>> >>> I added some debug printk's to show a bit more detail about the skb a= nd got the >>> following (@): >>> >>> [ 698.108119] xen_netfront: xennet: skb rides the rocket: 19 slots >>> [ 698.108134] header 1490@238 -> 1 slots >>> [ 698.108139] frag #0 1614@2164 -> + 1 pages >>> [ 698.108143] frag #1 3038@1296 -> + 2 pages >>> [ 698.108147] frag #2 6076@1852 -> + 2 pages >>> [ 698.108151] frag #3 6076@292 -> + 2 pages >>> [ 698.108156] frag #4 6076@2828 -> + 3 pages >>> [ 698.108160] frag #5 3038@1268 -> + 2 pages >>> [ 698.108164] frag #6 2272@1824 -> + 1 pages >>> [ 698.108168] frag #7 3804@0 -> + 1 pages >>> [ 698.108172] frag #8 6076@264 -> + 2 pages >>> [ 698.108177] frag #9 3946@2800 -> + 2 pages >>> [ 698.108180] frags adding 18 slots >>> >>> Since I am not deeply familiar with the networking code, I wonder abo= ut two things: >>> - is there something that should limit the skb data length from all f= rags >>> to stay below the 64K which the definition of MAX_SKB_FRAGS hints?= >> I think netfront should be able to handle 64K packets at most. >=20 > Ah, maybe this relates to this fix from Wei? That indeed should (imo) limit the overall size. And if that would happen= still, we would see the different message. The problem I see is not the overall = size but the layout of the frags. Since multiple start at an offset high enoug= h, the count of slots goes too high. Now I have to read that code in more detail, but there also has been some= changes which introduce a make frags function. I wonder whether maybe the= re is already some kind of shifting (or copying) going on and whether its just = the check that needs to improve. But right now that is speculation. For the fun I actually found an even simpler way to trigger it and (need = to confirm it yet without mucking around with mtu before). It looked like mt= u actually did not make a difference. One only needed the redis server on o= ne guest and run redis-benchmark from the other with (I think -d 1000, or wh= atever it is that sets the datasize). Then on the range tests this happens quite= often. -Stefan >=20 > commit 9ecd1a75d977e2e8c48139c7d3efed183f898d94 > Author: Wei Liu > Date: Mon Apr 22 02:20:41 2013 +0000 >=20 > xen-netfront: reduce gso_max_size to account for max TCP header > =20 > The maximum packet including header that can be handled by netfront= / netback > wire format is 65535. Reduce gso_max_size accordingly. > =20 > Drop skb and print warning when skb->len > 65535. This can 1) save = the effort > to send malformed packet to netback, 2) help spotting misconfigurat= ion of > netfront in the future. > =20 > Signed-off-by: Wei Liu > Acked-by: Ian Campbell > Signed-off-by: David S. Miller >=20 > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > index 1bb2e20..1db10141 100644 > --- a/drivers/net/xen-netfront.c > +++ b/drivers/net/xen-netfront.c > @@ -36,7 +36,7 @@ > #include > #include > #include > -#include > +#include > #include > #include > #include > @@ -547,6 +547,16 @@ static int xennet_start_xmit(struct sk_buff *skb, = struct net_device *dev) > unsigned int len =3D skb_headlen(skb); > unsigned long flags; > =20 > + /* If skb->len is too big for wire format, drop skb and alert > + * user about misconfiguration. > + */ > + if (unlikely(skb->len > XEN_NETIF_MAX_TX_SIZE)) { > + net_alert_ratelimited( > + "xennet: skb->len =3D %u, too big for wire format\n", > + skb->len); > + goto drop; > + } > + > slots =3D DIV_ROUND_UP(offset + len, PAGE_SIZE) + > xennet_count_skb_frag_slots(skb); > if (unlikely(slots > MAX_SKB_FRAGS + 1)) { > @@ -1058,7 +1068,8 @@ err: > =20 > static int xennet_change_mtu(struct net_device *dev, int mtu) > { > - int max =3D xennet_can_sg(dev) ? 65535 - ETH_HLEN : ETH_DATA_LEN; > + int max =3D xennet_can_sg(dev) ? > + XEN_NETIF_MAX_TX_SIZE - MAX_TCP_HEADER : ETH_DATA_LEN; > =20 > if (mtu > max) > return -EINVAL; > @@ -1362,6 +1373,8 @@ static struct net_device *xennet_create_dev(struc= t xenbus_device *dev) > SET_ETHTOOL_OPS(netdev, &xennet_ethtool_ops); > SET_NETDEV_DEV(netdev, &dev->dev); > =20 > + netif_set_gso_max_size(netdev, XEN_NETIF_MAX_TX_SIZE - MAX_TCP_HEADER= ); > + > np->netdev =3D netdev; > =20 > netif_carrier_off(netdev); > diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/i= o/netif.h > index 9dfc120..58fadca 100644 > --- a/include/xen/interface/io/netif.h > +++ b/include/xen/interface/io/netif.h > @@ -47,6 +47,7 @@ > #define _XEN_NETTXF_extra_info (3) > #define XEN_NETTXF_extra_info (1U<<_XEN_NETTXF_extra_info) > =20 > +#define XEN_NETIF_MAX_TX_SIZE 0xFFFF > struct xen_netif_tx_request { > grant_ref_t gref; /* Reference to buffer page */ > uint16_t offset; /* Offset within buffer page */ >=20 >=20 >=20 > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel >=20 --nskFHjFCNpuITB6SV92xihOiJCAOgVvFM Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQIcBAEBCgAGBQJTdIHGAAoJEOhnXe7L7s6jWPMQAIC5UgvHMFJYMwdsqGniFrCx I1c0uhTJnjZpGVEeczXIdK8cBwidBc3/PbA0P2x9omneI0g3x8VJHvJQ17H2MX4e e+IyOt7TAfnV0mcgWKz/BCHB4ygwJW1Nn4oyHq/z9US02hHIA+y49CN/qVS0SLit v3O4o/S7xHGl7NU9ZxZjissAWmtmRujL3eMy7aJXAM55EGKaKrXjwzX+rBEnlrEm 8TYg0fyx/SJVHC0+t+eY2OBhPaRJeVg/1mXi7rxvlrPbGWp2vvWa95Lp7E/V6Ktu 6HRdIgU5iZgO/5u/toopgywE9GduZyb8xaciIm/9mmfe6mfxxqlO2e+p8JylqwKG 6RC/lHbdzBKCwY1tL8hhD99s6rxUrslj63uk3xqFPyosdYX55SJFOWRlHdbmcFer MUVXVXWgQ2ynMOzXjSe2k4anjSR2VOWrnKzSM0fLrkXp1KhYrJF2XPiSlbKG7k1V 8Pogm8ddmoS0ygcbjsUUGpWkAQJ1rCe+IHnEDsaKkfM7ydPomuH4ZqZuEIea41yf xoxdL92ofjCdVcCQOjhDB4FxVd2fjXxBIc869lFt7SeqWjVTTxAPBpQ/4HqNrmOY ai66QTLdtT/M6ECyIv7ry+OGVeBGG6E8FXKzHRRe+78CrCAULGmxL7d77JlLjEGY UeBZzsKkBCHevywmjqRj =Rct1 -----END PGP SIGNATURE----- --nskFHjFCNpuITB6SV92xihOiJCAOgVvFM--