Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v2 net 4/4] bridge: Fix updating FDB entries when the PVID is applied
From: Vlad Yasevich @ 2013-10-16 16:11 UTC (permalink / raw)
  To: Stephen Hemminger, Toshiaki Makita
  Cc: David S . Miller, netdev, Toshiaki Makita
In-Reply-To: <20131016085715.3442e8c3@nehalam.linuxnetplumber.net>

On 10/16/2013 11:57 AM, Stephen Hemminger wrote:
> On Wed, 16 Oct 2013 17:07:16 +0900
> Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
>
>> We currently set the value that variable vid is pointing, which will be
>> used in FDB later, to 0 at br_allowed_ingress() when we receive untagged
>> or priority-tagged frames, even though the PVID is valid.
>> This leads to FDB updates in such a wrong way that they are learned with
>> VID 0.
>> Update the value to that of PVID if the PVID is applied.
>>
>> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
>> Reviewed-by: Vlad Yasevich <vyasevic@redhat.com>
>> ---
>>   net/bridge/br_vlan.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
>> index 5a9c44a..53f0990 100644
>> --- a/net/bridge/br_vlan.c
>> +++ b/net/bridge/br_vlan.c
>> @@ -217,6 +217,7 @@ bool br_allowed_ingress(struct net_bridge *br, struct net_port_vlans *v,
>>   		/* PVID is set on this port.  Any untagged or priority-tagged
>>   		 * ingress frame is considered to belong to this vlan.
>>   		 */
>> +		*vid = pvid;
>>   		if (likely(err))
>>   			/* Untagged Frame. */
>>   			__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), pvid);
>
>
> Ok, but side-effects seem like an indication of poor code logic
> flow design. Not your fault but part of the the per-vlan filtering code.
>

I'll see if I can re-work the code to get rid of the side-effects.

-vlad

^ permalink raw reply

* Re: [Xen-devel] [PATCH net-next v4 2/5] xen-netback: add support for IPv6 checksum offload from guest
From: Ian Campbell @ 2013-10-16 16:11 UTC (permalink / raw)
  To: Paul Durrant
  Cc: David Vrabel, Wei Liu, netdev@vger.kernel.org,
	xen-devel@lists.xen.org
In-Reply-To: <9AAE0902D5BC7E449B7C8E4E778ABCD0138D3C@AMSPEX01CL01.citrite.net>

On Mon, 2013-10-14 at 13:34 +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: David Vrabel
> > Sent: 14 October 2013 13:19
> > To: Wei Liu
> > Cc: Paul Durrant; netdev@vger.kernel.org; Ian Campbell; David Vrabel; xen-
> > devel@lists.xen.org
> > Subject: Re: [Xen-devel] [PATCH net-next v4 2/5] xen-netback: add support
> > for IPv6 checksum offload from guest
> > 
> > On 14/10/13 11:55, Wei Liu wrote:
> > > On Mon, Oct 14, 2013 at 11:49:20AM +0100, Paul Durrant wrote:
> > >>> -----Original Message-----
> > >>> From: Wei Liu [mailto:wei.liu2@citrix.com]
> > >>> Sent: 14 October 2013 11:43
> > >>> To: Paul Durrant
> > >>> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Wei Liu; David
> > Vrabel;
> > >>> Ian Campbell
> > >>> Subject: Re: [PATCH net-next v4 2/5] xen-netback: add support for IPv6
> > >>> checksum offload from guest
> > >>>
> > >>> On Fri, Oct 11, 2013 at 04:06:19PM +0100, Paul Durrant wrote:
> > >>> [...]
> > >>>> -/*
> > >>>> - * This is the amount of packet we copy rather than map, so that the
> > >>>> - * guest can't fiddle with the contents of the headers while we do
> > >>>> - * packet processing on them (netfilter, routing, etc).
> > >>>> +/* This is a miniumum size for the linear area to avoid lots of
> > >>>> + * calls to __pskb_pull_tail() as we set up checksum offsets.
> > >>>>   */
> > >>>
> > >>> You seem to forget to explain why 128 is chosen. :-)
> > >>
> > >> Is that not sufficient explanation? What sort of thing are you looking for?
> > >>
> > >
> > >>From the second version of this patch, we had a conversation.
> > >
> > >> Where does 128 come from?
> > >>
> > >
> > > "It's just an arbitrary power of 2 that was chosen because it seems to
> > > cover most likely v6 headers and all v4 headers."
> > >
> > > So something like: "We choose 128 which is likely to cover most V6
> > > headers and all V4 headers" would be sufficeint.
> > 
> > Is "most IPv6 headers" actually good enough?  Don't we need to ensure
> > netback copies all IP headers?
> > 
> 
> It will do if checksum offload is in use, but perhaps the pull as far
> as the transport header needs to be done anyway? I'm unsure of the
> expectations of other code.

I've always been under the impression that transport headers needed
pulling up too, for the benefit of netfilter perhaps?

AIUI the frags should be pure "payload". I may be wrong about that
though...

Ian.

^ permalink raw reply

* Re: [PATCH 0/5] rfkill-gpio: ACPI support
From: Rhyland Klein @ 2013-10-16 16:13 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: John W. Linville, Johannes Berg, Rafael J. Wysocki,
	linux-acpi@vger.kernel.org, linux-wireless@vger.kernel.org,
	netdev@vger.kernel.org
In-Reply-To: <1381920823-15403-1-git-send-email-heikki.krogerus@linux.intel.com>

On 10/16/2013 6:53 AM, Heikki Krogerus wrote:
> Hi,
> 
> The first patches prepare the driver for the support. The last patch
> can then add the support quite easily. With these patches, adding DT
> support later will be quite straight forward if someone needs it.
> 
> 
> Heikki Krogerus (5):
>   net: rfkill: gpio: convert to resource managed allocation
>   net: rfkill: gpio: clean up clock handling
>   net: rfkill: gpio: spinlock-safe GPIO access
>   net: rfkill: gpio: prepare for DT and ACPI support
>   net: rfkill: gpio: add ACPI support
> 
>  net/rfkill/Kconfig       |   2 +-
>  net/rfkill/rfkill-gpio.c | 211 ++++++++++++++++++++++-------------------------
>  2 files changed, 99 insertions(+), 114 deletions(-)
> 

Strictly speaking, duplicating the pdata fields into the
rfkill_gpio_data structure isn't really necessary. Many drivers simply
have the dt parsing or in this case ACPI parsing generate a
platform_data structure which would then be saved in the
rfkill_gpio_data structure.

In this case, since it is only a few fields, I am not too worried and I
am fine either way.

Acked-by: Rhyland Klein <rklein@nvidia.com>

-- 
nvpublic

^ permalink raw reply

* Re: [PATCH net-next v4 4/5] xen-netback: handle IPv6 TCP GSO packets from the guest
From: Ian Campbell @ 2013-10-16 16:15 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, netdev, Wei Liu, David Vrabel
In-Reply-To: <1381503982-1418-5-git-send-email-paul.durrant@citrix.com>

On Fri, 2013-10-11 at 16:06 +0100, Paul Durrant wrote:
> This patch adds a xenstore feature flag, festure-gso-tcpv6, to advertise
> that netback can handle IPv6 TCP GSO packets. It creates SKB_GSO_TCPV6 skbs
> if the frontend passes an extra segment with the new type
> XEN_NETIF_GSO_TYPE_TCPV6 added to netif.h.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

This patch looks fine but should it not be after the following patch
which actually causes the necessary pull ups to happen?

> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> ---
>  drivers/net/xen-netback/netback.c |   11 ++++++++---
>  drivers/net/xen-netback/xenbus.c  |    7 +++++++
>  include/xen/interface/io/netif.h  |   10 +++++++++-
>  3 files changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 74c13b9..65f04e7 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -1096,15 +1096,20 @@ static int xenvif_set_skb_gso(struct xenvif *vif,
>  		return -EINVAL;
>  	}
>  
> -	/* Currently only TCPv4 S.O. is supported. */
> -	if (gso->u.gso.type != XEN_NETIF_GSO_TYPE_TCPV4) {
> +	switch (gso->u.gso.type) {
> +	case XEN_NETIF_GSO_TYPE_TCPV4:
> +		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
> +		break;
> +	case XEN_NETIF_GSO_TYPE_TCPV6:
> +		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
> +		break;
> +	default:
>  		netdev_err(vif->dev, "Bad GSO type %d.\n", gso->u.gso.type);
>  		xenvif_fatal_tx_err(vif);
>  		return -EINVAL;
>  	}
>  
>  	skb_shinfo(skb)->gso_size = gso->u.gso.size;
> -	skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
>  
>  	/* Header must be checked, and gso_segs computed. */
>  	skb_shinfo(skb)->gso_type |= SKB_GSO_DODGY;
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> index 9c9b37d..7e4dcc9 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -105,6 +105,13 @@ static int netback_probe(struct xenbus_device *dev,
>  			goto abort_transaction;
>  		}
>  
> +		err = xenbus_printf(xbt, dev->nodename, "feature-gso-tcpv6",
> +				    "%d", sg);
> +		if (err) {
> +			message = "writing feature-gso-tcpv6";
> +			goto abort_transaction;
> +		}
> +
>  		/* We support partial checksum setup for IPv6 packets */
>  		err = xenbus_printf(xbt, dev->nodename,
>  				    "feature-ipv6-csum-offload",
> diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
> index d9fb44739..d7dd8d7 100644
> --- a/include/xen/interface/io/netif.h
> +++ b/include/xen/interface/io/netif.h
> @@ -61,6 +61,13 @@
>   */
>  
>  /*
> + * "feature-gso-tcpv4" and "feature-gso-tcpv6" advertise the capability to
> + * handle large TCP packets (in IPv4 or IPv6 form respectively). Neither
> + * frontends nor backends are assumed to be capable unless the flags are
> + * present.
> + */
> +
> +/*
>   * This is the 'wire' format for packets:
>   *  Request 1: xen_netif_tx_request  -- XEN_NETTXF_* (any flags)
>   * [Request 2: xen_netif_extra_info]    (only if request 1 has XEN_NETTXF_extra_info)
> @@ -105,8 +112,9 @@ struct xen_netif_tx_request {
>  #define _XEN_NETIF_EXTRA_FLAG_MORE	(0)
>  #define  XEN_NETIF_EXTRA_FLAG_MORE	(1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
>  
> -/* GSO types - only TCPv4 currently supported. */
> +/* GSO types */
>  #define XEN_NETIF_GSO_TYPE_TCPV4	(1)
> +#define XEN_NETIF_GSO_TYPE_TCPV6	(2)
>  
>  /*
>   * This structure needs to fit within both netif_tx_request and

^ permalink raw reply

* Re: [PATCH v2 net 2/4] bridge: Apply the PVID to priority-tagged frames
From: Vlad Yasevich @ 2013-10-16 16:16 UTC (permalink / raw)
  To: Stephen Hemminger, Toshiaki Makita
  Cc: David S . Miller, netdev, Toshiaki Makita
In-Reply-To: <20131016085537.1cbe9c37@nehalam.linuxnetplumber.net>

On 10/16/2013 11:55 AM, Stephen Hemminger wrote:
> On Wed, 16 Oct 2013 17:07:14 +0900
> Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
>
>> IEEE 802.1Q says that when we receive priority-tagged (VID 0) frames
>> use the PVID for the port as its VID.
>> (See IEEE 802.1Q-2011 6.9.1 and Table 9-2)
>>
>> Apply the PVID to not only untagged frames but also priority-tagged frames.
>>
>> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
>> ---
>>   net/bridge/br_vlan.c | 27 ++++++++++++++++++++-------
>>   1 file changed, 20 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
>> index 21b6d21..5a9c44a 100644
>> --- a/net/bridge/br_vlan.c
>> +++ b/net/bridge/br_vlan.c
>> @@ -189,6 +189,8 @@ out:
>>   bool br_allowed_ingress(struct net_bridge *br, struct net_port_vlans *v,
>>   			struct sk_buff *skb, u16 *vid)
>>   {
>> +	int err;
>> +
>>   	/* If VLAN filtering is disabled on the bridge, all packets are
>>   	 * permitted.
>>   	 */
>> @@ -201,20 +203,31 @@ bool br_allowed_ingress(struct net_bridge *br, struct net_port_vlans *v,
>>   	if (!v)
>>   		return false;
>>
>> -	if (br_vlan_get_tag(skb, vid)) {
>> +	err = br_vlan_get_tag(skb, vid);
>> +	if (!*vid) {
>>   		u16 pvid = br_get_pvid(v);
>
> Ok, but it looks like br_vlan_get_tag() could be cleaner if it just returned
> the tag, and there was another br_vlan_tag_present() function.

I was just thinking about that as well.  If we make br_vlan_get_tag()
return either the actual tag (if the packet is tagged), or the pvid
if (untagged/prio_tagged), then we can skp most of this.

>
> Also, does this still work if CONFIG_BRIDGE_VLAN_FILTERING is disabled?

Yes.  br_allowed_ingress becomes an inline if the config option is disabled.

-vlad

^ permalink raw reply

* Re: [PATCH net-next v4 4/5] xen-netback: handle IPv6 TCP GSO packets from the guest
From: Ian Campbell @ 2013-10-16 16:16 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, netdev, Wei Liu, David Vrabel
In-Reply-To: <1381940113.30409.5.camel@kazak.uk.xensource.com>

On Wed, 2013-10-16 at 17:15 +0100, Ian Campbell wrote:
> On Fri, 2013-10-11 at 16:06 +0100, Paul Durrant wrote:
> > This patch adds a xenstore feature flag, festure-gso-tcpv6, to advertise
> > that netback can handle IPv6 TCP GSO packets. It creates SKB_GSO_TCPV6 skbs
> > if the frontend passes an extra segment with the new type
> > XEN_NETIF_GSO_TYPE_TCPV6 added to netif.h.
> > 
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> 
> This patch looks fine but should it not be after the following patch
> which actually causes the necessary pull ups to happen?

Sorry, it is, I shouldn't review series in two halves, forget what I've
already seen!

> > Cc: Wei Liu <wei.liu2@citrix.com>
> > Cc: David Vrabel <david.vrabel@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

> > ---
> >  drivers/net/xen-netback/netback.c |   11 ++++++++---
> >  drivers/net/xen-netback/xenbus.c  |    7 +++++++
> >  include/xen/interface/io/netif.h  |   10 +++++++++-
> >  3 files changed, 24 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> > index 74c13b9..65f04e7 100644
> > --- a/drivers/net/xen-netback/netback.c
> > +++ b/drivers/net/xen-netback/netback.c
> > @@ -1096,15 +1096,20 @@ static int xenvif_set_skb_gso(struct xenvif *vif,
> >  		return -EINVAL;
> >  	}
> >  
> > -	/* Currently only TCPv4 S.O. is supported. */
> > -	if (gso->u.gso.type != XEN_NETIF_GSO_TYPE_TCPV4) {
> > +	switch (gso->u.gso.type) {
> > +	case XEN_NETIF_GSO_TYPE_TCPV4:
> > +		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
> > +		break;
> > +	case XEN_NETIF_GSO_TYPE_TCPV6:
> > +		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
> > +		break;
> > +	default:
> >  		netdev_err(vif->dev, "Bad GSO type %d.\n", gso->u.gso.type);
> >  		xenvif_fatal_tx_err(vif);
> >  		return -EINVAL;
> >  	}
> >  
> >  	skb_shinfo(skb)->gso_size = gso->u.gso.size;
> > -	skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
> >  
> >  	/* Header must be checked, and gso_segs computed. */
> >  	skb_shinfo(skb)->gso_type |= SKB_GSO_DODGY;
> > diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> > index 9c9b37d..7e4dcc9 100644
> > --- a/drivers/net/xen-netback/xenbus.c
> > +++ b/drivers/net/xen-netback/xenbus.c
> > @@ -105,6 +105,13 @@ static int netback_probe(struct xenbus_device *dev,
> >  			goto abort_transaction;
> >  		}
> >  
> > +		err = xenbus_printf(xbt, dev->nodename, "feature-gso-tcpv6",
> > +				    "%d", sg);
> > +		if (err) {
> > +			message = "writing feature-gso-tcpv6";
> > +			goto abort_transaction;
> > +		}
> > +
> >  		/* We support partial checksum setup for IPv6 packets */
> >  		err = xenbus_printf(xbt, dev->nodename,
> >  				    "feature-ipv6-csum-offload",
> > diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
> > index d9fb44739..d7dd8d7 100644
> > --- a/include/xen/interface/io/netif.h
> > +++ b/include/xen/interface/io/netif.h
> > @@ -61,6 +61,13 @@
> >   */
> >  
> >  /*
> > + * "feature-gso-tcpv4" and "feature-gso-tcpv6" advertise the capability to
> > + * handle large TCP packets (in IPv4 or IPv6 form respectively). Neither
> > + * frontends nor backends are assumed to be capable unless the flags are
> > + * present.
> > + */
> > +
> > +/*
> >   * This is the 'wire' format for packets:
> >   *  Request 1: xen_netif_tx_request  -- XEN_NETTXF_* (any flags)
> >   * [Request 2: xen_netif_extra_info]    (only if request 1 has XEN_NETTXF_extra_info)
> > @@ -105,8 +112,9 @@ struct xen_netif_tx_request {
> >  #define _XEN_NETIF_EXTRA_FLAG_MORE	(0)
> >  #define  XEN_NETIF_EXTRA_FLAG_MORE	(1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
> >  
> > -/* GSO types - only TCPv4 currently supported. */
> > +/* GSO types */
> >  #define XEN_NETIF_GSO_TYPE_TCPV4	(1)
> > +#define XEN_NETIF_GSO_TYPE_TCPV6	(2)
> >  
> >  /*
> >   * This structure needs to fit within both netif_tx_request and
> 

^ permalink raw reply

* Re: [Xen-devel] [PATCH net-next v4 0/5] xen-netback: IPv6 offload support
From: Ian Campbell @ 2013-10-16 16:20 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, netdev
In-Reply-To: <1381503982-1418-1-git-send-email-paul.durrant@citrix.com>

On Fri, 2013-10-11 at 16:06 +0100, Paul Durrant wrote:
> This patch series adds support for checksum and large packet offloads into
> xen-netback.
> Testing has mainly been done using the Microsoft network hardware
> certification suite running in Server 2008R2 VMs with Citrix PV frontends.

Are there any Linux netfront patches in existence/the pipeline to take
advantage of this?

> 
> v2:
> - Fixed Wei's email address in Cc lines
> 
> v3:
> - Responded to Wei's comments:
>  - netif.h now updated with comments and a definition of
>    XEN_NETIF_GSO_TYPE_NONE.
>  - limited number of pullups
> - Responded to Annie's comments:
>  - New GSO_BIT macro
> 
> v4:
> - Responded to more of Wei's comments
> - Remove parsing of IPv6 fragment header and added warning
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply

* Re: [PATCH] WAN: Adding support for Infineon PEF2256 E1 chipset
From: Rob Herring @ 2013-10-16 16:24 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Stephen Warren,
	Ian Campbell, Rob Landley, Grant Likely, Krzysztof Halasa,
	devicetree@vger.kernel.org, linux-doc,
	linux-kernel@vger.kernel.org, netdev, jerome.chantelauze
In-Reply-To: <201310161525.r9GFPZI5006238@localhost.localdomain>

On Wed, Oct 16, 2013 at 10:25 AM, Christophe Leroy
<christophe.leroy@c-s.fr> wrote:
> The patch adds WAN support for Infineon PEF2256 E1 Chipset.
>
> Signed-off-by: Jerome Chantelauze <jerome.chantelauze@c-s.fr>
> Acked-by: Christophe Leroy <christophe.leroy@c-s.fr>

[snip]

> diff -urN a/Documentation/devicetree/bindings/net/pef2256.txt b/Documentation/devicetree/bindings/net/pef2256.txt
> --- a/Documentation/devicetree/bindings/net/pef2256.txt 1970-01-01 01:00:00.000000000 +0100
> +++ b/Documentation/devicetree/bindings/net/pef2256.txt 2013-10-13 15:05:42.000000000 +0200
> @@ -0,0 +1,29 @@
> +* Wan on Infineon pef2256 E1 controller
> +
> +Required properties:
> +- compatible: Should be "infineon,pef2256"
> +- reg: Address and length of the register set for the device
> +- interrupts: Should contain interrupts
> +
> +Optional properties:
> +- data-rate: Data rate on the system highway.
> +  Supported values are: 2, 4, 8, 16.
> +  8 if not defined.

What are the units? Specify them in the property name.

> +- channel-phase: First time slot transmission channel phase.
> +  Supported values are: 0, 1, 2, 3, 4, 5, 6, 7.
> +  0 if not defined.

This description basically tells me nothing.

> +- rising-edge-sync-pulse: rising edge synchronous pulse.
> +  Supported values are: "receive", "transmit".
> +  "transmit" if not defined.

Are receive and transmit mutually exclusive? If so, then wouldn't a
single property like "rx-rising-edge-sync-pulse" be sufficient.

> +
> +Examples:
> +
> +       e1-wan@4,2000000 {
> +               compatible = "infineon,pef2256";
> +               reg = <4 0x2000000 0xFF>;
> +               interrupts = <8 1>;
> +               interrupt-parent = <&PIC>;
> +               data-rate = <4>;
> +               channel-phase = <1>;
> +               rising-edge-sync-pulse = "transmit";
> +       };
> diff -urN a/drivers/net/wan/Makefile b/drivers/net/wan/Makefile
> --- a/drivers/net/wan/Makefile  1970-01-01 01:00:00.000000000 +0100
> +++ b/drivers/net/wan/Makefile  2013-10-13 13:05:01.000000000 +0200
> @@ -22,6 +22,7 @@
>  obj-$(CONFIG_COSA)             += cosa.o
>  obj-$(CONFIG_FARSYNC)          += farsync.o
>  obj-$(CONFIG_DSCC4)             += dscc4.o
> +obj-$(CONFIG_PEF2256)           += pef2256.o
>  obj-$(CONFIG_X25_ASY)          += x25_asy.o
>
>  obj-$(CONFIG_LANMEDIA)         += lmc/
> diff -urN a/drivers/net/wan/Kconfig b/drivers/net/wan/Kconfig
> --- a/drivers/net/wan/Kconfig   1970-01-01 01:00:00.000000000 +0100
> +++ b/drivers/net/wan/Kconfig   2013-10-13 13:05:01.000000000 +0200
> @@ -266,6 +266,16 @@
>           To compile this driver as a module, choose M here: the
>           module will be called farsync.
>
> +config PEF2256
> +       tristate "PEF2256 support"
> +       depends on HDLC && OF && SYSFS

It would be better if this can build without OF selected.

Rob

^ permalink raw reply

* [PATCH net-next] tcp: remove redundant code in __tcp_retransmit_skb()
From: Neal Cardwell @ 2013-10-16 16:36 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Neal Cardwell, Eric Dumazet, Yuchung Cheng

Remove the specialized code in __tcp_retransmit_skb() that tries to trim
any ACKed payload preceding a FIN before we retransmit (this was added
in 1999 in v2.2.3pre3). This trimming code was made unreachable by the
more general code added above it that uses tcp_trim_head() to trim any
ACKed payload, with or without a FIN (this was added in "[NET]: Add
segmentation offload support to TCP." in 2002 circa v2.5.33).

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
 net/ipv4/tcp_output.c | 15 ---------------
 1 file changed, 15 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index e5ce0e1..ce7c4d9 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2351,21 +2351,6 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 
 	tcp_retrans_try_collapse(sk, skb, cur_mss);
 
-	/* Some Solaris stacks overoptimize and ignore the FIN on a
-	 * retransmit when old data is attached.  So strip it off
-	 * since it is cheap to do so and saves bytes on the network.
-	 */
-	if (skb->len > 0 &&
-	    (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN) &&
-	    tp->snd_una == (TCP_SKB_CB(skb)->end_seq - 1)) {
-		if (!pskb_trim(skb, 0)) {
-			/* Reuse, even though it does some unnecessary work */
-			tcp_init_nondata_skb(skb, TCP_SKB_CB(skb)->end_seq - 1,
-					     TCP_SKB_CB(skb)->tcp_flags);
-			skb->ip_summed = CHECKSUM_NONE;
-		}
-	}
-
 	/* Make a copy, if the first transmission SKB clone we made
 	 * is still in somebody's hands, else make a clone.
 	 */
-- 
1.8.4

^ permalink raw reply related

* Re: [PATCH net-next] tcp: remove redundant code in __tcp_retransmit_skb()
From: Eric Dumazet @ 2013-10-16 16:44 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: David Miller, netdev, Eric Dumazet, Yuchung Cheng
In-Reply-To: <1381941411-30645-1-git-send-email-ncardwell@google.com>

On Wed, 2013-10-16 at 12:36 -0400, Neal Cardwell wrote:
> Remove the specialized code in __tcp_retransmit_skb() that tries to trim
> any ACKed payload preceding a FIN before we retransmit (this was added
> in 1999 in v2.2.3pre3). This trimming code was made unreachable by the
> more general code added above it that uses tcp_trim_head() to trim any
> ACKed payload, with or without a FIN (this was added in "[NET]: Add
> segmentation offload support to TCP." in 2002 circa v2.5.33).
> 
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> ---
>  net/ipv4/tcp_output.c | 15 ---------------
>  1 file changed, 15 deletions(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* Re: [Xen-devel] DomU's network interface will hung when Dom0 running 32bit
From: jianhai luan @ 2013-10-16 16:44 UTC (permalink / raw)
  To: David Vrabel, Wei Liu; +Cc: xen-devel, annie li, Ian Campbell, netdev
In-Reply-To: <525EBABB.4070906@citrix.com>


On 2013-10-17 0:11, David Vrabel wrote:
> On 16/10/13 16:17, Wei Liu wrote:
>> On Wed, Oct 16, 2013 at 11:04:34PM +0800, jianhai luan wrote:
>> [...]
>>> >From ef02403a10173896c5c102f768741d0700b8a3a2 Mon Sep 17 00:00:00 2001
>>> From: Jason Luan <jianhai.luan@oracle.com>
>>> Date: Tue, 15 Oct 2013 17:07:49 +0800
>>> Subject: [PATCH] xen-netback: pending timer only in the range [expire,
>>>   next_credit)
>>>
>>> The function time_after_eq() do correct judge in range of MAX_UNLONG/2.
>>> If net-front send lesser package, the delta between now and next_credit
>>> will out of the range and time_after_eq() will do wrong judge in result
>>> to net-front hung.  For example:
>>>      expire    next_credit    ....    next_credit+MAX_UNLONG/2    now
>>>      -----------------time increases this direction----------------->
>>>
>>> We should be add the environment which now beyond next_credit+MAX_UNLONG/2.
>>> Because the fact now mustn't before expire, time_before(now, expire) == true
>>> will show the environment.
>>>      time_after_eq(now, next_credit) || time_before (now, expire)
>>>      ==
>>>      !time_in_range_open(now, expire, next_credit)
>>>
> I would like the description improved because it's too hard to understand.
>
> How about something like:
>
> "time_after_eq() only works if the delta is < MAX_ULONG/2.
>
> If netfront sends at a very low rate, the time between subsequent calls
> to tx_credit_exceeded() may exceed MAX_ULONG/2 and the test for
> timer_after_eq() will be incorrect.  Credit will not be replenished and
> the guest may become unable to send (e.g., if prior to the long gap, all
> credit was exhausted)."

Thanks your description, i will accept it. :)
>
> But that's as far as I get because I can't see how the fix is correct.
> The time_in_range() test might still return the wrong value if now has
> advanced even further and wrapped so it is between expire and
> next_credit again.

typo, time_in_range() should be time_in_range_open().
Yes, if now have advanced even further and wrapped, it will always fall 
in [ expire, next_credit). In the range, please think two scenario:
    * No transmit limit: expire == next_credit, the range will be zero, 
replenish will always be done.
    * Transmit limit: Because guest may be consume all credit_bytes in 
very short time, other time in [expire, next_credit) will don't send any 
package. So the time which don't send package should be think about when 
we set the rate parameter.
       So if now fall in the range, the hung time should be acceptable. 
(if rate=10000M/s, the worse time will be 4s).
>
> I think the credit timeout should be always armed to expire in
> MAX_ULONG/4 jiffies (or some other large value).  If credit is exceeded,
> this timer is then adjusted to fire earlier (at next_credit as it does
> already).

Setting timer may be fixed the issue. But i don't think how to verify 
the fixed expect waiting 180 days. I verified the above patch only 
change expire's value to emulator the scenario.
>
> David

Jason.

^ permalink raw reply

* Re: [PATCH net-next v4 5/5] xen-netback: enable IPv6 TCP GSO to the guest
From: Ian Campbell @ 2013-10-16 16:49 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, netdev, Wei Liu, David Vrabel
In-Reply-To: <1381503982-1418-6-git-send-email-paul.durrant@citrix.com>

On Fri, 2013-10-11 at 16:06 +0100, Paul Durrant wrote:
> This patch adds code to handle SKB_GSO_TCPV6 skbs and construct appropriate
> extra or prefix segments to pass the large packet to the frontend. New
> xenstore flags, feature-gso-tcpv6 and feature-gso-tcpv6-prefix, are sampled
> to determine if the frontend is capable of handling such packets.

IIRC the reason we have feature-gso-tcpv4 and feature-gso-tcpv4-prefix
is that the former did things in a way which Windows couldn't cope with.
I assuming that is true for v6 too. But could Linux cope with the prefix
version too for v6 and reduce the number of options? Or is the
non-prefix variant actually better, if the guest can manage, for some
reason?

I suppose in the end its all piggybacking off the v4 code paths so
supporting both isn't a hardship.


> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> ---
>  drivers/net/xen-netback/common.h    |    9 +++++--
>  drivers/net/xen-netback/interface.c |    6 +++--
>  drivers/net/xen-netback/netback.c   |   48 +++++++++++++++++++++++++++--------
>  drivers/net/xen-netback/xenbus.c    |   29 +++++++++++++++++++--
>  include/xen/interface/io/netif.h    |    1 +
>  5 files changed, 77 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index b4a9a3c..55b8dec 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -87,9 +87,13 @@ struct pending_tx_info {
>  struct xenvif_rx_meta {
>  	int id;
>  	int size;
> +	int gso_type;
>  	int gso_size;
>  };
>  
> +#define GSO_BIT(type) \
> +	(1 << XEN_NETIF_GSO_TYPE_ ## type)
> +
>  /* Discriminate from any valid pending_idx value. */
>  #define INVALID_PENDING_IDX 0xFFFF
>  
> @@ -150,9 +154,10 @@ struct xenvif {
>  	u8               fe_dev_addr[6];
>  
>  	/* Frontend feature information. */
> +	int gso_mask;
> +	int gso_prefix_mask;

Are these storing NETIF_F_FOO? In which case they should be
netif_feature_t I think. At the least it needs to be unsigned.

> +
>  	u8 can_sg:1;
> -	u8 gso:1;
> -	u8 gso_prefix:1;
>  	u8 ip_csum:1;
>  	u8 ipv6_csum:1;
>  
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index cb0d8ea..e4aa267 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -214,8 +214,10 @@ static netdev_features_t xenvif_fix_features(struct net_device *dev,
>  
>  	if (!vif->can_sg)
>  		features &= ~NETIF_F_SG;
> -	if (!vif->gso && !vif->gso_prefix)
> +	if (~(vif->gso_mask | vif->gso_prefix_mask) & GSO_BIT(TCPV4))

I must be blind -- where does this GSO_BIT macro come from?

I can't see it in current net-next...

> @@ -392,7 +394,14 @@ static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>  		}
>  
>  		/* Leave a gap for the GSO descriptor. */
> -		if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
> +		if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
> +			gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
> +		else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)

I think you can use skb_is_gso(skb) and skb_is_gso_v6(skb) for these
ifs.

> +			gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
> +		else
> +			gso_type = XEN_NETIF_GSO_TYPE_NONE;
> +
> +		if (*head && ((1 << gso_type) & vif->gso_mask))

Perhaps test_bit(gso_type, &vif->gso_mask) rather than opencoding
bitops?

I see there are a lot of these, a vif_can_gso_type(vif, gso_type) helper
might be nice.

>  			vif->rx.req_cons++;
>  
>  		*head = 0; /* There must be something in this buffer now. */
> @@ -423,14 +432,28 @@ static int xenvif_gop_skb(struct sk_buff *skb,
>  	unsigned char *data;
>  	int head = 1;
>  	int old_meta_prod;
> +	int gso_type;
> +	int gso_size;
>  
>  	old_meta_prod = npo->meta_prod;
>  
> +	if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) {
> +		gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
> +		gso_size = skb_shinfo(skb)->gso_size;
> +	} else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6) {
> +		gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
> +		gso_size = skb_shinfo(skb)->gso_size;
> +	} else {
> +		gso_type = XEN_NETIF_GSO_TYPE_NONE;
> +		gso_size = 0;
> +	}
> +
>  	/* Set up a GSO prefix descriptor, if necessary */
> -	if (skb_shinfo(skb)->gso_size && vif->gso_prefix) {
> +	if ((1 << skb_shinfo(skb)->gso_type) & vif->gso_prefix_mask) {

Isn't skb->gso_type a Linux GSO flag thing and vif->gso_prefix_mask a
Xen netif.h GSO type? Are they really comparable in this way?

>  		req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
>  		meta = npo->meta + npo->meta_prod++;
> -		meta->gso_size = skb_shinfo(skb)->gso_size;
> +		meta->gso_type = gso_type;
> +		meta->gso_size = gso_size;
>  		meta->size = 0;
>  		meta->id = req->id;
>  	}

> +	if (vif->gso_mask & vif->gso_prefix_mask) {
> +		xenbus_dev_fatal(dev, err,
> +				 "%s: gso and gso prefix flags are not "
> +				 "mutually exclusive",

Aren't they? I thought they were? Maybe I'm reading this error message
backwards, in which case I would contend that it is written
backwards ;-)

Ian.

^ permalink raw reply

* [PATCH net-next v5 1/5] xen-netback: add support for IPv6 checksum offload to guest
From: Paul Durrant @ 2013-10-16 16:50 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381942232-26268-1-git-send-email-paul.durrant@citrix.com>

Check xenstore flag feature-ipv6-csum-offload to determine if a
guest is happy to accept IPv6 packets with only partial checksum.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/common.h    |    3 ++-
 drivers/net/xen-netback/interface.c |   10 +++++++---
 drivers/net/xen-netback/xenbus.c    |    7 ++++++-
 include/xen/interface/io/netif.h    |    7 +++++++
 4 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 5715318..b4a9a3c 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -153,7 +153,8 @@ struct xenvif {
 	u8 can_sg:1;
 	u8 gso:1;
 	u8 gso_prefix:1;
-	u8 csum:1;
+	u8 ip_csum:1;
+	u8 ipv6_csum:1;
 
 	/* Internal feature information. */
 	u8 can_queue:1;	    /* can queue packets for receiver? */
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 01bb854..8e92783 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -216,8 +216,10 @@ static netdev_features_t xenvif_fix_features(struct net_device *dev,
 		features &= ~NETIF_F_SG;
 	if (!vif->gso && !vif->gso_prefix)
 		features &= ~NETIF_F_TSO;
-	if (!vif->csum)
+	if (!vif->ip_csum)
 		features &= ~NETIF_F_IP_CSUM;
+	if (!vif->ipv6_csum)
+		features &= ~NETIF_F_IPV6_CSUM;
 
 	return features;
 }
@@ -306,7 +308,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	vif->domid  = domid;
 	vif->handle = handle;
 	vif->can_sg = 1;
-	vif->csum = 1;
+	vif->ip_csum = 1;
 	vif->dev = dev;
 
 	vif->credit_bytes = vif->remaining_credit = ~0UL;
@@ -316,7 +318,9 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	vif->credit_timeout.expires = jiffies;
 
 	dev->netdev_ops	= &xenvif_netdev_ops;
-	dev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
+	dev->hw_features = NETIF_F_SG |
+		NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
+		NETIF_F_TSO;
 	dev->features = dev->hw_features;
 	SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
 
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 1b08d87..ad27b15 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -574,7 +574,12 @@ static int connect_rings(struct backend_info *be)
 	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-no-csum-offload",
 			 "%d", &val) < 0)
 		val = 0;
-	vif->csum = !val;
+	vif->ip_csum = !val;
+
+	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-ipv6-csum-offload",
+			 "%d", &val) < 0)
+		val = 0;
+	vif->ipv6_csum = !!val;
 
 	/* Map the shared frame, irq etc. */
 	err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref,
diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
index eb262e3..c9e8184 100644
--- a/include/xen/interface/io/netif.h
+++ b/include/xen/interface/io/netif.h
@@ -51,6 +51,13 @@
  */
 
 /*
+ * "feature-no-csum-offload" should be used to turn IPv4 TCP/UDP checksum
+ * offload off or on. If it is missing then the feature is assumed to be on.
+ * "feature-ipv6-csum-offload" should be used to turn IPv6 TCP/UDP checksum
+ * offload on or off. If it is missing then the feature is assumed to be off.
+ */
+
+/*
  * This is the 'wire' format for packets:
  *  Request 1: xen_netif_tx_request  -- XEN_NETTXF_* (any flags)
  * [Request 2: xen_netif_extra_info]    (only if request 1 has XEN_NETTXF_extra_info)
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next v5 0/5] xen-netback: IPv6 offload support
From: Paul Durrant @ 2013-10-16 16:50 UTC (permalink / raw)
  To: xen-devel, netdev

This patch series adds support for checksum and large packet offloads into
xen-netback.
Testing has mainly been done using the Microsoft network hardware
certification suite running in Server 2008R2 VMs with Citrix PV frontends.

v2:
- Fixed Wei's email address in Cc lines

v3:
- Responded to Wei's comments:
 - netif.h now updated with comments and a definition of
   XEN_NETIF_GSO_TYPE_NONE.
 - limited number of pullups
- Responded to Annie's comments:
 - New GSO_BIT macro

v4:
- Responded to more of Wei's comments
- Remove parsing of IPv6 fragment header and added warning

v5:
- Added comment concerning the value chosen for PKT_PROT_LEN
- Dropped deprecation of feature-no-csum-offload

^ permalink raw reply

* [PATCH net-next v5 3/5] xen-netback: Unconditionally set NETIF_F_RXCSUM
From: Paul Durrant @ 2013-10-16 16:50 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381942232-26268-1-git-send-email-paul.durrant@citrix.com>

There is no mechanism to insist that a guest always generates a packet
with good checksum (at least for IPv4) so we must handle checksum
offloading from the guest and hence should set NETIF_F_RXCSUM.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/interface.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 8e92783..cb0d8ea 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -321,7 +321,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	dev->hw_features = NETIF_F_SG |
 		NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
 		NETIF_F_TSO;
-	dev->features = dev->hw_features;
+	dev->features = dev->hw_features | NETIF_F_RXCSUM;
 	SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
 
 	dev->tx_queue_len = XENVIF_QUEUE_LENGTH;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next v5 4/5] xen-netback: handle IPv6 TCP GSO packets from the guest
From: Paul Durrant @ 2013-10-16 16:50 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel
In-Reply-To: <1381942232-26268-1-git-send-email-paul.durrant@citrix.com>

This patch adds a xenstore feature flag, festure-gso-tcpv6, to advertise
that netback can handle IPv6 TCP GSO packets. It creates SKB_GSO_TCPV6 skbs
if the frontend passes an extra segment with the new type
XEN_NETIF_GSO_TYPE_TCPV6 added to netif.h.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/netback.c |   11 ++++++++---
 drivers/net/xen-netback/xenbus.c  |    7 +++++++
 include/xen/interface/io/netif.h  |   10 +++++++++-
 3 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 4271f8d..0e327d4 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1098,15 +1098,20 @@ static int xenvif_set_skb_gso(struct xenvif *vif,
 		return -EINVAL;
 	}
 
-	/* Currently only TCPv4 S.O. is supported. */
-	if (gso->u.gso.type != XEN_NETIF_GSO_TYPE_TCPV4) {
+	switch (gso->u.gso.type) {
+	case XEN_NETIF_GSO_TYPE_TCPV4:
+		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
+		break;
+	case XEN_NETIF_GSO_TYPE_TCPV6:
+		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
+		break;
+	default:
 		netdev_err(vif->dev, "Bad GSO type %d.\n", gso->u.gso.type);
 		xenvif_fatal_tx_err(vif);
 		return -EINVAL;
 	}
 
 	skb_shinfo(skb)->gso_size = gso->u.gso.size;
-	skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
 
 	/* Header must be checked, and gso_segs computed. */
 	skb_shinfo(skb)->gso_type |= SKB_GSO_DODGY;
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 108e752..02cb00b 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -105,6 +105,13 @@ static int netback_probe(struct xenbus_device *dev,
 			goto abort_transaction;
 		}
 
+		err = xenbus_printf(xbt, dev->nodename, "feature-gso-tcpv6",
+				    "%d", sg);
+		if (err) {
+			message = "writing feature-gso-tcpv6";
+			goto abort_transaction;
+		}
+
 		/* We support partial checksum setup for IPv6 packets */
 		err = xenbus_printf(xbt, dev->nodename,
 				    "feature-ipv6-csum-offload",
diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
index c9e8184..5e766eb 100644
--- a/include/xen/interface/io/netif.h
+++ b/include/xen/interface/io/netif.h
@@ -58,6 +58,13 @@
  */
 
 /*
+ * "feature-gso-tcpv4" and "feature-gso-tcpv6" advertise the capability to
+ * handle large TCP packets (in IPv4 or IPv6 form respectively). Neither
+ * frontends nor backends are assumed to be capable unless the flags are
+ * present.
+ */
+
+/*
  * This is the 'wire' format for packets:
  *  Request 1: xen_netif_tx_request  -- XEN_NETTXF_* (any flags)
  * [Request 2: xen_netif_extra_info]    (only if request 1 has XEN_NETTXF_extra_info)
@@ -102,8 +109,9 @@ struct xen_netif_tx_request {
 #define _XEN_NETIF_EXTRA_FLAG_MORE	(0)
 #define  XEN_NETIF_EXTRA_FLAG_MORE	(1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
 
-/* GSO types - only TCPv4 currently supported. */
+/* GSO types */
 #define XEN_NETIF_GSO_TYPE_TCPV4	(1)
+#define XEN_NETIF_GSO_TYPE_TCPV6	(2)
 
 /*
  * This structure needs to fit within both netif_tx_request and
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next v5 2/5] xen-netback: add support for IPv6 checksum offload from guest
From: Paul Durrant @ 2013-10-16 16:50 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381942232-26268-1-git-send-email-paul.durrant@citrix.com>

For performance of VM to VM traffic on a single host it is better to avoid
calculation of TCP/UDP checksum in the sending frontend. To allow this this
patch adds the code necessary to set up partial checksum for IPv6 packets
and xenstore flag feature-ipv6-csum-offload to advertise that fact to
frontends.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/netback.c |  235 +++++++++++++++++++++++++++++++------
 drivers/net/xen-netback/xenbus.c  |    9 ++
 2 files changed, 205 insertions(+), 39 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index f3e591c..4271f8d 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -109,15 +109,12 @@ static inline unsigned long idx_to_kaddr(struct xenvif *vif,
 	return (unsigned long)pfn_to_kaddr(idx_to_pfn(vif, idx));
 }
 
-/*
- * This is the amount of packet we copy rather than map, so that the
- * guest can't fiddle with the contents of the headers while we do
- * packet processing on them (netfilter, routing, etc).
+/* This is a miniumum size for the linear area to avoid lots of
+ * calls to __pskb_pull_tail() as we set up checksum offsets. The
+ * value 128 was chosen as it covers all IPv4 and most likely
+ * IPv6 headers.
  */
-#define PKT_PROT_LEN    (ETH_HLEN + \
-			 VLAN_HLEN + \
-			 sizeof(struct iphdr) + MAX_IPOPTLEN + \
-			 sizeof(struct tcphdr) + MAX_TCP_OPTION_SPACE)
+#define PKT_PROT_LEN 128
 
 static u16 frag_get_pending_idx(skb_frag_t *frag)
 {
@@ -1118,61 +1115,74 @@ static int xenvif_set_skb_gso(struct xenvif *vif,
 	return 0;
 }
 
-static int checksum_setup(struct xenvif *vif, struct sk_buff *skb)
+static inline void maybe_pull_tail(struct sk_buff *skb, unsigned int len)
+{
+	if (skb_is_nonlinear(skb) && skb_headlen(skb) < len) {
+		/* If we need to pullup then pullup to the max, so we
+		 * won't need to do it again.
+		 */
+		int target = min_t(int, skb->len, MAX_TCP_HEADER);
+		__pskb_pull_tail(skb, target - skb_headlen(skb));
+	}
+}
+
+static int checksum_setup_ip(struct xenvif *vif, struct sk_buff *skb,
+			     int recalculate_partial_csum)
 {
-	struct iphdr *iph;
+	struct iphdr *iph = (void *)skb->data;
+	unsigned int header_size;
+	unsigned int off;
 	int err = -EPROTO;
-	int recalculate_partial_csum = 0;
 
-	/*
-	 * A GSO SKB must be CHECKSUM_PARTIAL. However some buggy
-	 * peers can fail to set NETRXF_csum_blank when sending a GSO
-	 * frame. In this case force the SKB to CHECKSUM_PARTIAL and
-	 * recalculate the partial checksum.
-	 */
-	if (skb->ip_summed != CHECKSUM_PARTIAL && skb_is_gso(skb)) {
-		vif->rx_gso_checksum_fixup++;
-		skb->ip_summed = CHECKSUM_PARTIAL;
-		recalculate_partial_csum = 1;
-	}
+	off = sizeof(struct iphdr);
 
-	/* A non-CHECKSUM_PARTIAL SKB does not require setup. */
-	if (skb->ip_summed != CHECKSUM_PARTIAL)
-		return 0;
+	header_size = skb->network_header + off + MAX_IPOPTLEN;
+	maybe_pull_tail(skb, header_size);
 
-	if (skb->protocol != htons(ETH_P_IP))
-		goto out;
+	off = iph->ihl * 4;
 
-	iph = (void *)skb->data;
 	switch (iph->protocol) {
 	case IPPROTO_TCP:
-		if (!skb_partial_csum_set(skb, 4 * iph->ihl,
+		if (!skb_partial_csum_set(skb, off,
 					  offsetof(struct tcphdr, check)))
 			goto out;
 
 		if (recalculate_partial_csum) {
 			struct tcphdr *tcph = tcp_hdr(skb);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct tcphdr);
+			maybe_pull_tail(skb, header_size);
+
 			tcph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr,
-							 skb->len - iph->ihl*4,
+							 skb->len - off,
 							 IPPROTO_TCP, 0);
 		}
 		break;
 	case IPPROTO_UDP:
-		if (!skb_partial_csum_set(skb, 4 * iph->ihl,
+		if (!skb_partial_csum_set(skb, off,
 					  offsetof(struct udphdr, check)))
 			goto out;
 
 		if (recalculate_partial_csum) {
 			struct udphdr *udph = udp_hdr(skb);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct udphdr);
+			maybe_pull_tail(skb, header_size);
+
 			udph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr,
-							 skb->len - iph->ihl*4,
+							 skb->len - off,
 							 IPPROTO_UDP, 0);
 		}
 		break;
 	default:
 		if (net_ratelimit())
 			netdev_err(vif->dev,
-				   "Attempting to checksum a non-TCP/UDP packet, dropping a protocol %d packet\n",
+				   "Attempting to checksum a non-TCP/UDP packet, "
+				   "dropping a protocol %d packet\n",
 				   iph->protocol);
 		goto out;
 	}
@@ -1183,6 +1193,158 @@ out:
 	return err;
 }
 
+static int checksum_setup_ipv6(struct xenvif *vif, struct sk_buff *skb,
+			       int recalculate_partial_csum)
+{
+	int err = -EPROTO;
+	struct ipv6hdr *ipv6h = (void *)skb->data;
+	u8 nexthdr;
+	unsigned int header_size;
+	unsigned int off;
+	bool fragment;
+	bool done;
+
+	done = false;
+
+	off = sizeof(struct ipv6hdr);
+
+	header_size = skb->network_header + off;
+	maybe_pull_tail(skb, header_size);
+
+	nexthdr = ipv6h->nexthdr;
+
+	while ((off <= sizeof(struct ipv6hdr) + ntohs(ipv6h->payload_len)) &&
+	       !done) {
+		switch (nexthdr) {
+		case IPPROTO_DSTOPTS:
+		case IPPROTO_HOPOPTS:
+		case IPPROTO_ROUTING: {
+			struct ipv6_opt_hdr *hp = (void *)(skb->data + off);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct ipv6_opt_hdr);
+			maybe_pull_tail(skb, header_size);
+
+			nexthdr = hp->nexthdr;
+			off += ipv6_optlen(hp);
+			break;
+		}
+		case IPPROTO_AH: {
+			struct ip_auth_hdr *hp = (void *)(skb->data + off);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct ip_auth_hdr);
+			maybe_pull_tail(skb, header_size);
+
+			nexthdr = hp->nexthdr;
+			off += (hp->hdrlen+2)<<2;
+			break;
+		}
+		case IPPROTO_FRAGMENT:
+			fragment = true;
+			/* fall through */
+		default:
+			done = true;
+			break;
+		}
+	}
+
+	if (!done) {
+		if (net_ratelimit())
+			netdev_err(vif->dev, "Failed to parse packet header\n");
+		goto out;
+	}
+
+	if (fragment) {
+		if (net_ratelimit())
+			netdev_err(vif->dev, "Packet is a fragment!\n");
+		goto out;
+	}
+
+	switch (nexthdr) {
+	case IPPROTO_TCP:
+		if (!skb_partial_csum_set(skb, off,
+					  offsetof(struct tcphdr, check)))
+			goto out;
+
+		if (recalculate_partial_csum) {
+			struct tcphdr *tcph = tcp_hdr(skb);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct tcphdr);
+			maybe_pull_tail(skb, header_size);
+
+			tcph->check = ~csum_ipv6_magic(&ipv6h->saddr,
+						       &ipv6h->daddr,
+						       skb->len - off,
+						       IPPROTO_TCP, 0);
+		}
+		break;
+	case IPPROTO_UDP:
+		if (!skb_partial_csum_set(skb, off,
+					  offsetof(struct udphdr, check)))
+			goto out;
+
+		if (recalculate_partial_csum) {
+			struct udphdr *udph = udp_hdr(skb);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct udphdr);
+			maybe_pull_tail(skb, header_size);
+
+			udph->check = ~csum_ipv6_magic(&ipv6h->saddr,
+						       &ipv6h->daddr,
+						       skb->len - off,
+						       IPPROTO_UDP, 0);
+		}
+		break;
+	default:
+		if (net_ratelimit())
+			netdev_err(vif->dev,
+				   "Attempting to checksum a non-TCP/UDP packet, "
+				   "dropping a protocol %d packet\n",
+				   nexthdr);
+		goto out;
+	}
+
+	err = 0;
+
+out:
+	return err;
+}
+
+static int checksum_setup(struct xenvif *vif, struct sk_buff *skb)
+{
+	int err = -EPROTO;
+	int recalculate_partial_csum = 0;
+
+	/* A GSO SKB must be CHECKSUM_PARTIAL. However some buggy
+	 * peers can fail to set NETRXF_csum_blank when sending a GSO
+	 * frame. In this case force the SKB to CHECKSUM_PARTIAL and
+	 * recalculate the partial checksum.
+	 */
+	if (skb->ip_summed != CHECKSUM_PARTIAL && skb_is_gso(skb)) {
+		vif->rx_gso_checksum_fixup++;
+		skb->ip_summed = CHECKSUM_PARTIAL;
+		recalculate_partial_csum = 1;
+	}
+
+	/* A non-CHECKSUM_PARTIAL SKB does not require setup. */
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		return 0;
+
+	if (skb->protocol == htons(ETH_P_IP))
+		err = checksum_setup_ip(vif, skb, recalculate_partial_csum);
+	else if (skb->protocol == htons(ETH_P_IPV6))
+		err = checksum_setup_ipv6(vif, skb, recalculate_partial_csum);
+
+	return err;
+}
+
 static bool tx_credit_exceeded(struct xenvif *vif, unsigned size)
 {
 	unsigned long now = jiffies;
@@ -1428,12 +1590,7 @@ static int xenvif_tx_submit(struct xenvif *vif, int budget)
 
 		xenvif_fill_frags(vif, skb);
 
-		/*
-		 * If the initial fragment was < PKT_PROT_LEN then
-		 * pull through some bytes from the other fragments to
-		 * increase the linear region to PKT_PROT_LEN bytes.
-		 */
-		if (skb_headlen(skb) < PKT_PROT_LEN && skb_is_nonlinear(skb)) {
+		if (skb_is_nonlinear(skb) && skb_headlen(skb) < PKT_PROT_LEN) {
 			int target = min_t(int, skb->len, PKT_PROT_LEN);
 			__pskb_pull_tail(skb, target - skb_headlen(skb));
 		}
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index ad27b15..108e752 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -105,6 +105,15 @@ static int netback_probe(struct xenbus_device *dev,
 			goto abort_transaction;
 		}
 
+		/* We support partial checksum setup for IPv6 packets */
+		err = xenbus_printf(xbt, dev->nodename,
+				    "feature-ipv6-csum-offload",
+				    "%d", 1);
+		if (err) {
+			message = "writing feature-ipv6-csum-offload";
+			goto abort_transaction;
+		}
+
 		/* We support rx-copy path. */
 		err = xenbus_printf(xbt, dev->nodename,
 				    "feature-rx-copy", "%d", 1);
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next v5 5/5] xen-netback: enable IPv6 TCP GSO to the guest
From: Paul Durrant @ 2013-10-16 16:50 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381942232-26268-1-git-send-email-paul.durrant@citrix.com>

This patch adds code to handle SKB_GSO_TCPV6 skbs and construct appropriate
extra or prefix segments to pass the large packet to the frontend. New
xenstore flags, feature-gso-tcpv6 and feature-gso-tcpv6-prefix, are sampled
to determine if the frontend is capable of handling such packets.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/common.h    |    9 +++++--
 drivers/net/xen-netback/interface.c |    6 +++--
 drivers/net/xen-netback/netback.c   |   48 +++++++++++++++++++++++++++--------
 drivers/net/xen-netback/xenbus.c    |   29 +++++++++++++++++++--
 include/xen/interface/io/netif.h    |    1 +
 5 files changed, 77 insertions(+), 16 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index b4a9a3c..55b8dec 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -87,9 +87,13 @@ struct pending_tx_info {
 struct xenvif_rx_meta {
 	int id;
 	int size;
+	int gso_type;
 	int gso_size;
 };
 
+#define GSO_BIT(type) \
+	(1 << XEN_NETIF_GSO_TYPE_ ## type)
+
 /* Discriminate from any valid pending_idx value. */
 #define INVALID_PENDING_IDX 0xFFFF
 
@@ -150,9 +154,10 @@ struct xenvif {
 	u8               fe_dev_addr[6];
 
 	/* Frontend feature information. */
+	int gso_mask;
+	int gso_prefix_mask;
+
 	u8 can_sg:1;
-	u8 gso:1;
-	u8 gso_prefix:1;
 	u8 ip_csum:1;
 	u8 ipv6_csum:1;
 
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index cb0d8ea..e4aa267 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -214,8 +214,10 @@ static netdev_features_t xenvif_fix_features(struct net_device *dev,
 
 	if (!vif->can_sg)
 		features &= ~NETIF_F_SG;
-	if (!vif->gso && !vif->gso_prefix)
+	if (~(vif->gso_mask | vif->gso_prefix_mask) & GSO_BIT(TCPV4))
 		features &= ~NETIF_F_TSO;
+	if (~(vif->gso_mask | vif->gso_prefix_mask) & GSO_BIT(TCPV6))
+		features &= ~NETIF_F_TSO6;
 	if (!vif->ip_csum)
 		features &= ~NETIF_F_IP_CSUM;
 	if (!vif->ipv6_csum)
@@ -320,7 +322,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	dev->netdev_ops	= &xenvif_netdev_ops;
 	dev->hw_features = NETIF_F_SG |
 		NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
-		NETIF_F_TSO;
+		NETIF_F_TSO | NETIF_F_TSO6;
 	dev->features = dev->hw_features | NETIF_F_RXCSUM;
 	SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 0e327d4..828fdab 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -142,7 +142,7 @@ static int max_required_rx_slots(struct xenvif *vif)
 	int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
 
 	/* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
-	if (vif->can_sg || vif->gso || vif->gso_prefix)
+	if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
 		max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
 
 	return max;
@@ -314,6 +314,7 @@ static struct xenvif_rx_meta *get_next_rx_buffer(struct xenvif *vif,
 	req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
 
 	meta = npo->meta + npo->meta_prod++;
+	meta->gso_type = XEN_NETIF_GSO_TYPE_NONE;
 	meta->gso_size = 0;
 	meta->size = 0;
 	meta->id = req->id;
@@ -336,6 +337,7 @@ static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 	struct gnttab_copy *copy_gop;
 	struct xenvif_rx_meta *meta;
 	unsigned long bytes;
+	int gso_type;
 
 	/* Data must not cross a page boundary. */
 	BUG_ON(size + offset > PAGE_SIZE<<compound_order(page));
@@ -394,7 +396,14 @@ static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 		}
 
 		/* Leave a gap for the GSO descriptor. */
-		if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
+		if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
+			gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
+		else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
+			gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
+		else
+			gso_type = XEN_NETIF_GSO_TYPE_NONE;
+
+		if (*head && ((1 << gso_type) & vif->gso_mask))
 			vif->rx.req_cons++;
 
 		*head = 0; /* There must be something in this buffer now. */
@@ -425,14 +434,28 @@ static int xenvif_gop_skb(struct sk_buff *skb,
 	unsigned char *data;
 	int head = 1;
 	int old_meta_prod;
+	int gso_type;
+	int gso_size;
 
 	old_meta_prod = npo->meta_prod;
 
+	if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) {
+		gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
+		gso_size = skb_shinfo(skb)->gso_size;
+	} else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6) {
+		gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
+		gso_size = skb_shinfo(skb)->gso_size;
+	} else {
+		gso_type = XEN_NETIF_GSO_TYPE_NONE;
+		gso_size = 0;
+	}
+
 	/* Set up a GSO prefix descriptor, if necessary */
-	if (skb_shinfo(skb)->gso_size && vif->gso_prefix) {
+	if ((1 << skb_shinfo(skb)->gso_type) & vif->gso_prefix_mask) {
 		req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
 		meta = npo->meta + npo->meta_prod++;
-		meta->gso_size = skb_shinfo(skb)->gso_size;
+		meta->gso_type = gso_type;
+		meta->gso_size = gso_size;
 		meta->size = 0;
 		meta->id = req->id;
 	}
@@ -440,10 +463,13 @@ static int xenvif_gop_skb(struct sk_buff *skb,
 	req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
 	meta = npo->meta + npo->meta_prod++;
 
-	if (!vif->gso_prefix)
-		meta->gso_size = skb_shinfo(skb)->gso_size;
-	else
+	if ((1 << gso_type) & vif->gso_mask) {
+		meta->gso_type = gso_type;
+		meta->gso_size = gso_size;
+	} else {
+		meta->gso_type = XEN_NETIF_GSO_TYPE_NONE;
 		meta->gso_size = 0;
+	}
 
 	meta->size = 0;
 	meta->id = req->id;
@@ -589,7 +615,8 @@ void xenvif_rx_action(struct xenvif *vif)
 
 		vif = netdev_priv(skb->dev);
 
-		if (vif->meta[npo.meta_cons].gso_size && vif->gso_prefix) {
+		if ((1 << vif->meta[npo.meta_cons].gso_type) &
+		    vif->gso_prefix_mask) {
 			resp = RING_GET_RESPONSE(&vif->rx,
 						 vif->rx.rsp_prod_pvt++);
 
@@ -626,7 +653,8 @@ void xenvif_rx_action(struct xenvif *vif)
 					vif->meta[npo.meta_cons].size,
 					flags);
 
-		if (vif->meta[npo.meta_cons].gso_size && !vif->gso_prefix) {
+		if ((1 << vif->meta[npo.meta_cons].gso_type) &
+		    vif->gso_mask) {
 			struct xen_netif_extra_info *gso =
 				(struct xen_netif_extra_info *)
 				RING_GET_RESPONSE(&vif->rx,
@@ -634,8 +662,8 @@ void xenvif_rx_action(struct xenvif *vif)
 
 			resp->flags |= XEN_NETRXF_extra_info;
 
+			gso->u.gso.type = vif->meta[npo.meta_cons].gso_type;
 			gso->u.gso.size = vif->meta[npo.meta_cons].gso_size;
-			gso->u.gso.type = XEN_NETIF_GSO_TYPE_TCPV4;
 			gso->u.gso.pad = 0;
 			gso->u.gso.features = 0;
 
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 02cb00b..f035899 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -577,15 +577,40 @@ static int connect_rings(struct backend_info *be)
 		val = 0;
 	vif->can_sg = !!val;
 
+	vif->gso_mask = 0;
+	vif->gso_prefix_mask = 0;
+
 	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4",
 			 "%d", &val) < 0)
 		val = 0;
-	vif->gso = !!val;
+	if (val)
+		vif->gso_mask |= GSO_BIT(TCPV4);
 
 	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4-prefix",
 			 "%d", &val) < 0)
 		val = 0;
-	vif->gso_prefix = !!val;
+	if (val)
+		vif->gso_prefix_mask |= GSO_BIT(TCPV4);
+
+	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv6",
+			 "%d", &val) < 0)
+		val = 0;
+	if (val)
+		vif->gso_mask |= GSO_BIT(TCPV6);
+
+	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv6-prefix",
+			 "%d", &val) < 0)
+		val = 0;
+	if (val)
+		vif->gso_prefix_mask |= GSO_BIT(TCPV6);
+
+	if (vif->gso_mask & vif->gso_prefix_mask) {
+		xenbus_dev_fatal(dev, err,
+				 "%s: gso and gso prefix flags are not "
+				 "mutually exclusive",
+				 dev->otherend);
+		return -EOPNOTSUPP;
+	}
 
 	if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-no-csum-offload",
 			 "%d", &val) < 0)
diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
index 5e766eb..c50061d 100644
--- a/include/xen/interface/io/netif.h
+++ b/include/xen/interface/io/netif.h
@@ -110,6 +110,7 @@ struct xen_netif_tx_request {
 #define  XEN_NETIF_EXTRA_FLAG_MORE	(1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
 
 /* GSO types */
+#define XEN_NETIF_GSO_TYPE_NONE		(0)
 #define XEN_NETIF_GSO_TYPE_TCPV4	(1)
 #define XEN_NETIF_GSO_TYPE_TCPV6	(2)
 
-- 
1.7.10.4

^ permalink raw reply related

* RE: [PATCH net-next v4 5/5] xen-netback: enable IPv6 TCP GSO to the guest
From: Paul Durrant @ 2013-10-16 16:52 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel@lists.xen.org, netdev@vger.kernel.org, Wei Liu,
	David Vrabel
In-Reply-To: <1381942156.30409.19.camel@kazak.uk.xensource.com>

> -----Original Message-----
> From: Ian Campbell
> Sent: 16 October 2013 17:49
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Wei Liu; David Vrabel
> Subject: Re: [PATCH net-next v4 5/5] xen-netback: enable IPv6 TCP GSO to
> the guest
> 
> On Fri, 2013-10-11 at 16:06 +0100, Paul Durrant wrote:
> > This patch adds code to handle SKB_GSO_TCPV6 skbs and construct
> appropriate
> > extra or prefix segments to pass the large packet to the frontend. New
> > xenstore flags, feature-gso-tcpv6 and feature-gso-tcpv6-prefix, are
> sampled
> > to determine if the frontend is capable of handling such packets.
> 
> IIRC the reason we have feature-gso-tcpv4 and feature-gso-tcpv4-prefix
> is that the former did things in a way which Windows couldn't cope with.
> I assuming that is true for v6 too. But could Linux cope with the prefix
> version too for v6 and reduce the number of options? Or is the
> non-prefix variant actually better, if the guest can manage, for some
> reason?
>

The non-prefix variant actually conveys type information and so will be better for frontends that don't have to re-parse the headers anyway.

  Paul
 
> I suppose in the end its all piggybacking off the v4 code paths so
> supporting both isn't a hardship.
> 


^ permalink raw reply

* RE: [Xen-devel] [PATCH net-next v4 0/5] xen-netback: IPv6 offload support
From: Paul Durrant @ 2013-10-16 16:53 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel@lists.xen.org, netdev@vger.kernel.org
In-Reply-To: <1381940401.30409.9.camel@kazak.uk.xensource.com>

> -----Original Message-----
> From: Ian Campbell
> Sent: 16 October 2013 17:20
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org
> Subject: Re: [Xen-devel] [PATCH net-next v4 0/5] xen-netback: IPv6 offload
> support
> 
> On Fri, 2013-10-11 at 16:06 +0100, Paul Durrant wrote:
> > This patch series adds support for checksum and large packet offloads into
> > xen-netback.
> > Testing has mainly been done using the Microsoft network hardware
> > certification suite running in Server 2008R2 VMs with Citrix PV frontends.
> 
> Are there any Linux netfront patches in existence/the pipeline to take
> advantage of this?
> 

I was waiting for the backend patches to be accepted first ;-)

  Paul

> >
> > v2:
> > - Fixed Wei's email address in Cc lines
> >
> > v3:
> > - Responded to Wei's comments:
> >  - netif.h now updated with comments and a definition of
> >    XEN_NETIF_GSO_TYPE_NONE.
> >  - limited number of pullups
> > - Responded to Annie's comments:
> >  - New GSO_BIT macro
> >
> > v4:
> > - Responded to more of Wei's comments
> > - Remove parsing of IPv6 fragment header and added warning
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel
> 


^ permalink raw reply

* RE: [Xen-devel] [PATCH net-next v4 2/5] xen-netback: add support for IPv6 checksum offload from guest
From: Paul Durrant @ 2013-10-16 16:57 UTC (permalink / raw)
  To: Ian Campbell
  Cc: David Vrabel, Wei Liu, netdev@vger.kernel.org,
	xen-devel@lists.xen.org
In-Reply-To: <1381939919.30409.4.camel@kazak.uk.xensource.com>

> -----Original Message-----
> From: Ian Campbell
> Sent: 16 October 2013 17:12
> To: Paul Durrant
> Cc: David Vrabel; Wei Liu; netdev@vger.kernel.org; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH net-next v4 2/5] xen-netback: add support
> for IPv6 checksum offload from guest
> 
> On Mon, 2013-10-14 at 13:34 +0100, Paul Durrant wrote:
> > > -----Original Message-----
> > > From: David Vrabel
> > > Sent: 14 October 2013 13:19
> > > To: Wei Liu
> > > Cc: Paul Durrant; netdev@vger.kernel.org; Ian Campbell; David Vrabel;
> xen-
> > > devel@lists.xen.org
> > > Subject: Re: [Xen-devel] [PATCH net-next v4 2/5] xen-netback: add
> support
> > > for IPv6 checksum offload from guest
> > >
> > > On 14/10/13 11:55, Wei Liu wrote:
> > > > On Mon, Oct 14, 2013 at 11:49:20AM +0100, Paul Durrant wrote:
> > > >>> -----Original Message-----
> > > >>> From: Wei Liu [mailto:wei.liu2@citrix.com]
> > > >>> Sent: 14 October 2013 11:43
> > > >>> To: Paul Durrant
> > > >>> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Wei Liu; David
> > > Vrabel;
> > > >>> Ian Campbell
> > > >>> Subject: Re: [PATCH net-next v4 2/5] xen-netback: add support for
> IPv6
> > > >>> checksum offload from guest
> > > >>>
> > > >>> On Fri, Oct 11, 2013 at 04:06:19PM +0100, Paul Durrant wrote:
> > > >>> [...]
> > > >>>> -/*
> > > >>>> - * This is the amount of packet we copy rather than map, so that
> the
> > > >>>> - * guest can't fiddle with the contents of the headers while we do
> > > >>>> - * packet processing on them (netfilter, routing, etc).
> > > >>>> +/* This is a miniumum size for the linear area to avoid lots of
> > > >>>> + * calls to __pskb_pull_tail() as we set up checksum offsets.
> > > >>>>   */
> > > >>>
> > > >>> You seem to forget to explain why 128 is chosen. :-)
> > > >>
> > > >> Is that not sufficient explanation? What sort of thing are you looking
> for?
> > > >>
> > > >
> > > >>From the second version of this patch, we had a conversation.
> > > >
> > > >> Where does 128 come from?
> > > >>
> > > >
> > > > "It's just an arbitrary power of 2 that was chosen because it seems to
> > > > cover most likely v6 headers and all v4 headers."
> > > >
> > > > So something like: "We choose 128 which is likely to cover most V6
> > > > headers and all V4 headers" would be sufficeint.
> > >
> > > Is "most IPv6 headers" actually good enough?  Don't we need to ensure
> > > netback copies all IP headers?
> > >
> >
> > It will do if checksum offload is in use, but perhaps the pull as far
> > as the transport header needs to be done anyway? I'm unsure of the
> > expectations of other code.
> 
> I've always been under the impression that transport headers needed
> pulling up too, for the benefit of netfilter perhaps?
> 
> AIUI the frags should be pure "payload". I may be wrong about that
> though...
> 

I don't believe there is actually any upper bound on IPv6 header size so coming up with a static value would be hard. Do all h/w drivers really have to parse headers (assuming they're driving dumb h/w that doesn't do header payload split)?

  Paul

^ permalink raw reply

* Re: [Xen-devel] [PATCH net-next v4 0/5] xen-netback: IPv6 offload support
From: Ian Campbell @ 2013-10-16 17:00 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel@lists.xen.org, netdev@vger.kernel.org
In-Reply-To: <9AAE0902D5BC7E449B7C8E4E778ABCD013B267@AMSPEX01CL01.citrite.net>

On Wed, 2013-10-16 at 17:53 +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Ian Campbell
> > Sent: 16 October 2013 17:20
> > To: Paul Durrant
> > Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org
> > Subject: Re: [Xen-devel] [PATCH net-next v4 0/5] xen-netback: IPv6 offload
> > support
> > 
> > On Fri, 2013-10-11 at 16:06 +0100, Paul Durrant wrote:
> > > This patch series adds support for checksum and large packet offloads into
> > > xen-netback.
> > > Testing has mainly been done using the Microsoft network hardware
> > > certification suite running in Server 2008R2 VMs with Citrix PV frontends.
> > 
> > Are there any Linux netfront patches in existence/the pipeline to take
> > advantage of this?
> > 
> 
> I was waiting for the backend patches to be accepted first ;-)

I think it would be useful to get et least an RFC so others can try it
etc.

Ian.

^ permalink raw reply

* Re: [PATCH net-next v4 5/5] xen-netback: enable IPv6 TCP GSO to the guest
From: Ian Campbell @ 2013-10-16 17:01 UTC (permalink / raw)
  To: Paul Durrant
  Cc: xen-devel@lists.xen.org, netdev@vger.kernel.org, Wei Liu,
	David Vrabel
In-Reply-To: <9AAE0902D5BC7E449B7C8E4E778ABCD013B24B@AMSPEX01CL01.citrite.net>

On Wed, 2013-10-16 at 17:52 +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Ian Campbell
> > Sent: 16 October 2013 17:49
> > To: Paul Durrant
> > Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Wei Liu; David Vrabel
> > Subject: Re: [PATCH net-next v4 5/5] xen-netback: enable IPv6 TCP GSO to
> > the guest
> > 
> > On Fri, 2013-10-11 at 16:06 +0100, Paul Durrant wrote:
> > > This patch adds code to handle SKB_GSO_TCPV6 skbs and construct
> > appropriate
> > > extra or prefix segments to pass the large packet to the frontend. New
> > > xenstore flags, feature-gso-tcpv6 and feature-gso-tcpv6-prefix, are
> > sampled
> > > to determine if the frontend is capable of handling such packets.
> > 
> > IIRC the reason we have feature-gso-tcpv4 and feature-gso-tcpv4-prefix
> > is that the former did things in a way which Windows couldn't cope with.
> > I assuming that is true for v6 too. But could Linux cope with the prefix
> > version too for v6 and reduce the number of options? Or is the
> > non-prefix variant actually better, if the guest can manage, for some
> > reason?
> >
> 
> The non-prefix variant actually conveys type information and so will
> be better for frontends that don't have to re-parse the headers
> anyway.

Make sense, thanks.

> 
>   Paul
>  
> > I suppose in the end its all piggybacking off the v4 code paths so
> > supporting both isn't a hardship.
> > 
> 

^ permalink raw reply

* Re: PROBLEM: GRE forwarding not working with 3.10.x, WCCPv2 and Cisco 7200 Router showing IP0 bad-hlen 4 in tcpdump
From: Eric Dumazet @ 2013-10-16 17:03 UTC (permalink / raw)
  To: Peter Schmitt; +Cc: netdev@vger.kernel.org, Pravin B Shelar
In-Reply-To: <1381912513.57737.YahooMailNeo@web172002.mail.ir2.yahoo.com>

On Wed, 2013-10-16 at 09:35 +0100, Peter Schmitt wrote:
> Hi,
> 
> I have a setup with the current kernel (3.10.16) and can't get WCCPv2 to work using GRE since kernel 3.10.x. With 3.9.x everything was working fine.
> 
> My setup is simple with three boxes included:
> 
> Client (10.111.111.222/24) -- (10.111.111.1/24) Cisco 7200 (192.168.180.113/24) -- Internet Gateway
>                                             (192.168.234.1/30)
>                                                     |
>                                                 WCCPv2
>                                                     |
>                                             (192.168.234.2/30)
>                                    Caching Box with Squid 2.7STABLE9 WCCPv2
> 
> 
> On the caching box, I have the interfaces:
> 6: eth3 inet 192.168.234.2/30 scope global eth3\ valid_lft forever preferred_lft forever
> 22: wccp104102 inet 192.168.234.2/32 scope global wccp104102\ valid_lft forever preferred_lft forever 
> 
> The WCCP handshake is established and I can see messages on the Cisco Router:
> *Oct 14 10:44:09.487: %WCCP-5-SERVICEFOUND: Service 80 acquired on WCCP client 192.168.234.2
> *Oct 14 10:44:09.495: %WCCP-5-SERVICEFOUND: Service 90 acquired on WCCP client 192.168.234.2
> 
> When the client now starts a request, it gets connection timeouts:
> >wget google.com
> --2013-10-14 13:53:30--  http://google.com/
> Resolving google.com (google.com)... 173.194.70.113, 173.194.70.100, 173.194.70.138, ...
> Connecting to google.com (google.com)|173.194.70.113|:80... failed: Connection timed out.
> Connecting to google.com (google.com)|173.194.70.100|:80... failed: Connection timed out.
> ...
> 
> A tcpdump on the Caching Box reveals the following:
> > tcpdump -i eth3 -n proto gre                                                                                                                                                                
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth3, link-type EN10MB (Ethernet), capture size 96 bytes
> 13:53:35.686377 IP 192.168.234.1 > 192.168.234.2: GREv0, length 68: gre-proto-0x883e
> 13:53:36.678849 IP 192.168.234.1 > 192.168.234.2: GREv0, length 68: gre-proto-0x883e
> 13:53:38.682782 IP 192.168.234.1 > 192.168.234.2: GREv0, length 68: gre-proto-0x883e
> 13:53:42.690670 IP 192.168.234.1 > 192.168.234.2: GREv0, length 68: gre-proto-0x883e
> 13:53:50.714444 IP 192.168.234.1 > 192.168.234.2: GREv0, length 68: gre-proto-0x883e
> 13:54:06.745979 IP 192.168.234.1 > 192.168.234.2: GREv0, length 68: gre-proto-0x883e
> 13:54:38.813539 IP 192.168.234.1 > 192.168.234.2: GREv0, length 68: gre-proto-0x883e
> 13:54:39.808525 IP 192.168.234.1 > 192.168.234.2: GREv0, length 68: gre-proto-0x883e
> 13:54:41.812964 IP 192.168.234.1 > 192.168.234.2: GREv0, length 68: gre-proto-0x883e
> 13:54:45.816337 IP 192.168.234.1 > 192.168.234.2: GREv0, length 68: gre-proto-0x883e 
> 
> > tcpdump -i wccp104102 -n                                                                                                                                                                    
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on wccp104102, link-type LINUX_SLL (Linux cooked), capture size 96 bytes
> 13:53:35.686377 IP0 bad-hlen 4
> 13:53:36.678849 IP0 bad-hlen 4
> 13:53:38.682782 IP0 bad-hlen 4
> 13:53:42.690670 IP0 bad-hlen 4
> 13:53:50.714444 IP0 bad-hlen 4
> 13:54:06.745979 IP0 bad-hlen 4
> 13:54:38.813539 IP0 bad-hlen 4
> 13:54:39.808525 IP0 bad-hlen 4
> 13:54:41.812964 IP0 bad-hlen 4
> 13:54:45.816337 IP0 bad-hlen 4 
> So I get correct GRE packets on the eth3 interface, but only bad-hlen messages from the GRE interface.
> 
> 
> ver_linux from the Caching box:
> 
> Linux cache 3.10.16-x86 #1 SMP Mon Oct 14 06:33:21 CEST 2013 i686 GNU/Linux
> Gnu C                       4.4.3
> Gnu make                 3.81
> binutils                      2.20.1
> util-linux                    2.17.2
> mount                       2.15.1-rc1
> module-init-tools        3.11.1
> e2fsprogs                  1.42.7
> PPP                         2.4.5
> Linux C Library          2.11.1
> Dynamic linker (ldd)   2.11.1
> Procps                     3.2.8
> Net-tools                  1.60
> Kbd                         1.15
> Sh-utils                    7.4
> Modules Loaded       xt_TPROXY nf_tproxy_core xt_set xt_socket nf_defrag_ipv6 xt_REDIRECT ip_set_hash_ip hwmon_vid hwmon bridge ip_set iptable_nat nf_nat_ipv4 ipt_REJECT nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ftp ip_gre gre bonding pcspkr uhci_hcd ehci_pci ehci_hcd lpc_ich mfd_core pata_acpi ata_generic
> 
> 
> Has anybody an idea of what's going wrong here? If you need any further information or some pcap files, I will surely provide them.
> 
> Thanks a lot for your attention!


3.10 is buggy (for ETH_P_WCCP support I mean), 3.11 has the probable
fix :

commit 3d7b46cd20e300bd6989fb1f43d46f1b9645816e
("ip_tunnel: push generic protocol handling to ip_tunnel module.")

The problem is 3.10 do not pull the extra 4 bytes of WCCP

This is now properly done in iptunnel_pull_header()

^ permalink raw reply

* Re: [RFC] bridge and friends: reduce TheLinuxWay(tm)
From: Vlad Yasevich @ 2013-10-16 17:11 UTC (permalink / raw)
  To: Jamal Hadi Salim, Stephen Hemminger
  Cc: Stephen Hemminger, netdev@vger.kernel.org
In-Reply-To: <525E9AB1.6090502@mojatatu.com>

On 10/16/2013 09:54 AM, Jamal Hadi Salim wrote:
> On 10/14/13 17:41, Stephen Hemminger wrote:
>
>> Unfortunately, by now this is all set in ABI.
>> It was a side effect of the per-feature evolutionary style of
>> development.
>
> Sadly, I agree. This is the dark side of "have code will travel";
> you let these things out in the wild, they breed and you cant
> take them back.
> BTW: I dont think what i suggested will be a harmful refactoring because
> no existing interfaces are removed - your call.
>
> In similar vein:
> What is the motivation behind IFLA_EXT_MASK?

This was to display or filter out virtual function data.

> Could you not have used
> ifm ifindex to relay the interface of interest? Currently the ifindex is
> not used at all. IMO, the following interfaces are useful:
> - get attributes for all bridge ports (this is there)
> - get attributes for bridge interface XXX; there using IFLA_EXT_MASK
> I think it should be using ifm->ifindex

I probably doesn't need to use this as we want the bridge data, not the 
VF data stored as part of PF interface.

> - get attributes for all bridge ports for bridge br-blah (not there)
> you could also use the ifindex of br-blah here instead

This would be usefull.


>
> Separate issue:
> To provide equivalence to brctl:
> - PF_BRIDGE should allow me to attach a bridge port to bridge of choice
> with SETLINK

You can already do this with:

	ip link set dev ethX master brX

I know, not very intuative, but it's there :(

-vlad



>
> cheers,
> jamal
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox