Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [net-next 00/11][pull request] Intel Wired LAN Driver Updates
From: David Miller @ 2013-10-26  4:30 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, sassmann
In-Reply-To: <1382628458-26601-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 24 Oct 2013 08:27:27 -0700

> This series contains updates to igb, igbvf, i40e, ixgbe and ixgbevf.

Pulled, thanks a lot Jeff.

^ permalink raw reply

* Re: [PATCH net-next] macsonic: Updated printk() statement to netdev_info function
From: Joe Perches @ 2013-10-26  4:26 UTC (permalink / raw)
  To: Matt Zanchelli; +Cc: netdev
In-Reply-To: <1382738000-24202-1-git-send-email-zanchm@rpi.edu>

On Fri, 2013-10-25 at 21:53 +0000, Matt Zanchelli wrote:
> Updated the printk() statement in mac_sonic_probe() to recommended netdev_info.

Hi again Matt.

> Reviewed-by: Mukkai Krishnamoorthy <mskmoorthy@gmail.com>
> Reviewed-by: Maxwell Ensley-Field <mensleyfield@gmail.com>
> Reviewed-by: Nicole Negedly <nnegedly@gmail.com>
> Reviewed-by: Daniel Felizardo <danfelizardo@gmail.com>
> Signed-off-by: Matt Zanchelli <zanchm@rpi.edu>

Does it really take 4 reviewers for a single line change?

I'd rather have each of these reviewers submit patches to
different files rather than have them review this pretty
trivial change to a pretty old driver that really doesn't
need much in the way of changes.

And please, if you modify this file all, please be more
comprehensive in the printk conversions.

There are many printks in this file, not just this one.

> diff --git a/drivers/net/ethernet/natsemi/macsonic.c b/drivers/net/ethernet/natsemi/macsonic.c
[]
> @@ -601,7 +601,7 @@ found:
>  	if (err)
>  		goto out;
>  
> -	printk("%s: MAC %pM IRQ %d\n", dev->name, dev->dev_addr, dev->irq);
> +	netdev_info(dev, "MAC %pM IRQ %d\n", dev->dev_addr, dev->irq);
>  
>  	return 0;

Something like:

 drivers/net/ethernet/natsemi/macsonic.c | 66 +++++++++++++++------------------
 drivers/net/ethernet/natsemi/sonic.h    |  3 +-
 2 files changed, 30 insertions(+), 39 deletions(-)

diff --git a/drivers/net/ethernet/natsemi/macsonic.c b/drivers/net/ethernet/natsemi/macsonic.c
index 346a4e0..bc33181 100644
--- a/drivers/net/ethernet/natsemi/macsonic.c
+++ b/drivers/net/ethernet/natsemi/macsonic.c
@@ -31,6 +31,8 @@
  *          on centris.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/types.h>
@@ -80,8 +82,6 @@ static unsigned int sonic_debug = SONIC_DEBUG;
 static unsigned int sonic_debug = 1;
 #endif
 
-static int sonic_version_printed;
-
 /* For onboard SONIC */
 #define ONBOARD_SONIC_REGISTERS	0x50F0A000
 #define ONBOARD_SONIC_PROM_BASE	0x50f08000
@@ -143,8 +143,7 @@ static int macsonic_open(struct net_device* dev)
 
 	retval = request_irq(dev->irq, sonic_interrupt, 0, "sonic", dev);
 	if (retval) {
-		printk(KERN_ERR "%s: unable to get IRQ %d.\n",
-				dev->name, dev->irq);
+		netdev_err(dev, "unable to get IRQ %d\n", dev->irq);
 		goto err;
 	}
 	/* Under the A/UX interrupt scheme, the onboard SONIC interrupt comes
@@ -155,8 +154,7 @@ static int macsonic_open(struct net_device* dev)
 		retval = request_irq(IRQ_NUBUS_9, macsonic_interrupt, 0,
 				     "sonic", dev);
 		if (retval) {
-			printk(KERN_ERR "%s: unable to get IRQ %d.\n",
-					dev->name, IRQ_NUBUS_9);
+			netdev_err(dev, "unable to get IRQ %d\n", IRQ_NUBUS_9);
 			goto err_irq;
 		}
 	}
@@ -277,11 +275,9 @@ static void mac_onboard_sonic_ethernet_addr(struct net_device *dev)
 		 * If we still have what seems to be a bogus address, we'll
 		 * look in the CAM. The top entry should be ours.
 		 */
-		printk(KERN_WARNING "macsonic: MAC address in PROM seems "
-		                    "to be invalid, trying CAM\n");
+		netdev_warn(dev, "MAC address in PROM seems to be invalid, trying CAM\n");
 	} else {
-		printk(KERN_WARNING "macsonic: cannot read MAC address from "
-		                    "PROM, trying CAM\n");
+		netdev_warn(dev, "cannot read MAC address from PROM, trying CAM\n");
 	}
 
 	/* This only works if MacOS has already initialized the card. */
@@ -304,8 +300,7 @@ static void mac_onboard_sonic_ethernet_addr(struct net_device *dev)
 
 	/* Still nonsense ... messed up someplace! */
 
-	printk(KERN_WARNING "macsonic: MAC address in CAM entry 15 "
-	                    "seems invalid, will use a random MAC\n");
+	netdev_warn(dev, "MAC address in CAM entry 15 seems invalid, will use a random MAC\n");
 	eth_hw_addr_random(dev);
 }
 
@@ -318,7 +313,7 @@ static int mac_onboard_sonic_probe(struct net_device *dev)
 	if (!MACH_IS_MAC)
 		return -ENODEV;
 
-	printk(KERN_INFO "Checking for internal Macintosh ethernet (SONIC).. ");
+	netdev_info(dev, "Checking for internal Macintosh ethernet (SONIC).. ");
 
 	/* Bogus probing, on the models which may or may not have
 	   Ethernet (BTW, the Ethernet *is* always at the same
@@ -336,7 +331,7 @@ static int mac_onboard_sonic_probe(struct net_device *dev)
 		local_irq_restore(flags);
 
 		if (!card_present) {
-			printk("none.\n");
+			printk("none\n");
 			return -ENODEV;
 		}
 		commslot = 1;
@@ -352,12 +347,10 @@ static int mac_onboard_sonic_probe(struct net_device *dev)
 	else
 		dev->irq = IRQ_NUBUS_9;
 
-	if (!sonic_version_printed) {
-		printk(KERN_INFO "%s", version);
-		sonic_version_printed = 1;
-	}
-	printk(KERN_INFO "%s: onboard / comm-slot SONIC at 0x%08lx\n",
-	       dev_name(lp->device), dev->base_addr);
+	pr_info_once("%s\n", version);
+
+	netdev_info(dev, "onboard / comm-slot SONIC at 0x%08lx\n",
+		    dev->base_addr);
 
 	/* The PowerBook's SONIC is 16 bit always. */
 	if (macintosh_config->ident == MAC_MODEL_PB520) {
@@ -388,13 +381,13 @@ static int mac_onboard_sonic_probe(struct net_device *dev)
 		lp->dma_bitmode = SONIC_BITMODE32;
 		sr = SONIC_READ(SONIC_SR);
 	}
-	printk(KERN_INFO
-	       "%s: revision 0x%04x, using %d bit DMA and register offset %d\n",
-	       dev_name(lp->device), sr, lp->dma_bitmode?32:16, lp->reg_offset);
+	netdev_info(dev, "revision 0x%04x, using %d bit DMA and register offset %d\n",
+		    sr, lp->dma_bitmode ? 32 : 16, lp->reg_offset);
 
 #if 0 /* This is sometimes useful to find out how MacOS configured the card. */
-	printk(KERN_INFO "%s: DCR: 0x%04x, DCR2: 0x%04x\n", dev_name(lp->device),
-	       SONIC_READ(SONIC_DCR) & 0xffff, SONIC_READ(SONIC_DCR2) & 0xffff);
+	netdev_info(dev, "DCR: 0x%04x, DCR2: 0x%04x\n",
+		    SONIC_READ(SONIC_DCR) & 0xffff,
+		    SONIC_READ(SONIC_DCR2) & 0xffff);
 #endif
 
 	/* Software reset, then initialize control registers. */
@@ -527,7 +520,7 @@ static int mac_nubus_sonic_probe(struct net_device *dev)
 		dma_bitmode = SONIC_BITMODE16;
 		break;
 	default:
-		printk(KERN_ERR "macsonic: WTF, id is %d\n", id);
+		netdev_err(dev, "WTF, id is %d\n", id);
 		return -ENODEV;
 	}
 
@@ -538,18 +531,17 @@ static int mac_nubus_sonic_probe(struct net_device *dev)
 	lp->dma_bitmode = dma_bitmode;
 	dev->irq = SLOT2IRQ(ndev->board->slot);
 
-	if (!sonic_version_printed) {
-		printk(KERN_INFO "%s", version);
-		sonic_version_printed = 1;
-	}
-	printk(KERN_INFO "%s: %s in slot %X\n",
-	       dev_name(lp->device), ndev->board->name, ndev->board->slot);
-	printk(KERN_INFO "%s: revision 0x%04x, using %d bit DMA and register offset %d\n",
-	       dev_name(lp->device), SONIC_READ(SONIC_SR), dma_bitmode?32:16, reg_offset);
+	pr_info_once("%s\n", version);
+
+	netdev_info(dev, "%s in slot %X\n",
+		    ndev->board->name, ndev->board->slot);
+	netdev_info(dev, "revision 0x%04x, using %d bit DMA and register offset %d\n",
+		    SONIC_READ(SONIC_SR), dma_bitmode ? 32 : 16, reg_offset);
 
 #if 0 /* This is sometimes useful to find out how MacOS configured the card. */
-	printk(KERN_INFO "%s: DCR: 0x%04x, DCR2: 0x%04x\n", dev_name(lp->device),
-	       SONIC_READ(SONIC_DCR) & 0xffff, SONIC_READ(SONIC_DCR2) & 0xffff);
+	netdev_info(dev, "DCR: 0x%04x, DCR2: 0x%04x\n",
+		    SONIC_READ(SONIC_DCR) & 0xffff,
+		    SONIC_READ(SONIC_DCR2) & 0xffff);
 #endif
 
 	/* Software reset, then initialize control registers. */
@@ -601,7 +593,7 @@ found:
 	if (err)
 		goto out;
 
-	printk("%s: MAC %pM IRQ %d\n", dev->name, dev->dev_addr, dev->irq);
+	netdev_info(dev, "MAC %pM IRQ %d\n", dev->dev_addr, dev->irq);
 
 	return 0;
 
diff --git a/drivers/net/ethernet/natsemi/sonic.h b/drivers/net/ethernet/natsemi/sonic.h
index 07091dd..6d31916 100644
--- a/drivers/net/ethernet/natsemi/sonic.h
+++ b/drivers/net/ethernet/natsemi/sonic.h
@@ -444,7 +444,6 @@ static inline __u16 sonic_rra_get(struct net_device* dev, int entry,
 			     (entry * SIZEOF_SONIC_RR) + offset);
 }
 
-static const char *version =
-    "sonic.c:v0.92 20.9.98 tsbogend@alpha.franken.de\n";
+static const char *version = "v0.92 20.9.98 tsbogend@alpha.franken.de";
 
 #endif /* SONIC_H */

^ permalink raw reply related

* Re: [PATCH net 1/2] qlcnic: Do not force adapter to perform LRO without destination IP check
From: Eric Dumazet @ 2013-10-26  4:17 UTC (permalink / raw)
  To: Shahed Shaikh; +Cc: davem, netdev, Dept_NX_Linux_NIC_Driver
In-Reply-To: <1382711917-28501-2-git-send-email-shahed.shaikh@qlogic.com>

On Fri, 2013-10-25 at 10:38 -0400, Shahed Shaikh wrote:
> From: Shahed Shaikh <shahed.shaikh@qlogic.com>
> 
> Forcing adapter to perform LRO without destination IP check
> degrades the performance.


Hmm... the performance, or does it allow two packets belonging to
different flows being wrongly aggregated ?

It looks a critical bug fix, not only a performance issue.

^ permalink raw reply

* Hi,
From: John Reynolds @ 2013-10-26  4:15 UTC (permalink / raw)
  To: netdev

Hi,
Is there a userspace API that can be used to determine if a network interface is a 802.1Q interface, and what its vlan id and parent interface is. it appears that SOICGIFPFLAGS is not supported and netdev_priv() is only available to drivers, or do I have to resort to parsing the /proc/net/* files ?

regards

^ permalink raw reply

* Re: [PATCH net-next] veth: extend features to support tunneling
From: Eric Dumazet @ 2013-10-26  4:13 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: David Miller, Eric Dumazet, stephen, netdev
In-Reply-To: <CAMEtUuwgDpLFMj0DNx+c4QZz573rgmLiNcZOmHgou2A3684Shw@mail.gmail.com>

On Fri, 2013-10-25 at 19:22 -0700, Alexei Starovoitov wrote:
> On Fri, Oct 25, 2013 at 6:25 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > While investigating on a recent vxlan regression, I found veth
> > was using a zero features set for vxlan tunnels.
> 
> oneliner can be better :)

Yes, I'll post the gso fix as well, of course ;)

> 
> > We have to segment GSO frames, copy the payload, and do the checksum.
> >
> > This patch brings a ~200% performance increase
> >
> > We probably have to add hw_enc_features support
> > on other virtual devices.
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: Alexei Starovoitov <ast@plumgrid.com>
> > ---
> 
> iperf over veth with gre/vxlan tunneling is now ~4Gbps. 20x gain as advertised.
> Thanks!

Wow, such a difference might show another bug somewhere else...

Thanks !

^ permalink raw reply

* Re: [PATCH net 0/2] qlcnic: Bug fixes
From: David Miller @ 2013-10-26  4:05 UTC (permalink / raw)
  To: shahed.shaikh; +Cc: netdev, Dept_NX_Linux_NIC_Driver
In-Reply-To: <1382711917-28501-1-git-send-email-shahed.shaikh@qlogic.com>

From: Shahed Shaikh <shahed.shaikh@qlogic.com>
Date: Fri, 25 Oct 2013 10:38:35 -0400

> From: Shahed Shaikh <shahed.shaikh@qlogic.com>
> 
> This patch series contains following fixes-
> * Performace drop because driver was forcing adapter not to check
>   destination IP for LRO.
> * driver was not issuing qlcnic_fw_cmd_set_drv_version() to 83xx adapter
>   becasue of improper handling of QLCNIC_FW_CAPABILITY_MORE_CAPS bit.
> 
> Please apply to net.

Applied, what exactly does that destination IP check do?

^ permalink raw reply

* Re: [PATCH 1/4] sctp: merge two if statements to one
From: wangweidong @ 2013-10-26  2:59 UTC (permalink / raw)
  To: Vlad Yasevich, davem, nhorman; +Cc: dingtianhong, linux-sctp, netdev
In-Reply-To: <526B016D.6030301@gmail.com>

On 2013/10/26 7:40, Vlad Yasevich wrote:
> On 10/24/2013 09:50 PM, Wang Weidong wrote:
>> Two if statements do the same work, maybe we can merge them to
>> one. There is just code simplification, no functional changes.
>>
>> Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
>> ---
>>   net/sctp/auth.c | 12 ++++--------
>>   1 file changed, 4 insertions(+), 8 deletions(-)
>>
>> diff --git a/net/sctp/auth.c b/net/sctp/auth.c
>> index 8c4fa5d..19fb0ae 100644
>> --- a/net/sctp/auth.c
>> +++ b/net/sctp/auth.c
>> @@ -539,18 +539,14 @@ struct sctp_hmac *sctp_auth_asoc_get_hmac(const struct sctp_association *asoc)
>>       for (i = 0; i < n_elt; i++) {
>>           id = ntohs(hmacs->hmac_ids[i]);
>>
>> -        /* Check the id is in the supported range */
>> -        if (id > SCTP_AUTH_HMAC_ID_MAX) {
>> -            id = 0;
>> -            continue;
>> -        }
>> -
>> -        /* See is we support the id.  Supported IDs have name and
>> +        /* Check the id is in the supported range. And
>> +         * see is we support the id.  Supported IDs have name and
>>            * length fields set, so that we can allocated and use
>>            * them.  We can safely just check for name, for without the
>>            * name, we can't allocate the TFM.
>>                */
>> -        if (!sctp_hmac_list[id].hmac_name) {
>> +        if (id > SCTP_AUTH_HMAC_ID_MAX ||
>> +            !sctp_hmac_list[id].hmac_name) {
> 
> Can you please make the 2 parts of the 'if' statement above line up
> with each other instead of the code below.  I makes it easy to see what
> the whole 'if conditional' is.
> 
> Thanks
> -vlad
> 

Ok, I will resend a new version.
Thanks. 

>>               id = 0;
>>               continue;
>>           }
>>
> 
> 
> 

^ permalink raw reply

* Re: [PATCH 1/4] sctp: merge two if statements to one
From: wangweidong @ 2013-10-26  2:55 UTC (permalink / raw)
  To: Sergei Shtylyov, davem, nhorman, vyasevich
  Cc: dingtianhong, linux-sctp, netdev
In-Reply-To: <526A6C45.3080200@cogentembedded.com>

On 2013/10/25 21:04, Sergei Shtylyov wrote:
> Hello.
> 
> On 25-10-2013 5:50, Wang Weidong wrote:
> 
>> Two if statements do the same work, maybe we can merge them to
>> one. There is just code simplification, no functional changes.
> 
>> Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
>> ---
>>   net/sctp/auth.c | 12 ++++--------
>>   1 file changed, 4 insertions(+), 8 deletions(-)
> 
>    I understand what I noticed below is not your typos but maybe it's time to fix them?

Yeah, I will fix them.
Thanks.

> 
>> diff --git a/net/sctp/auth.c b/net/sctp/auth.c
>> index 8c4fa5d..19fb0ae 100644
>> --- a/net/sctp/auth.c
>> +++ b/net/sctp/auth.c
>> @@ -539,18 +539,14 @@ struct sctp_hmac *sctp_auth_asoc_get_hmac(const struct sctp_association *asoc)
>>       for (i = 0; i < n_elt; i++) {
>>           id = ntohs(hmacs->hmac_ids[i]);
>>
>> -        /* Check the id is in the supported range */
>> -        if (id > SCTP_AUTH_HMAC_ID_MAX) {
>> -            id = 0;
>> -            continue;
>> -        }
>> -
>> -        /* See is we support the id.  Supported IDs have name and
>> +        /* Check the id is in the supported range. And
>> +         * see is we support the id.  Supported IDs have name and
> 
>    s/is/if/.
> 
>>            * length fields set, so that we can allocated and use
> 
>    s/allocated/allocate/.
> 
> WBR, Sergei
> 
> 
> 

^ permalink raw reply

* Re: [PATCH net-next] veth: extend features to support tunneling
From: Alexei Starovoitov @ 2013-10-26  2:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, Eric Dumazet, stephen, netdev
In-Reply-To: <1382750703.4032.32.camel@edumazet-glaptop.roam.corp.google.com>

On Fri, Oct 25, 2013 at 6:25 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> While investigating on a recent vxlan regression, I found veth
> was using a zero features set for vxlan tunnels.

oneliner can be better :)

> We have to segment GSO frames, copy the payload, and do the checksum.
>
> This patch brings a ~200% performance increase
>
> We probably have to add hw_enc_features support
> on other virtual devices.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> ---

iperf over veth with gre/vxlan tunneling is now ~4Gbps. 20x gain as advertised.
Thanks!

^ permalink raw reply

* Re: [PATCH net-next] tcp: gso: fix truesize tracking
From: Alexei Starovoitov @ 2013-10-26  1:46 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, Eric Dumazet, stephen, netdev
In-Reply-To: <1382747177.4032.21.camel@edumazet-glaptop.roam.corp.google.com>

On Fri, Oct 25, 2013 at 5:26 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> commit 6ff50cd55545 ("tcp: gso: do not generate out of order packets")
> had an heuristic that can trigger a warning in skb_try_coalesce(),
> because skb->truesize of the gso segments were exactly set to mss.
>
> This breaks the requirement that
>
> skb->truesize >= skb->len + truesizeof(struct sk_buff);
>
> It can trivially be reproduced by :
>
> ifconfig lo mtu 1500
> ethtool -K lo tso off
> netperf
>
> As the skbs are looped into the TCP networking stack, skb_try_coalesce()
> warns us of these skb under-estimating their truesize.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Alexei Starovoitov <ast@plumgrid.com>
> ---

it fixed it in my setup. Thanks!

^ permalink raw reply

* [PATCH net-next] veth: extend features to support tunneling
From: Eric Dumazet @ 2013-10-26  1:25 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: David Miller, Eric Dumazet, stephen, netdev
In-Reply-To: <1382748742.4032.24.camel@edumazet-glaptop.roam.corp.google.com>

From: Eric Dumazet <edumazet@google.com>

While investigating on a recent vxlan regression, I found veth
was using a zero features set for vxlan tunnels.

We have to segment GSO frames, copy the payload, and do the checksum.

This patch brings a ~200% performance increase

We probably have to add hw_enc_features support
on other virtual devices.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
---
 drivers/net/veth.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index b2d0347..b24db7a 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -261,6 +261,8 @@ static const struct net_device_ops veth_netdev_ops = {
 
 #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_ALL_TSO |    \
 		       NETIF_F_HW_CSUM | NETIF_F_RXCSUM | NETIF_F_HIGHDMA | \
+		       NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL |	    \
+		       NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT | NETIF_F_UFO	|   \
 		       NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX | \
 		       NETIF_F_HW_VLAN_STAG_TX | NETIF_F_HW_VLAN_STAG_RX )
 
@@ -279,6 +281,7 @@ static void veth_setup(struct net_device *dev)
 	dev->destructor = veth_dev_free;
 
 	dev->hw_features = VETH_FEATURES;
+	dev->hw_enc_features = VETH_FEATURES;
 }
 
 /*

^ permalink raw reply related

* Re: vxlan gso is broken by stackable gso_segment()
From: Eric Dumazet @ 2013-10-26  0:52 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: David Miller, Eric Dumazet, stephen, netdev
In-Reply-To: <1382743502.4032.6.camel@edumazet-glaptop.roam.corp.google.com>

On Fri, 2013-10-25 at 16:25 -0700, Eric Dumazet wrote:

> Please note the original performance is not that good, you mentioned 230
> Mbps on lxc, while I get more than 5Gb/s on a 10G link.
> 
> This should be investigated ...

This is probably trivial to increase performance :

veth currently do not support any kind of tunneling TSO :

tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]

I'll submit a patch for net-next

^ permalink raw reply

* [PATCH net-next] tcp: gso: fix truesize tracking
From: Eric Dumazet @ 2013-10-26  0:26 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: David Miller, Eric Dumazet, stephen, netdev
In-Reply-To: <1382743502.4032.6.camel@edumazet-glaptop.roam.corp.google.com>

From: Eric Dumazet <edumazet@google.com>

commit 6ff50cd55545 ("tcp: gso: do not generate out of order packets")
had an heuristic that can trigger a warning in skb_try_coalesce(),
because skb->truesize of the gso segments were exactly set to mss.

This breaks the requirement that

skb->truesize >= skb->len + truesizeof(struct sk_buff);

It can trivially be reproduced by :

ifconfig lo mtu 1500
ethtool -K lo tso off
netperf

As the skbs are looped into the TCP networking stack, skb_try_coalesce()
warns us of these skb under-estimating their truesize.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Alexei Starovoitov <ast@plumgrid.com>
---
Based on net-next but applies as well on net tree with some fuzz.

David, we might backport this one to 3.12 and stable later, I let you
decide.

 net/ipv4/tcp_offload.c |   13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index a7a5583e..a2b68a1 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -18,6 +18,7 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 				netdev_features_t features)
 {
 	struct sk_buff *segs = ERR_PTR(-EINVAL);
+	unsigned int sum_truesize = 0;
 	struct tcphdr *th;
 	unsigned int thlen;
 	unsigned int seq;
@@ -104,13 +105,7 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 		if (copy_destructor) {
 			skb->destructor = gso_skb->destructor;
 			skb->sk = gso_skb->sk;
-			/* {tcp|sock}_wfree() use exact truesize accounting :
-			 * sum(skb->truesize) MUST be exactly be gso_skb->truesize
-			 * So we account mss bytes of 'true size' for each segment.
-			 * The last segment will contain the remaining.
-			 */
-			skb->truesize = mss;
-			gso_skb->truesize -= mss;
+			sum_truesize += skb->truesize;
 		}
 		skb = skb->next;
 		th = tcp_hdr(skb);
@@ -127,7 +122,9 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 	if (copy_destructor) {
 		swap(gso_skb->sk, skb->sk);
 		swap(gso_skb->destructor, skb->destructor);
-		swap(gso_skb->truesize, skb->truesize);
+		sum_truesize += skb->truesize;
+		atomic_add(sum_truesize - gso_skb->truesize,
+			   &skb->sk->sk_wmem_alloc);
 	}
 
 	delta = htonl(oldlen + (skb_tail_pointer(skb) -

^ permalink raw reply related

* Re: [PATCH 4/4] sctp: fix comment in chunk.c
From: Vlad Yasevich @ 2013-10-25 23:45 UTC (permalink / raw)
  To: Wang Weidong, davem, nhorman; +Cc: dingtianhong, linux-sctp, netdev
In-Reply-To: <1382665805-13952-5-git-send-email-wangweidong1@huawei.com>

On 10/24/2013 09:50 PM, Wang Weidong wrote:
> fix a spelling
>
> Signed-off-by: Wang Weidong <wangweidong1@huawei.com>

You could have combined these spelling fixes into 1 patch.

Acked-by: Vlad Yasevich <vyasevich@gmail.com>

-vlad

> ---
>   net/sctp/chunk.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
> index 7bd5ed4..f2044fc 100644
> --- a/net/sctp/chunk.c
> +++ b/net/sctp/chunk.c
> @@ -201,7 +201,7 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
>
>   	max = asoc->frag_point;
>   	/* If the the peer requested that we authenticate DATA chunks
> -	 * we need to accound for bundling of the AUTH chunks along with
> +	 * we need to account for bundling of the AUTH chunks along with
>   	 * DATA.
>   	 */
>   	if (sctp_auth_send_cid(SCTP_CID_DATA, asoc)) {
>

^ permalink raw reply

* Re: [PATCH 3/4] sctp: fix some comments in associola.c
From: Vlad Yasevich @ 2013-10-25 23:43 UTC (permalink / raw)
  To: Wang Weidong, davem, nhorman; +Cc: dingtianhong, linux-sctp, netdev
In-Reply-To: <1382665805-13952-4-git-send-email-wangweidong1@huawei.com>

On 10/24/2013 09:50 PM, Wang Weidong wrote:
> fix some spellings
>
> Signed-off-by: Wang Weidong <wangweidong1@huawei.com>

Acked-by: Vlad Yasevich <vyasevich@gmail.com>

-vlad

> ---
>   net/sctp/associola.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> index cef5099..c9b91cb 100644
> --- a/net/sctp/associola.c
> +++ b/net/sctp/associola.c
> @@ -602,7 +602,7 @@ void sctp_assoc_rm_peer(struct sctp_association *asoc,
>
>   		/* Start a T3 timer here in case it wasn't running so
>   		 * that these migrated packets have a chance to get
> -		 * retrnasmitted.
> +		 * retransmitted.
>   		 */
>   		if (!timer_pending(&active->T3_rtx_timer))
>   			if (!mod_timer(&active->T3_rtx_timer,
> @@ -665,7 +665,7 @@ struct sctp_transport *sctp_assoc_add_peer(struct sctp_association *asoc,
>   	/* Set the path max_retrans.  */
>   	peer->pathmaxrxt = asoc->pathmaxrxt;
>
> -	/* And the partial failure retrnas threshold */
> +	/* And the partial failure retrans threshold */
>   	peer->pf_retrans = asoc->pf_retrans;
>
>   	/* Initialize the peer's SACK delay timeout based on the
>

^ permalink raw reply

* Re: [PATCH 2/4] sctp: remove the repeat initialize with 0
From: Vlad Yasevich @ 2013-10-25 23:42 UTC (permalink / raw)
  To: Wang Weidong, davem, nhorman; +Cc: dingtianhong, linux-sctp, netdev
In-Reply-To: <1382665805-13952-3-git-send-email-wangweidong1@huawei.com>

On 10/24/2013 09:50 PM, Wang Weidong wrote:
> kmem_cache_zalloc had set the allocated memory to zero. I think no need
> to initialize with 0. And move the comments to the function begin.
>
> Signed-off-by: Wang Weidong <wangweidong1@huawei.com>


Yes, thank you.

Acked-by: Vlad Yasevich <vyasevich@gmail.com>

-vlad

> ---
>   net/sctp/sm_make_chunk.c | 29 ++++++++---------------------
>   1 file changed, 8 insertions(+), 21 deletions(-)
>
> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
> index d244a23..fe69032 100644
> --- a/net/sctp/sm_make_chunk.c
> +++ b/net/sctp/sm_make_chunk.c
> @@ -1297,6 +1297,13 @@ struct sctp_chunk *sctp_make_auth(const struct sctp_association *asoc)
>
>   /* Turn an skb into a chunk.
>    * FIXME: Eventually move the structure directly inside the skb->cb[].
> + *
> + * sctpimpguide-05.txt Section 2.8.2
> + * M1) Each time a new DATA chunk is transmitted
> + * set the 'TSN.Missing.Report' count for that TSN to 0. The
> + * 'TSN.Missing.Report' count will be used to determine missing chunks
> + * and when to fast retransmit.
> + *
>    */
>   struct sctp_chunk *sctp_chunkify(struct sk_buff *skb,
>   			    const struct sctp_association *asoc,
> @@ -1314,29 +1321,9 @@ struct sctp_chunk *sctp_chunkify(struct sk_buff *skb,
>   	INIT_LIST_HEAD(&retval->list);
>   	retval->skb		= skb;
>   	retval->asoc		= (struct sctp_association *)asoc;
> -	retval->has_tsn		= 0;
> -	retval->has_ssn         = 0;
> -	retval->rtt_in_progress	= 0;
> -	retval->sent_at		= 0;
>   	retval->singleton	= 1;
> -	retval->end_of_packet	= 0;
> -	retval->ecn_ce_done	= 0;
> -	retval->pdiscard	= 0;
> -
> -	/* sctpimpguide-05.txt Section 2.8.2
> -	 * M1) Each time a new DATA chunk is transmitted
> -	 * set the 'TSN.Missing.Report' count for that TSN to 0. The
> -	 * 'TSN.Missing.Report' count will be used to determine missing chunks
> -	 * and when to fast retransmit.
> -	 */
> -	retval->tsn_missing_report = 0;
> -	retval->tsn_gap_acked = 0;
> -	retval->fast_retransmit = SCTP_CAN_FRTX;
>
> -	/* If this is a fragmented message, track all fragments
> -	 * of the message (for SEND_FAILED).
> -	 */
> -	retval->msg = NULL;
> +	retval->fast_retransmit = SCTP_CAN_FRTX;
>
>   	/* Polish the bead hole.  */
>   	INIT_LIST_HEAD(&retval->transmitted_list);
>

^ permalink raw reply

* Re: [PATCH 1/4] sctp: merge two if statements to one
From: Vlad Yasevich @ 2013-10-25 23:40 UTC (permalink / raw)
  To: Wang Weidong, davem, nhorman; +Cc: dingtianhong, linux-sctp, netdev
In-Reply-To: <1382665805-13952-2-git-send-email-wangweidong1@huawei.com>

On 10/24/2013 09:50 PM, Wang Weidong wrote:
> Two if statements do the same work, maybe we can merge them to
> one. There is just code simplification, no functional changes.
>
> Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
> ---
>   net/sctp/auth.c | 12 ++++--------
>   1 file changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/net/sctp/auth.c b/net/sctp/auth.c
> index 8c4fa5d..19fb0ae 100644
> --- a/net/sctp/auth.c
> +++ b/net/sctp/auth.c
> @@ -539,18 +539,14 @@ struct sctp_hmac *sctp_auth_asoc_get_hmac(const struct sctp_association *asoc)
>   	for (i = 0; i < n_elt; i++) {
>   		id = ntohs(hmacs->hmac_ids[i]);
>
> -		/* Check the id is in the supported range */
> -		if (id > SCTP_AUTH_HMAC_ID_MAX) {
> -			id = 0;
> -			continue;
> -		}
> -
> -		/* See is we support the id.  Supported IDs have name and
> +		/* Check the id is in the supported range. And
> +		 * see is we support the id.  Supported IDs have name and
>   		 * length fields set, so that we can allocated and use
>   		 * them.  We can safely just check for name, for without the
>   		 * name, we can't allocate the TFM.
>   		 	*/
> -		if (!sctp_hmac_list[id].hmac_name) {
> +		if (id > SCTP_AUTH_HMAC_ID_MAX ||
> +			!sctp_hmac_list[id].hmac_name) {

Can you please make the 2 parts of the 'if' statement above line up
with each other instead of the code below.  I makes it easy to see what
the whole 'if conditional' is.

Thanks
-vlad

>   			id = 0;
>   			continue;
>   		}
>

^ permalink raw reply

* Re: [PATCH net 2/2] ipv6: ip6_dst_check needs to check for expired dst_entries
From: David Miller @ 2013-10-25 23:27 UTC (permalink / raw)
  To: hannes; +Cc: netdev, sgunderson, valentyn, yoshfuji
In-Reply-To: <20131024054824.GA15744@order.stressinduktion.org>

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Thu, 24 Oct 2013 07:48:24 +0200

> On receiving a packet too big icmp error we check if our current cached
> dst_entry in the socket is still valid. This validation check did not
> care about the expiration of the (cached) route.
> 
> The error path I traced down:
> The socket receives a packet too big mtu notification. It still has a
> valid dst_entry and thus issues the ip6_rt_pmtu_update on this dst_entry,
> setting RTF_EXPIRE and updates the dst.expiration value (which could
> fail because of not up-to-date expiration values, see previous patch).
> 
> In some seldom cases we race with a) the ip6_fib gc or b) another routing
> lookup which would result in a recreation of the cached rt6_info from its
> parent non-cached rt6_info. While copying the rt6_info we reinitialize the
> metrics store by copying it over from the parent thus invalidating the
> just installed pmtu update (both dsts use the same key to the inetpeer
> storage). The dst_entry with the just invalidated metrics data would
> just get its RTF_EXPIRES flag cleared and would continue to stay valid
> for the socket.
> 
> We should have not issued the pmtu update on the already expired dst_entry
> in the first placed. By checking the expiration on the dst entry and
> doing a relookup in case it is out of date we close the race because
> we would install a new rt6_info into the fib before we issue the pmtu
> update, thus closing this race.
> 
> Not reliably updating the dst.expire value was fixed by the patch "ipv6:
> reset dst.expires value when clearing expire flag".
> 
> Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
> Reported-by: Valentijn Sessink <valentyn@blub.net>
> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
> I would propose this patch for -stable.

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH net v2 1/2] ipv6: reset dst.expires value when clearing expire flag
From: David Miller @ 2013-10-25 23:27 UTC (permalink / raw)
  To: hannes; +Cc: netdev, sgunderson, valentyn, yoshfuji, edumazet
In-Reply-To: <20131024081427.GC15744@order.stressinduktion.org>

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Thu, 24 Oct 2013 10:14:27 +0200

> On receiving a packet too big icmp error we update the expire value by
> calling rt6_update_expires. This function uses dst_set_expires which is
> implemented that it can only reduce the expiration value of the dst entry.
> 
> If we insert new routing non-expiry information into the ipv6 fib where
> we already have a matching rt6_info we only clear the RTF_EXPIRES flag
> in rt6i_flags and leave the dst.expires value as is.
> 
> When new mtu information arrives for that cached dst_entry we again
> call dst_set_expires. This time it won't update the dst.expire value
> because we left the dst.expire value intact from the last update. So
> dst_set_expires won't touch dst.expires.
> 
> Fix this by resetting dst.expires when clearing the RTF_EXPIRE flag.
> dst_set_expires checks for a zero expiration and updates the
> dst.expires.
> 
> In the past this (not updating dst.expires) was necessary because
> dst.expire was placed in a union with the dst_entry *from reference
> and rt6_clean_expires did assign NULL to it. This split happend in
> ecd9883724b78cc72ed92c98bcb1a46c764fff21 ("ipv6: fix race condition
> regarding dst->expires and dst->from").
> 
> Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
> Reported-by: Valentijn Sessink <valentyn@blub.net>
> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
> Acked-by: Eric Dumazet <edumazet@google.com>
> Tested-by: Valentijn Sessink <valentyn@blub.net>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Applied.

^ permalink raw reply

* Re: [PATCH 1/1] ax88179_178a: Remove AX_MEDIUM_ALWAYS_ONE bit in AX_MEDIUM_STATUS_MODE register to avoid TX throttling
From: David Miller @ 2013-10-25 23:27 UTC (permalink / raw)
  To: freddy; +Cc: linux-usb, linux-kernel, netdev, allan, louis
In-Reply-To: <1382597905-2393-1-git-send-email-freddy@asix.com.tw>

From: freddy@asix.com.tw
Date: Thu, 24 Oct 2013 14:58:25 +0800

> From: Freddy Xin <freddy@asix.com.tw>
> 
> Remove AX_MEDIUM_ALWAYS_ONE in AX_MEDIUM_STATUS_MODE register.
> Setting this bit may cause TX throttling in Half-Duplex mode.
> 
> Signed-off-by: Freddy Xin <freddy@asix.com.tw>

Applied.

^ permalink raw reply

* Re: [PATCHv2 net] netpoll: fix rx_hook() interface by passing the skb
From: David Miller @ 2013-10-25 23:27 UTC (permalink / raw)
  To: antonio; +Cc: netdev, David.Laight
In-Reply-To: <1382564190-334-1-git-send-email-antonio@meshcoding.com>

From: Antonio Quartulli <antonio@meshcoding.com>
Date: Wed, 23 Oct 2013 23:36:30 +0200

> Right now skb->data is passed to rx_hook() even if the skb
> has not been linearised and without giving rx_hook() a way
> to linearise it.
> 
> Change the rx_hook() interface and make it accept the skb
> and the offset to the UDP payload as arguments. rx_hook() is
> also renamed to rx_skb_hook() to ensure that out of the tree
> users notice the API change.
> 
> In this way any rx_skb_hook() implementation can perform all
> the needed operations to properly (and safely) access the
> skb data.
> 
> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>

Applied.

^ permalink raw reply

* Re: [PATCH] net: wan: sbni: remove assembly crc32 code
From: David Miller @ 2013-10-25 23:27 UTC (permalink / raw)
  To: sebastian; +Cc: andi, akpm, linux-kernel, ak, netdev
In-Reply-To: <20131022183625.GA4382@breakpoint.cc>

From: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Date: Tue, 22 Oct 2013 20:36:25 +0200

> There is also a C function doing the same thing. Unless the asm code is
> 110% faster we could stick to the C function.
> 
> Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>

Applied.

^ permalink raw reply

* Re: vxlan gso is broken by stackable gso_segment()
From: Eric Dumazet @ 2013-10-25 23:25 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: David Miller, Eric Dumazet, stephen, netdev
In-Reply-To: <CAMEtUuxwi05oX0TqNdMbjy04MT=ZLemLXJQTTcAMrnWuhXLKdQ@mail.gmail.com>

On Fri, 2013-10-25 at 15:41 -0700, Alexei Starovoitov wrote:

> 'bool tunnel' actually still used to indicate encap_level > 0
> 

Yes, I am studying if the setting of skb->encapsulation = 1 was really
needed in the :

if (tunnel) {
     skb_reset_inner_headers(skb);
     skb->encapsulation = 1;
}

And was planning to rename 'bool tunnel' by 'bool stacked' or
something... 

> Eric's fix brings back performance for vxlan and gre keeps working. Thx!

Please note the original performance is not that good, you mentioned 230
Mbps on lxc, while I get more than 5Gb/s on a 10G link.

This should be investigated ...

> 
> net/core/skbuff.c:3474 skb_try_coalesce() warning, I mentioned before,
> is unrelated.
> I still see it with this patch. Running either gre or vxlan tunnels.

I think this might be related to commit 6ff50cd55545 ("tcp: gso: do not
generate out of order packets")

I'll investigate this as well, thanks.

^ permalink raw reply

* Re: [PATCH] ibm emac: Don't call napi_complete if napi_reschedule failed
From: David Miller @ 2013-10-25 23:13 UTC (permalink / raw)
  To: alistair; +Cc: netdev, benh
In-Reply-To: <1382585780-13244-1-git-send-email-alistair@popple.id.au>

From: Alistair Popple <alistair@popple.id.au>
Date: Thu, 24 Oct 2013 14:36:20 +1100

> This patch fixes a bug which would trigger the BUG_ON() at
> net/core/dev.c:4156. It was found that this was due to napi_complete
> being called even when the corresponding call to napi_reschedule had
> failed.
> 
> This patch ensures that we only contine processing rotting packets in
> the current mal_poll call if we are not already on the polling list.
> It also adds locking calls to protect the read-write-modify of the
> MAL_CFG DCR.
> 
> Signed-off-by: Alistair Popple <alistair@popple.id.au>

The mistaken napi_complete() call and the locking change are two different
bugs, fix them in two different patches.

^ permalink raw reply

* Re: vxlan gso is broken by stackable gso_segment()
From: David Miller @ 2013-10-25 23:10 UTC (permalink / raw)
  To: ast; +Cc: eric.dumazet, edumazet, stephen, netdev
In-Reply-To: <CAMEtUuxwi05oX0TqNdMbjy04MT=ZLemLXJQTTcAMrnWuhXLKdQ@mail.gmail.com>

From: Alexei Starovoitov <ast@plumgrid.com>
Date: Fri, 25 Oct 2013 15:41:47 -0700

> 'bool tunnel' actually still used to indicate encap_level > 0

Good catch, I misread the code.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox