Netdev List
* [GIT] Networking
From: David Miller @ 2014-12-03  4:39 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) Fill in ethtool link parameters for all link types in cxgb4,
   from Hariprasad Shenai.

2) Fix probe regressions in stmmac driver, from Huacai Chen.

3) Fix network namespace leaks on errors in rtnetlink, from Nicolas
   Dichtel.

4) Remove erroneous BUG check which can actually trigger
   legitimately, in xen-netfront.  From Seth Forshee.

5) Validate length of IFLA_BOND_ARP_IP_TARGET netlink
   attributes, from Thomas Graf.

Please pull, thanks a lot.

The following changes since commit 7a5a4f978750756755dc839014e13d1b088ccc8e:

  Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip (2014-11-29 10:49:24 -0800)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master

for you to fetch changes up to 4c2d518695338801110bc166eece6aa02822b0b4:

  cxgb4: Fill in supported link mode for SFP modules (2014-12-02 19:57:49 -0800)

----------------------------------------------------------------
Hariprasad Shenai (1):
      cxgb4: Fill in supported link mode for SFP modules

Huacai Chen (1):
      stmmac: platform: Move plat_dat checking earlier

Mitsuhiro Kimura (2):
      sh_eth: Fix skb alloc size and alignment adjust rule.
      sh_eth: Fix sleeping function called from invalid context

Nicolas Dichtel (1):
      rtnetlink: release net refcnt on error in do_setlink()

Seth Forshee (1):
      xen-netfront: Remove BUGs on paged skb data which crosses a page boundary

Thomas Graf (1):
      bond: Check length of IFLA_BOND_ARP_IP_TARGET attributes

 drivers/net/bonding/bond_netlink.c                    |    7 ++++++-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c       |    8 ++++++--
 drivers/net/ethernet/renesas/sh_eth.c                 |   96 ++++++++++++++++++++++++++++++++++++++++++++++++------------------------------------------------
 drivers/net/ethernet/renesas/sh_eth.h                 |    5 +++--
 drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c |   18 +++++++++---------
 drivers/net/xen-netfront.c                            |    5 -----
 net/core/rtnetlink.c                                  |    1 +
 7 files changed, 73 insertions(+), 67 deletions(-)


* RE: [PATCH net] gso: do GSO for local skb with size bigger than MTU
From: Du, Fan @ 2014-12-03  4:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev@vger.kernel.org, fw@strlen.de, Du, Fan
In-Reply-To: <20141202.203511.1346917579566370675.davem@davemloft.net>



>-----Original Message-----
>From: David Miller [mailto:davem@davemloft.net]
>Sent: Wednesday, December 3, 2014 12:35 PM
>To: Du, Fan
>Cc: netdev@vger.kernel.org; fw@strlen.de
>Subject: Re: [PATCH net] gso: do GSO for local skb with size bigger than MTU
>
>From: "Du, Fan" <fan.du@intel.com>
>Date: Wed, 3 Dec 2014 03:32:46 +0000
>
>>>If guest sees a 1500 byte MTU, that's its link layer MTU and it had
>>>better be able to send 1500 byte packets onto the "wire".
>>
>> This patch makes that happen exactly as you put it.
>>
>>>If you cannot properly propagate the vxlan encapsulation overhead back
>>>into the guest's MTU you must hide this problem from the rest of our stack
>>>somehow.
>>
>> Again, this patch hides this problem so that the guest believes it can send
>> packets with an MTU of 1500 bytes.
>
>I said make the guest see the real MTU, not hide the real MTU by fragmenting or
>spitting ICMP PMTU messages back.

Do you have a better idea for achieving what you describe, other than this patch's
approach, that avoids both fragmentation and ICMP messages while still catering for
all kinds of tunnel technologies?
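
For reference, the arithmetic behind "the real MTU" in the exchange above can
be sketched as follows. This is only an illustration and not part of the patch.
It assumes VXLAN over an IPv4 underlay with no outer VLAN tag, where the
encapsulation adds 50 bytes (20 outer IPv4 + 8 UDP + 8 VXLAN + 14 inner
Ethernet). VXLAN_IPV4_OVERHEAD and guest_mtu() are names made up for this
example.

```c
/* Assumed per-packet overhead for VXLAN over IPv4, no outer VLAN tag:
 * 20 (outer IPv4) + 8 (UDP) + 8 (VXLAN) + 14 (inner Ethernet) bytes.
 */
#define VXLAN_IPV4_OVERHEAD (20 + 8 + 8 + 14)

/* Hypothetical helper: largest guest MTU whose encapsulated packets still
 * fit into the underlay MTU without fragmentation. Returns 0 if the
 * underlay MTU cannot even hold the encapsulation headers.
 */
static unsigned int guest_mtu(unsigned int underlay_mtu)
{
	if (underlay_mtu <= VXLAN_IPV4_OVERHEAD)
		return 0;
	return underlay_mtu - VXLAN_IPV4_OVERHEAD;
}
```

Under these assumptions a 1500 byte underlay MTU leaves 1450 bytes for the
guest, so a guest configured with 1500 emits packets that no longer fit once
they are encapsulated.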


* Re: tun issue after e0b46d0ee9c: tun: Use iovec iterators
From: David Miller @ 2014-12-03  4:54 UTC (permalink / raw)
  To: herbert; +Cc: mleitner, netdev
In-Reply-To: <20141130100331.GA10050@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sun, 30 Nov 2014 18:03:31 +0800

> On Sat, Nov 29, 2014 at 07:59:35AM +0800, Herbert Xu wrote:
>> On Fri, Nov 28, 2014 at 05:25:27PM -0200, Marcelo Ricardo Leitner wrote:
>> > 
>> > I saw there are tun updates on Dave's queue but none seemed to handle this.
>> > 
>> > I can't use current net-next
>> > (799d2fff1858004526ad75d66a5dd8a5cce6ad40) on a kvm hypervisor
>> > because tun got clogged somehow. Bisected down to:
>> > 
>> > commit e0b46d0ee9c240c7430a47e9b0365674d4a04522
>> > Author: Herbert Xu <herbert@gondor.apana.org.au>
>> > Date:   Fri Nov 7 21:22:23 2014 +0800
>> 
>> Does this patch help?
> 
> Oops, there was an embarrassing typo in the patch which causes
> it to not even build.  Here is the corrected version.
> 
> -- >8 --
> Subject: tun: Fix GSO meta-data handling in tun_get_user
>     
> When we write the GSO meta-data in tun_get_user we end up advancing
> the IO vector twice, thus exhausting the user buffer before we can
> finish writing the packet.
>     
> Fixes: f5ff53b4d97c ("{macvtap,tun}_get_user(): switch to iov_iter")
> Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, thanks.


* Re: [PATCH net] gso: do GSO for local skb with size bigger than MTU
From: David Miller @ 2014-12-03  5:14 UTC (permalink / raw)
  To: fan.du; +Cc: netdev, fw
In-Reply-To: <5A90DA2E42F8AE43BC4A093BF0678848DEE0DC@SHSMSX104.ccr.corp.intel.com>

From: "Du, Fan" <fan.du@intel.com>
Date: Wed, 3 Dec 2014 04:50:21 +0000

> Do you have a better idea for achieving what you describe, other than this patch's
> approach, that avoids both fragmentation and ICMP messages while still catering for
> all kinds of tunnel technologies?

I am not obligated to figure out for you how to design a correctly
implemented patch.

But I am obligated to keep a bad change from going into the tree and
that is what I am doing.


* [PATCH net-next] r8152: reduce memory copy for rx
From: Hayes Wang @ 2014-12-03  5:14 UTC (permalink / raw)
  To: netdev; +Cc: nic_swsd, linux-kernel, linux-usb, Hayes Wang

If the data size is more than half of AGG_BUF_SZ, allocate a new
rx buffer and use skb_clone() to avoid the memory copy.

The original method allocates memory and copies the data for each
packet in an rx buffer. The new method, when the data size for an rx
buffer is more than RX_THRESHOLD_CLONED, allocates a new rx buffer and
uses skb_clone() for each packet in the rx buffer. According to
experiments, the new method performs better.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
 drivers/net/usb/r8152.c | 110 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 77 insertions(+), 33 deletions(-)
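
The copy-versus-clone decision described above can be modelled in user space
roughly as below. This is a simplified sketch, not the driver code: the two
macros are taken from the patch, while rx_use_clone() and its net_ip_align
parameter (standing in for the kernel's NET_IP_ALIGN constant) are invented
for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define AGG_BUF_SZ		16384 /* 16K, as in the patch */
#define RX_THRESHOLD_CLONED	(AGG_BUF_SZ / 2)

/* Hypothetical user-space model of the test added to rx_bottom().
 * net_ip_align stands in for the kernel's NET_IP_ALIGN constant.
 */
static bool rx_use_clone(size_t actual_length, int net_ip_align)
{
	/* Clone only when no IP alignment padding is needed and the
	 * aggregated buffer is more than half full.
	 */
	return !net_ip_align && actual_length > RX_THRESHOLD_CLONED;
}
```

In the patch itself the clone path is additionally taken only if the
replacement rx skb can be allocated; otherwise the code falls back to the
per-packet copy.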

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 4a9ece0..e44b9fb 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -26,7 +26,7 @@
 #include <linux/mdio.h>
 
 /* Version Information */
-#define DRIVER_VERSION "v1.07.0 (2014/10/09)"
+#define DRIVER_VERSION "v1.08.0 (2014/11/27)"
 #define DRIVER_AUTHOR "Realtek linux nic maintainers <nic_swsd@realtek.com>"
 #define DRIVER_DESC "Realtek RTL8152/RTL8153 Based USB Ethernet Adapters"
 #define MODULENAME "r8152"
@@ -447,6 +447,8 @@ enum rtl_register_content {
 #define RTL8152_RMS		(VLAN_ETH_FRAME_LEN + VLAN_HLEN)
 #define RTL8153_RMS		RTL8153_MAX_PACKET
 #define RTL8152_TX_TIMEOUT	(5 * HZ)
+#define AGG_BUF_SZ		16384 /* 16K */
+#define RX_THRESHOLD_CLONED	(AGG_BUF_SZ / 2)
 
 /* rtl8152 flags */
 enum rtl8152_flags {
@@ -534,8 +536,7 @@ struct rx_agg {
 	struct list_head list;
 	struct urb *urb;
 	struct r8152 *context;
-	void *buffer;
-	void *head;
+	struct sk_buff *skb;
 };
 
 struct tx_agg {
@@ -605,9 +606,8 @@ enum tx_csum_stat {
  * The RTL chips use a 64 element hash table based on the Ethernet CRC.
  */
 static const int multicast_filter_limit = 32;
-static unsigned int agg_buf_sz = 16384;
 
-#define RTL_LIMITED_TSO_SIZE	(agg_buf_sz - sizeof(struct tx_desc) - \
+#define RTL_LIMITED_TSO_SIZE	(AGG_BUF_SZ - sizeof(struct tx_desc) - \
 				 VLAN_ETH_HLEN - VLAN_HLEN)
 
 static
@@ -1210,9 +1210,8 @@ static void free_all_mem(struct r8152 *tp)
 		usb_free_urb(tp->rx_info[i].urb);
 		tp->rx_info[i].urb = NULL;
 
-		kfree(tp->rx_info[i].buffer);
-		tp->rx_info[i].buffer = NULL;
-		tp->rx_info[i].head = NULL;
+		dev_kfree_skb(tp->rx_info[i].skb);
+		tp->rx_info[i].skb = NULL;
 	}
 
 	for (i = 0; i < RTL8152_MAX_TX; i++) {
@@ -1231,6 +1230,31 @@ static void free_all_mem(struct r8152 *tp)
 	tp->intr_buff = NULL;
 }
 
+static struct sk_buff *rtl_alloc_rx_skb(struct r8152 *tp, gfp_t gfp_mask)
+{
+	struct net_device *netdev = tp->netdev;
+	struct sk_buff *skb;
+
+	skb = __netdev_alloc_skb(netdev, AGG_BUF_SZ, gfp_mask);
+	if (!skb)
+		goto out1;
+
+	if (skb->data != rx_agg_align(skb->data)) {
+		int rl;
+
+		dev_kfree_skb_any(skb);
+		skb = __netdev_alloc_skb(netdev, AGG_BUF_SZ + RX_ALIGN,
+					 gfp_mask);
+		if (!skb)
+			goto out1;
+
+		rl = (int)(rx_agg_align(skb->data) - (void *)skb->data);
+		skb_reserve(skb, rl);
+	}
+out1:
+	return skb;
+}
+
 static int alloc_all_mem(struct r8152 *tp)
 {
 	struct net_device *netdev = tp->netdev;
@@ -1239,7 +1263,6 @@ static int alloc_all_mem(struct r8152 *tp)
 	struct usb_host_endpoint *ep_intr = alt->endpoint + 2;
 	struct urb *urb;
 	int node, i;
-	u8 *buf;
 
 	node = netdev->dev.parent ? dev_to_node(netdev->dev.parent) : -1;
 
@@ -1249,39 +1272,33 @@ static int alloc_all_mem(struct r8152 *tp)
 	skb_queue_head_init(&tp->tx_queue);
 
 	for (i = 0; i < RTL8152_MAX_RX; i++) {
-		buf = kmalloc_node(agg_buf_sz, GFP_KERNEL, node);
-		if (!buf)
-			goto err1;
+		struct sk_buff *skb;
 
-		if (buf != rx_agg_align(buf)) {
-			kfree(buf);
-			buf = kmalloc_node(agg_buf_sz + RX_ALIGN, GFP_KERNEL,
-					   node);
-			if (!buf)
-				goto err1;
-		}
+		skb = rtl_alloc_rx_skb(tp, GFP_KERNEL);
+		if (!skb)
+			goto err1;
 
 		urb = usb_alloc_urb(0, GFP_KERNEL);
 		if (!urb) {
-			kfree(buf);
+			dev_kfree_skb(skb);
 			goto err1;
 		}
 
 		INIT_LIST_HEAD(&tp->rx_info[i].list);
 		tp->rx_info[i].context = tp;
 		tp->rx_info[i].urb = urb;
-		tp->rx_info[i].buffer = buf;
-		tp->rx_info[i].head = rx_agg_align(buf);
+		tp->rx_info[i].skb = skb;
 	}
 
 	for (i = 0; i < RTL8152_MAX_TX; i++) {
-		buf = kmalloc_node(agg_buf_sz, GFP_KERNEL, node);
+		u8 *buf = kmalloc_node(AGG_BUF_SZ, GFP_KERNEL, node);
+
 		if (!buf)
 			goto err1;
 
 		if (buf != tx_agg_align(buf)) {
 			kfree(buf);
-			buf = kmalloc_node(agg_buf_sz + TX_ALIGN, GFP_KERNEL,
+			buf = kmalloc_node(AGG_BUF_SZ + TX_ALIGN, GFP_KERNEL,
 					   node);
 			if (!buf)
 				goto err1;
@@ -1538,7 +1555,7 @@ static int r8152_tx_agg_fill(struct r8152 *tp, struct tx_agg *agg)
 	tx_data = agg->head;
 	agg->skb_num = 0;
 	agg->skb_len = 0;
-	remain = agg_buf_sz;
+	remain = AGG_BUF_SZ;
 
 	while (remain >= ETH_ZLEN + sizeof(struct tx_desc)) {
 		struct tx_desc *tx_desc;
@@ -1587,7 +1604,7 @@ static int r8152_tx_agg_fill(struct r8152 *tp, struct tx_agg *agg)
 
 		dev_kfree_skb_any(skb);
 
-		remain = agg_buf_sz - (int)(tx_agg_align(tx_data) - agg->head);
+		remain = AGG_BUF_SZ - (int)(tx_agg_align(tx_data) - agg->head);
 	}
 
 	if (!skb_queue_empty(&skb_head)) {
@@ -1666,6 +1683,8 @@ static void rx_bottom(struct r8152 *tp)
 
 	list_for_each_safe(cursor, next, &rx_queue) {
 		struct rx_desc *rx_desc;
+		struct sk_buff *rx_skb;
+		bool cloned = false;
 		struct rx_agg *agg;
 		int len_used = 0;
 		struct urb *urb;
@@ -1678,10 +1697,21 @@ static void rx_bottom(struct r8152 *tp)
 		if (urb->actual_length < ETH_ZLEN)
 			goto submit;
 
-		rx_desc = agg->head;
-		rx_data = agg->head;
+		rx_skb = agg->skb;
+		rx_desc = (struct rx_desc *)rx_skb->data;
+		rx_data = rx_skb->data;
 		len_used += sizeof(struct rx_desc);
 
+		if (!NET_IP_ALIGN && urb->actual_length > RX_THRESHOLD_CLONED) {
+			struct sk_buff *new_skb;
+
+			new_skb = rtl_alloc_rx_skb(tp, GFP_ATOMIC);
+			if (new_skb) {
+				agg->skb = new_skb;
+				cloned = true;
+			}
+		}
+
 		while (urb->actual_length > len_used) {
 			struct net_device *netdev = tp->netdev;
 			struct net_device_stats *stats = &netdev->stats;
@@ -1699,14 +1729,23 @@ static void rx_bottom(struct r8152 *tp)
 			pkt_len -= CRC_SIZE;
 			rx_data += sizeof(struct rx_desc);
 
-			skb = netdev_alloc_skb_ip_align(netdev, pkt_len);
+			if (cloned)
+				skb = skb_clone(rx_skb, GFP_ATOMIC);
+			else
+				skb = netdev_alloc_skb_ip_align(netdev,
+								pkt_len);
 			if (!skb) {
 				stats->rx_dropped++;
 				goto find_next_rx;
 			}
 
 			skb->ip_summed = r8152_rx_csum(tp, rx_desc);
-			memcpy(skb->data, rx_data, pkt_len);
+
+			if (cloned)
+				skb_reserve(skb, (int)(rx_data - rx_skb->data));
+			else
+				memcpy(skb->data, rx_data, pkt_len);
+
 			skb_put(skb, pkt_len);
 			skb->protocol = eth_type_trans(skb, netdev);
 			rtl_rx_vlan_tag(rx_desc, skb);
@@ -1717,10 +1756,14 @@ static void rx_bottom(struct r8152 *tp)
 find_next_rx:
 			rx_data = rx_agg_align(rx_data + pkt_len + CRC_SIZE);
 			rx_desc = (struct rx_desc *)rx_data;
-			len_used = (int)(rx_data - (u8 *)agg->head);
+			len_used = (int)(rx_data - rx_skb->data);
 			len_used += sizeof(struct rx_desc);
 		}
 
+		/* free the cloned skb */
+		if (cloned)
+			dev_kfree_skb_any(rx_skb);
+
 submit:
 		r8152_submit_rx(tp, agg, GFP_ATOMIC);
 	}
@@ -1789,10 +1832,11 @@ static void bottom_half(unsigned long data)
 static
 int r8152_submit_rx(struct r8152 *tp, struct rx_agg *agg, gfp_t mem_flags)
 {
+	struct sk_buff *skb = agg->skb;
 	int ret;
 
 	usb_fill_bulk_urb(agg->urb, tp->udev, usb_rcvbulkpipe(tp->udev, 1),
-			  agg->head, agg_buf_sz,
+			  skb->data, AGG_BUF_SZ,
 			  (usb_complete_t)read_bulk_callback, agg);
 
 	ret = usb_submit_urb(agg->urb, mem_flags);
@@ -1951,7 +1995,7 @@ static void set_tx_qlen(struct r8152 *tp)
 {
 	struct net_device *netdev = tp->netdev;
 
-	tp->tx_qlen = agg_buf_sz / (netdev->mtu + VLAN_ETH_HLEN + VLAN_HLEN +
+	tp->tx_qlen = AGG_BUF_SZ / (netdev->mtu + VLAN_ETH_HLEN + VLAN_HLEN +
 				    sizeof(struct tx_desc));
 }
 
-- 
1.9.3


* Re: [PATCH net-next v2] ipv6: remove useless spin_lock/spin_unlock
From: Joe Perches @ 2014-12-03  5:18 UTC (permalink / raw)
  To: Duan Jiong; +Cc: David Miller, netdev, Eric Dumazet
In-Reply-To: <547E7594.7010301@cn.fujitsu.com>

On Wed, 2014-12-03 at 10:29 +0800, Duan Jiong wrote:
> xchg is atomic, so there is no need to use spin_lock/spin_unlock
> to protect it. Also remove the redundant
> opt = xchg(&inet6_sk(sk)->opt, opt); statement.
> 
> Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
> ---
> v2: remove the redundant opt = xchg(&inet6_sk(sk)->opt, opt); statement.
> 
>  net/ipv6/ipv6_sockglue.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> index e1a9583..66980d8 100644
> --- a/net/ipv6/ipv6_sockglue.c
> +++ b/net/ipv6/ipv6_sockglue.c
> @@ -110,12 +110,8 @@ struct ipv6_txoptions *ipv6_update_options(struct sock *sk,
>  			icsk->icsk_ext_hdr_len = opt->opt_flen + opt->opt_nflen;
>  			icsk->icsk_sync_mss(sk, icsk->icsk_pmtu_cookie);
>  		}
> -		opt = xchg(&inet6_sk(sk)->opt, opt);
> -	} else {
> -		spin_lock(&sk->sk_dst_lock);
> -		opt = xchg(&inet6_sk(sk)->opt, opt);
> -		spin_unlock(&sk->sk_dst_lock);
>  	}
> +	opt = xchg(&inet6_sk(sk)->opt, opt);
>  	sk_dst_reset(sk);
>  
>  	return opt;

The original function could now be rewritten,
merging the nested ifs, like this:
---
 net/ipv6/ipv6_sockglue.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index e1a9583..12c9c6b 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -102,20 +102,19 @@ static
 struct ipv6_txoptions *ipv6_update_options(struct sock *sk,
 					   struct ipv6_txoptions *opt)
 {
-	if (inet_sk(sk)->is_icsk) {
-		if (opt &&
-		    !((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)) &&
-		    inet_sk(sk)->inet_daddr != LOOPBACK4_IPV6) {
-			struct inet_connection_sock *icsk = inet_csk(sk);
-			icsk->icsk_ext_hdr_len = opt->opt_flen + opt->opt_nflen;
-			icsk->icsk_sync_mss(sk, icsk->icsk_pmtu_cookie);
-		}
-		opt = xchg(&inet6_sk(sk)->opt, opt);
-	} else {
-		spin_lock(&sk->sk_dst_lock);
-		opt = xchg(&inet6_sk(sk)->opt, opt);
-		spin_unlock(&sk->sk_dst_lock);
+	struct inet_sock *inet = inet_sk(sk);
+
+	if (inet->is_icsk &&
+	    opt &&
+	    !((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)) &&
+	    inet->inet_daddr != LOOPBACK4_IPV6) {
+		struct inet_connection_sock *icsk = inet_csk(sk);
+
+		icsk->icsk_ext_hdr_len = opt->opt_flen + opt->opt_nflen;
+		icsk->icsk_sync_mss(sk, icsk->icsk_pmtu_cookie);
 	}
+
+	opt = xchg(&inet6_sk(sk)->opt, opt);
 	sk_dst_reset(sk);
 
 	return opt;
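
As a user-space analogy for why a lock adds nothing around a lone xchg(), the
sketch below uses C11 atomic_exchange(), which likewise stores the new pointer
and returns the previous one as a single atomic step. struct txoptions and
swap_options() are invented for this example and are not kernel code.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ipv6_txoptions. */
struct txoptions {
	int opt_flen;
	int opt_nflen;
};

/* Install new_opt in *slot and return whatever was there before.
 * The swap is one atomic operation, so concurrent callers each get a
 * distinct old pointer without taking any lock.
 */
static struct txoptions *swap_options(_Atomic(struct txoptions *) *slot,
				      struct txoptions *new_opt)
{
	return atomic_exchange(slot, new_opt);
}
```

The caller owns the returned pointer and is responsible for releasing it,
which mirrors how ipv6_update_options() hands the old options back.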


* Re: Is this 32-bit NCM?
From: Kevin Zhu @ 2014-12-03  5:38 UTC (permalink / raw)
  To: Enrico Mioso
  Cc: Bjørn Mork, Eli Britstein, Alex Strizhevsky,
	Midge Shaojun Tan, youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <alpine.LNX.2.03.1412021627360.8131-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

My dongle also works with the huawei driver. I think only the 32-bit
format and the NDP location matter. We may modify the TX function to put
the NTH and NDP at the beginning of an NTB and see if it will work with
the cdc_ncm driver.
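
A rough sketch of what "NDP right after the NTH" means for the 32-bit header
follows. The struct mirrors the NTH32 field layout from the NCM specification
(fields are little-endian on the wire); align_up() and ndp_index_after_nth()
are hypothetical helpers written for this mail and are not taken from cdc_ncm.
The alignment is assumed to be a power of two.

```c
#include <stdint.h>

/* 32-bit NTB header (NTH32) field layout, 16 bytes in total. */
struct nth32 {
	uint32_t dwSignature;   /* "ncmh" */
	uint16_t wHeaderLength; /* size of this header */
	uint16_t wSequence;
	uint32_t dwBlockLength; /* size of the whole NTB */
	uint32_t dwNdpIndex;    /* byte offset of the first NDP in the NTB */
};

/* Round off up to the next multiple of align (align: power of two). */
static uint32_t align_up(uint32_t off, uint32_t align)
{
	return (off + align - 1) & ~(align - 1);
}

/* Hypothetical helper: value for dwNdpIndex when the NDP is placed
 * directly after the NTH, honouring the negotiated NDP alignment.
 */
static uint32_t ndp_index_after_nth(uint32_t ndp_align)
{
	return align_up((uint32_t)sizeof(struct nth32), ndp_align);
}
```

With a 4 byte NDP alignment the NDP would then start at offset 16, immediately
behind the header, and the datagrams would follow the NDP.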

Regards,
Kevin

On 12/02/2014 11:28 PM, Enrico Mioso wrote:
> ... And what do you think about the source code of their ndis driver?
> We at least know now that the device works with it, so we have something to mimic :D
> thank you for your work and patience Kevin.
>
> On Tue, 2 Dec 2014, Kevin Zhu wrote:
>
> ==Date: Tue, 2 Dec 2014 16:04:25
> ==From: Kevin Zhu <Mingying.Zhu-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>
> ==To: Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>, Bjørn Mork <bjorn@mork.no>
> ==Cc: Eli Britstein <Eli.Britstein-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
> ==    Alex Strizhevsky <alexxst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
> ==    Midge Shaojun  Tan <ShaojunMidge.Tan-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
> ==    "youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" <youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
> ==    "linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
> ==    "netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> ==Subject: Re: Is this 32-bit NCM?
> ==
> ==I do not understand why the wSequence matters. By the way, I think I see some NDPs are right after NTH headers in the windows capture.
> ==
> ==________________________________________
> ==From: Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ==Sent: Tuesday, December 2, 2014 21:53
> ==To: Bjørn Mork
> ==Cc: Kevin Zhu; Eli Britstein; Alex Strizhevsky; Midge Shaojun  Tan; youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org; linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> ==Subject: Re: Is this 32-bit NCM?
> ==
> ==Thank you very much Bjorn.
> ==
> ==
> ==On Tue, 2 Dec 2014, Bjørn Mork wrote:
> ==
> ====Date: Tue, 2 Dec 2014 14:37:03
> ====From: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>
> ====To: Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ====Cc: Kevin Zhu <Mingying.Zhu-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
> ====    Eli Britstein <Eli.Britstein-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
> ====    Alex Strizhevsky <alexxst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
> ====    Midge Shaojun Tan <ShaojunMidge.Tan-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
> ====    "youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" <youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
> ====    "linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
> ====    "netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> ====Subject: Re: Is this 32-bit NCM?
> ====
> ====Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> ====
> ====> ... but out of curiosity: are NCM specs allowing to change order of things in
> ====> the package or not?
> ====> This is not to start philosophical flames or something, but to understand
> ====> better how things work. And, if they do: how much arbitrarily?
> ====
> ====Only the NTB header has a fixed location. The rest can be anywhere and
> ====in any order. Quoting from section 3 Data Transport:
> ====
> ====  "Within any given NTB, the NTH always must be first; but the other
> ====   items may occur in arbitrary order."
> ====
> ====
> ====Bjørn
> ====
> ==This email and any files transmitted with it are confidential material. They are intended solely for the use of the designated individual or entity to whom they are addressed. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, use, distribution or copying of this communication is strictly prohibited and may be unlawful.
> ==
> ==If you have received this email in error please immediately notify the sender and delete or destroy any copy of this message
> ==


* [Discussion] About over-MTU-sized skb in virtualized env
From: Du, Fan @ 2014-12-03  5:40 UTC (permalink / raw)
  To: fw-HFFVJYpyMKqzQB+pC5nmwQ@public.gmane.org
  Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, Du, Fan,
	Michael S. Tsirkin,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Flavio Leitner,
	David Miller

Hi Florian

static int ip_finish_output_gso(struct sk_buff *skb)
{
    netdev_features_t features;
    struct sk_buff *segs;
    int ret = 0;

    /* common case: locally created skb or seglen is <= mtu */
    if (((IPCB(skb)->flags & IPSKB_FORWARDED) == 0) ||
          skb_gso_network_seglen(skb) <= ip_skb_dst_mtu(skb))
        return ip_finish_output2(skb);

Could you please state a _concrete_ _reason_ why a locally created skb is assumed to _always_ fit within the MTU?
Especially in the scenario below:
http://docs.openstack.org/admin-guide-cloud/content/figures/14/a/a/common/figures/under-the-hood-scenario-1-ovs-compute.png

As many people have discussed in this thread:
http://marc.info/?t=141715669700004&r=1&w=2
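
To make the question concrete, here is a small user-space model of the
condition quoted above. It is only a sketch: struct skb_model and
takes_fast_path() are invented names, and the three fields merely stand in
for the kernel expressions noted in the comments.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the values the quoted check reads. */
struct skb_model {
	bool forwarded;       /* models IPCB(skb)->flags & IPSKB_FORWARDED */
	unsigned int seglen;  /* models skb_gso_network_seglen(skb) */
	unsigned int dst_mtu; /* models ip_skb_dst_mtu(skb) */
};

/* True when the quoted condition goes straight to ip_finish_output2(),
 * i.e. without software segmentation in ip_finish_output_gso().
 */
static bool takes_fast_path(const struct skb_model *skb)
{
	return !skb->forwarded || skb->seglen <= skb->dst_mtu;
}
```

In this model a locally created skb with seglen 1550 and dst_mtu 1500 still
takes the fast path, because the segment length is never compared with the MTU
unless the skb was forwarded. That is the case I am asking about.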









* Re: Is this 32-bit NCM?
From: Enrico Mioso @ 2014-12-03  6:00 UTC (permalink / raw)
  To: Kevin Zhu
  Cc: Bjørn Mork, Eli Britstein, Alex Strizhevsky,
	Midge Shaojun Tan, youtux@gmail.com, linux-usb@vger.kernel.org,
	netdev@vger.kernel.org
In-Reply-To: <547EA1D1.8010202@audiocodes.com>


Yes - I think this would be ok. You might try this with the 16-bit driver first,
and then with the 32-bit one to see how things work.
I hope for the best.
Let us all know,
Enrico


On Wed, 3 Dec 2014, Kevin Zhu wrote:

==Date: Wed, 3 Dec 2014 06:38:27
==From: Kevin Zhu <Mingying.Zhu@audiocodes.com>
==To: Enrico Mioso <mrkiko.rs@gmail.com>
==Cc: Bjørn Mork <bjorn@mork.no>, Eli Britstein <Eli.Britstein@audiocodes.com>,
==    Alex Strizhevsky <alexxst@gmail.com>,
==    Midge Shaojun  Tan <ShaojunMidge.Tan@audiocodes.com>,
==    "youtux@gmail.com" <youtux@gmail.com>,
==    "linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
==    "netdev@vger.kernel.org" <netdev@vger.kernel.org>
==Subject: Re: Is this 32-bit NCM?
==
==My dongle also works with the huawei driver. I think only the 32bit
==format and NDP location matter. We may modify the TX function to put NTH
==and NDP at the beginning of a NTB and see if it will work with the
==driver cdc_ncm.
==
==Regards,
==Kevin
==
==On 12/02/2014 11:28 PM, Enrico Mioso wrote:
==> ... And what do you think about the source code of their ndis driver?
==> We at least know now that the device works with it, so we have something to mimic :D
==> thank you for your work and patience Kevin.
==>
==> On Tue, 2 Dec 2014, Kevin Zhu wrote:
==>
==> ==Date: Tue, 2 Dec 2014 16:04:25
==> ==From: Kevin Zhu <Mingying.Zhu@audiocodes.com>
==> ==To: Enrico Mioso <mrkiko.rs@gmail.com>, Bjørn Mork <bjorn@mork.no>
==> ==Cc: Eli Britstein <Eli.Britstein@audiocodes.com>,
==> ==    Alex Strizhevsky <alexxst@gmail.com>,
==> ==    Midge Shaojun  Tan <ShaojunMidge.Tan@audiocodes.com>,
==> ==    "youtux@gmail.com" <youtux@gmail.com>,
==> ==    "linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
==> ==    "netdev@vger.kernel.org" <netdev@vger.kernel.org>
==> ==Subject: Re: Is this 32-bit NCM?
==> ==
==> ==I do not understand why the wSequence matters. By the way, I think I see some NDPs are right after NTH headers in the windows capture.
==> ==
==> ==________________________________________
==> ==From: Enrico Mioso <mrkiko.rs@gmail.com>
==> ==Sent: Tuesday, December 2, 2014 21:53
==> ==To: Bjørn Mork
==> ==Cc: Kevin Zhu; Eli Britstein; Alex Strizhevsky; Midge Shaojun  Tan; youtux@gmail.com; linux-usb@vger.kernel.org; netdev@vger.kernel.org
==> ==Subject: Re: Is this 32-bit NCM?
==> ==
==> ==Thank you very much Bjorn.
==> ==
==> ==
==> ==On Tue, 2 Dec 2014, Bjørn Mork wrote:
==> ==
==> ====Date: Tue, 2 Dec 2014 14:37:03
==> ====From: Bjørn Mork <bjorn@mork.no>
==> ====To: Enrico Mioso <mrkiko.rs@gmail.com>
==> ====Cc: Kevin Zhu <Mingying.Zhu@audiocodes.com>,
==> ====    Eli Britstein <Eli.Britstein@audiocodes.com>,
==> ====    Alex Strizhevsky <alexxst@gmail.com>,
==> ====    Midge Shaojun Tan <ShaojunMidge.Tan@audiocodes.com>,
==> ====    "youtux@gmail.com" <youtux@gmail.com>,
==> ====    "linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
==> ====    "netdev@vger.kernel.org" <netdev@vger.kernel.org>
==> ====Subject: Re: Is this 32-bit NCM?
==> ====
==> ====Enrico Mioso <mrkiko.rs@gmail.com> writes:
==> ====
==> ====> ... but out of curiosity: are NCM specs allowing to change order of things in
==> ====> the package or not?
==> ====> This is not to start philosophical flames or something, but to understand
==> ====> better how things work. And, if they do: how much arbitrarily?
==> ====
==> ====Only the NTB header has a fixed location. The rest can be anywhere and
==> ====in any order. Quoting from section 3 Data Transport:
==> ====
==> ====  "Within any given NTB, the NTH always must be first; but the other
==> ====   items may occur in arbitrary order."
==> ====
==> ====
==> ====Bjørn
==> ====


* Re: Is this 32-bit NCM?
From: Kevin Zhu @ 2014-12-03  6:05 UTC (permalink / raw)
  To: Enrico Mioso
  Cc: Bjørn Mork, Eli Britstein, Alex Strizhevsky,
	Midge Shaojun Tan, youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <alpine.LNX.2.03.1412030659001.1288-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

OK. I will. Thank you for everything!

Regards,
Kevin

On 12/03/2014 02:00 PM, Enrico Mioso wrote:
> Yes - I think this would be ok. You might try this with the 16-bit driver first,
> and then with the 32-bit one to see how things work.
> I hope for the best.
> Let us all know,
> Enrico
>
>
> On Wed, 3 Dec 2014, Kevin Zhu wrote:
>
> ==Date: Wed, 3 Dec 2014 06:38:27
> ==From: Kevin Zhu <Mingying.Zhu-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>
> ==To: Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ==Cc: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>, Eli Britstein <Eli.Britstein@audiocodes.com>,
> ==    Alex Strizhevsky <alexxst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
> ==    Midge Shaojun  Tan <ShaojunMidge.Tan-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
> ==    "youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" <youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
> ==    "linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
> ==    "netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> ==Subject: Re: Is this 32-bit NCM?
> ==
> ==My dongle also works with the huawei driver. I think only the 32bit
> ==format and NDP location matter. We may modify the TX function to put NTH
> ==and NDP at the beginning of a NTB and see if it will work with the
> ==driver cdc_ncm.
> ==
> ==Regards,
> ==Kevin
> ==
> ==On 12/02/2014 11:28 PM, Enrico Mioso wrote:
> ==> ... And what do you think about the source code of their ndis driver?
> ==> We at least know now that the device works with it, so we have something to mimic :D
> ==> thank you for your work and patience Kevin.
> ==>
> ==> On Tue, 2 Dec 2014, Kevin Zhu wrote:
> ==>
> ==> ==Date: Tue, 2 Dec 2014 16:04:25
> ==> ==From: Kevin Zhu <Mingying.Zhu-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>
> ==> ==To: Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>, Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>
> ==> ==Cc: Eli Britstein <Eli.Britstein-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
> ==> ==    Alex Strizhevsky <alexxst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
> ==> ==    Midge Shaojun  Tan <ShaojunMidge.Tan-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
> ==> ==    "youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" <youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
> ==> ==    "linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
> ==> ==    "netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> ==> ==Subject: Re: Is this 32-bit NCM?
> ==> ==
> ==> ==I do not understand why the wSequence matters. By the way, I think I see some NDPs are right after NTH headers in the windows capture.
> ==> ==
> ==> ==________________________________________
> ==> ==From: Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ==> ==Sent: Tuesday, December 2, 2014 21:53
> ==> ==To: Bjørn Mork
> ==> ==Cc: Kevin Zhu; Eli Britstein; Alex Strizhevsky; Midge Shaojun  Tan; youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org; linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; netdev@vger.kernel.org
> ==> ==Subject: Re: Is this 32-bit NCM?
> ==> ==
> ==> ==Thank you very much Bjorn.
> ==> ==
> ==> ==
> ==> ==On Tue, 2 Dec 2014, Bjørn Mork wrote:
> ==> ==
> ==> ====Date: Tue, 2 Dec 2014 14:37:03
> ==> ====From: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>
> ==> ====To: Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ==> ====Cc: Kevin Zhu <Mingying.Zhu-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
> ==> ====    Eli Britstein <Eli.Britstein-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
> ==> ====    Alex Strizhevsky <alexxst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
> ==> ====    Midge Shaojun Tan <ShaojunMidge.Tan@audiocodes.com>,
> ==> ====    "youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" <youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
> ==> ====    "linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-usb@vger.kernel.org>,
> ==> ====    "netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> ==> ====Subject: Re: Is this 32-bit NCM?
> ==> ====
> ==> ====Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> ==> ====
> ==> ====> ... but out of curiosity: are NCM specs allowing to change order of things in
> ==> ====> the package or not?
> ==> ====> This is not to start philosophical flames or something, but to understand
> ==> ====> better how things work. And, if they do: how much arbitrarily?
> ==> ====
> ==> ====Only the NTB header has a fixed location. The rest can be anywhere and
> ==> ====in any order. Quoting from section 3 Data Transport:
> ==> ====
> ==> ====  "Within any given NTB, the NTH always must be first; but the other
> ==> ====   items may occur in arbitrary order."
> ==> ====
> ==> ====
> ==> ====Bjørn
> ==> ====
> ==> ==This email and any files transmitted with it are confidential material. They are intended solely for the use of the designated individual or entity to whom they are addressed. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, use, distribution or copying of this communication is strictly prohibited and may be unlawful.
> ==> ==
> ==> ==If you have received this email in error please immediately notify the sender and delete or destroy any copy of this message
> ==> ==
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH net-next] r8152: reduce memory copy for rx
From: Eric Dumazet @ 2014-12-03  6:07 UTC (permalink / raw)
  To: Hayes Wang; +Cc: netdev, nic_swsd, linux-kernel, linux-usb
In-Reply-To: <1394712342-15778-105-Taiwan-albertk@realtek.com>

On Wed, 2014-12-03 at 13:14 +0800, Hayes Wang wrote:
> If the data size is more than half of AGG_BUG_SZ, allocate a new
> rx buffer and use skb_clone() to avoid the memory copy.
> 
> The original method allocates memory and copies the data for each
> packet in an rx buffer. With the new method, when the data size of an
> rx buffer is more than RX_THRESHOLD_CLONED, a new rx buffer is allocated
> and skb_clone() is used for each packet in the rx buffer. According to
> the experiment, the new method has better performance.

Better performance for what workload exactly ?

cloning in rx path has many drawbacks, with skb->truesize being usually
wrong.

^ permalink raw reply

* [PATCHv2 net] cxgb4: Add a check for flashing FW using ethtool
From: Hariprasad Shenai @ 2014-12-03  6:19 UTC (permalink / raw)
  To: netdev; +Cc: davem, leedom, anish, nirranjan, kumaras, Hariprasad Shenai

Don't let T4 firmware flash on a T5 adapter and vice-versa
using ethtool

Based on original work by Casey Leedom <leedom@chelsio.com>

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>

---
V2:
 Use bool for return value based on review comment by Sergei Shtylyov

 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c |   26 ++++++++++++++++++++++++++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index 163a2a1..c623f1f 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -1131,6 +1131,27 @@ unsigned int t4_flash_cfg_addr(struct adapter *adapter)
 		return FLASH_CFG_START;
 }
 
+/* Return TRUE if the specified firmware matches the adapter.  I.e. T4
+ * firmware for T4 adapters, T5 firmware for T5 adapters, etc.  We go ahead
+ * and emit an error message for mismatched firmware to save our caller the
+ * effort ...
+ */
+static bool t4_fw_matches_chip(const struct adapter *adap,
+			       const struct fw_hdr *hdr)
+{
+	/* The expression below will return FALSE for any unsupported adapter
+	 * which will keep us "honest" in the future ...
+	 */
+	if ((is_t4(adap->params.chip) && hdr->chip == FW_HDR_CHIP_T4) ||
+	    (is_t5(adap->params.chip) && hdr->chip == FW_HDR_CHIP_T5))
+		return true;
+
+	dev_err(adap->pdev_dev,
+		"FW image (%d) is not suitable for this adapter (%d)\n",
+		hdr->chip, CHELSIO_CHIP_VERSION(adap->params.chip));
+	return false;
+}
+
 /**
  *	t4_load_fw - download firmware
  *	@adap: the adapter
@@ -1170,6 +1191,8 @@ int t4_load_fw(struct adapter *adap, const u8 *fw_data, unsigned int size)
 			FW_MAX_SIZE);
 		return -EFBIG;
 	}
+	if (!t4_fw_matches_chip(adap, hdr))
+		return -EINVAL;
 
 	for (csum = 0, i = 0; i < size / sizeof(csum); i++)
 		csum += ntohl(p[i]);
@@ -3080,6 +3103,9 @@ int t4_fw_upgrade(struct adapter *adap, unsigned int mbox,
 	const struct fw_hdr *fw_hdr = (const struct fw_hdr *)fw_data;
 	int reset, ret;
 
+	if (!t4_fw_matches_chip(adap, fw_hdr))
+		return -EINVAL;
+
 	ret = t4_fw_halt(adap, mbox, force);
 	if (ret < 0 && !force)
 		return ret;
-- 
1.7.1

^ permalink raw reply related

* Re: Is this 32-bit NCM?
From: Enrico Mioso @ 2014-12-03  6:29 UTC (permalink / raw)
  To: Kevin Zhu
  Cc: Bjørn Mork, Eli Britstein, Alex Strizhevsky,
	Midge Shaojun Tan, youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <547EA82F.5050500-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>


No problem.
I remain available for anything, especially because if we collaborate and
find a good solution to this problem, it is so much better for all.
Tell me if I can do or try something useful...
Enrico
On Wed, 3 Dec 2014, Kevin Zhu wrote:

==Date: Wed, 3 Dec 2014 07:05:37
==From: Kevin Zhu <Mingying.Zhu-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>
==To: Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
==Cc: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>, Eli Britstein <Eli.Britstein-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
==    Alex Strizhevsky <alexxst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
==    Midge Shaojun  Tan <ShaojunMidge.Tan-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
==    "youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" <youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
==    "linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
==    "netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
==Subject: Re: Is this 32-bit NCM?
==
==OK. I will. Thank you for everything!
==
==Regards,
==Kevin
==
==On 12/03/2014 02:00 PM, Enrico Mioso wrote:
==> Yes - I think this would be ok. You might try this with the 16-bit driver first,
==> and then with the 32-bit one to see how things work.
==> I hope for the best.
==> Let us all know,
==> Enrico
==>
==>
==> On Wed, 3 Dec 2014, Kevin Zhu wrote:
==>
==> ==Date: Wed, 3 Dec 2014 06:38:27
==> ==From: Kevin Zhu <Mingying.Zhu-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>
==> ==To: Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
==> ==Cc: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>, Eli Britstein <Eli.Britstein-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
==> ==    Alex Strizhevsky <alexxst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
==> ==    Midge Shaojun  Tan <ShaojunMidge.Tan-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
==> ==    "youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" <youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
==> ==    "linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
==> ==    "netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
==> ==Subject: Re: Is this 32-bit NCM?
==> ==
==> ==My dongle also works with the huawei driver. I think only the 32bit
==> ==format and NDP location matter. We may modify the TX function to put NTH
==> ==and NDP at the beginning of a NTB and see if it will work with the
==> ==driver cdc_ncm.
==> ==
==> ==Regards,
==> ==Kevin
==> ==
==> ==On 12/02/2014 11:28 PM, Enrico Mioso wrote:
==> ==> ... And what do you think about the source code of their ndis driver?
==> ==> We at least know now the device works with it, so we have something to mimic :D
==> ==> thank you for your work and patience Kevin.
==> ==>
==> ==> On Tue, 2 Dec 2014, Kevin Zhu wrote:
==> ==>
==> ==> ==Date: Tue, 2 Dec 2014 16:04:25
==> ==> ==From: Kevin Zhu <Mingying.Zhu-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>
==> ==> ==To: Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>, Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>
==> ==> ==Cc: Eli Britstein <Eli.Britstein-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
==> ==> ==    Alex Strizhevsky <alexxst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
==> ==> ==    Midge Shaojun  Tan <ShaojunMidge.Tan-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
==> ==> ==    "youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" <youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
==> ==> ==    "linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
==> ==> ==    "netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
==> ==> ==Subject: Re: Is this 32-bit NCM?
==> ==> ==
==> ==> ==I do not understand why the wSequence matters. By the way, I think I see some NDPs are right after NTH headers in the windows capture.
==> ==> ==
==> ==> ==________________________________________
==> ==> ==From: Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
==> ==> ==Sent: Tuesday, December 2, 2014 21:53
==> ==> ==To: Bjørn Mork
==> ==> ==Cc: Kevin Zhu; Eli Britstein; Alex Strizhevsky; Midge Shaojun  Tan; youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org; linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
==> ==> ==Subject: Re: Is this 32-bit NCM?
==> ==> ==
==> ==> ==Thank you very much Bjorn.
==> ==> ==
==> ==> ==
==> ==> ==On Tue, 2 Dec 2014, Bjørn Mork wrote:
==> ==> ==
==> ==> ====Date: Tue, 2 Dec 2014 14:37:03
==> ==> ====From: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>
==> ==> ====To: Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
==> ==> ====Cc: Kevin Zhu <Mingying.Zhu-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
==> ==> ====    Eli Britstein <Eli.Britstein-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
==> ==> ====    Alex Strizhevsky <alexxst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
==> ==> ====    Midge Shaojun Tan <ShaojunMidge.Tan-6C2+4RG2qWF0ubjbjo6WXg@public.gmane.org>,
==> ==> ====    "youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" <youtux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
==> ==> ====    "linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
==> ==> ====    "netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
==> ==> ====Subject: Re: Is this 32-bit NCM?
==> ==> ====
==> ==> ====Enrico Mioso <mrkiko.rs-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
==> ==> ====
==> ==> ====> ... but out of curiosity: are NCM specs allowing to change order of things in
==> ==> ====> the package or not?
==> ==> ====> This is not to start philosophical flames or something, but to understand
==> ==> ====> better how things work. And, if they do: how much arbitrarily?
==> ==> ====
==> ==> ====Only the NTB header has a fixed location. The rest can be anywhere and
==> ==> ====in any order. Quoting from section 3 Data Transport:
==> ==> ====
==> ==> ====  "Within any given NTB, the NTH always must be first; but the other
==> ==> ====   items may occur in arbitrary order."
==> ==> ====
==> ==> ====
==> ==> ====Bjørn
==> ==> ====
==
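
For readers following the NTB layout discussion quoted above, here is a
minimal userspace sketch of how a receiver can locate the first NDP from a
32-bit NTH. The field offsets follow the NTH32 layout in the NCM
specification; the helper names and the bounds checks are illustrative
assumptions and are not taken from cdc_ncm or from the huawei driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NTH32_SIGNATURE 0x686D636Eu /* "ncmh" read as a little-endian u32 */
#define NTH32_LENGTH    16u         /* fixed size of the 32-bit NTH */

static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: return the byte offset of the first NDP in an NTB,
 * or -1 if the NTH32 at offset 0 looks malformed. Only the NTH has a fixed
 * position; the NDP is found through dwNdpIndex, wherever it was placed.
 */
static long ntb32_first_ndp_offset(const uint8_t *ntb, size_t len)
{
	uint32_t ndp_index;

	assert(ntb != NULL);
	if (len < NTH32_LENGTH)
		return -1;
	if (get_le32(ntb) != NTH32_SIGNATURE)	/* dwSignature */
		return -1;
	if (get_le16(ntb + 4) != NTH32_LENGTH)	/* wHeaderLength */
		return -1;
	/* wSequence (offset 6) and dwBlockLength (offset 8) are not used here */
	ndp_index = get_le32(ntb + 12);		/* dwNdpIndex */
	if (ndp_index < NTH32_LENGTH || ndp_index >= len)
		return -1;
	return (long)ndp_index;
}
```

An NTB whose NDP directly follows the NTH yields offset 16, while an NTB
with the NDP placed after the datagrams yields a larger offset. Both are
valid under the sentence quoted from section 3 of the specification.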

^ permalink raw reply

* Re: [PATCH v2 net] bpf: x86: fix epilogue generation for eBPF programs
From: Z Lim @ 2014-12-03  6:38 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Eric Dumazet, Daniel Borkmann, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Network Development, LKML
In-Reply-To: <1417301173-23691-1-git-send-email-ast@plumgrid.com>

Hi Alexei,

On Sat, Nov 29, 2014 at 2:46 PM, Alexei Starovoitov <ast@plumgrid.com> wrote:
> classic BPF has a restriction that last insn is always BPF_RET.
> eBPF doesn't have BPF_RET instruction and this restriction.
> It has BPF_EXIT insn which can appear anywhere in the program
> one or more times and it doesn't have to be last insn.

Just to confirm, in valid eBPF, BPF_EXIT *must* be present at least
once, correct?
Does an eBPF JIT implementation need to check for it?

> Fix eBPF JIT to emit epilogue when first BPF_EXIT is seen
> and all other BPF_EXIT instructions will be emitted as jump.
>
> Since jump offset to epilogue is computed as:
> jmp_offset = ctx->cleanup_addr - addrs[i]
> we need to change type of cleanup_addr to signed to compute the offset as:
> (long long) ((int)20 - (int)30)
> instead of:
> (long long) ((unsigned int)20 - (int)30)
>
> Fixes: 622582786c9e ("net: filter: x86: internal BPF JIT")
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
> ---
> Note, this bug is applicable only to native eBPF programs
> which first were introduced in 3.18, so no need for stable.
>
> arm64 JIT has the same problem, but the fix is not as trivial,
> so will be done as separate patch.

I'll cook up a patch for arm64 if you haven't already done so.
Any related test case I should run through?
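
The signedness issue described in the quoted commit message can be
reproduced in isolation. The sketch below is plain userspace C with
hypothetical helper names (it is not the JIT code) and assumes a 32-bit
int. It shows why the type of cleanup_addr matters when the epilogue lies
before the current instruction.

```c
#include <assert.h>

/* cleanup_addr kept unsigned: the subtraction wraps before it is widened. */
static long long jmp_offset_unsigned(unsigned int cleanup_addr, int addr)
{
	/* in this model, addresses are offsets into the image, never negative */
	assert(addr >= 0);
	return (long long)(cleanup_addr - addr);
}

/* cleanup_addr made signed: a backward jump yields a negative offset. */
static long long jmp_offset_signed(int cleanup_addr, int addr)
{
	assert(cleanup_addr >= 0 && addr >= 0);
	return (long long)(cleanup_addr - addr);
}
```

With cleanup_addr = 20 and addr = 30, the unsigned variant returns
4294967286 while the signed variant returns -10, which is the offset a
backward jump to the epilogue needs.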

^ permalink raw reply

* [PATCH net-next] virtio-net: don't do header check for dodgy gso packets
From: Jason Wang @ 2014-12-03  6:40 UTC (permalink / raw)
  To: rusty, mst, virtualization, netdev, linux-kernel

There's no need to do the header check for virtio-net since:

- The host sets dodgy for all gso packets from the guest and checks the
  header.
- The host should be prepared for all kinds of evil packets from the guest,
  since a malicious guest can send any kind of packet.

So this patch sets NETIF_F_GSO_ROBUST for virtio-net to skip the check.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/virtio_net.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b0bc8ea..4cd242b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1760,6 +1760,8 @@ static int virtnet_probe(struct virtio_device *vdev)
 		if (virtio_has_feature(vdev, VIRTIO_NET_F_HOST_ECN))
 			dev->hw_features |= NETIF_F_TSO_ECN;
 
+		dev->features |= NETIF_F_GSO_ROBUST;
+
 		if (gso)
 			dev->features |= dev->hw_features & NETIF_F_ALL_TSO;
 		/* (!csum && gso) case will be fixed by register_netdev() */
-- 
1.9.1

^ permalink raw reply related

* [Discussion] About over-MTU-sized skb in virtualized env
From: Du Fan @ 2014-12-03  6:39 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Thomas Graf, Michael S. Tsirkin, Jesse Gross, Flavio Leitner,
	davem@davemloft.net, pshelar, netdev, dev@openvswitch.org,
	Du, Fan

Sorry for resending this mail; my company email is rejected by netdev.


Hi Florian

    static int ip_finish_output_gso(struct sk_buff *skb)
    {
        netdev_features_t features;
        struct sk_buff *segs;
        int ret = 0;

        /* common case: locally created skb or seglen is <= mtu */
        if (((IPCB(skb)->flags & IPSKB_FORWARDED) == 0) ||
              skb_gso_network_seglen(skb) <= ip_skb_dst_mtu(skb))
            return ip_finish_output2(skb);

Could you please state the _concrete_ reason why a locally created skb's
length _always_ fits into the MTU size, or why we need this
check?
Especially in the scenario below:
http://docs.openstack.org/admin-guide-cloud/content/figures/14/a/a/common/figures/under-the-hood-scenario-1-ovs-compute.png

As many people have discussed in this thread:
http://marc.info/?t=141715669700004&r=1&w=2

^ permalink raw reply

* [PATCH net] cxgb4: Update FW version string to match FW binary version 1.12.25.0
From: Hariprasad Shenai @ 2014-12-03  6:59 UTC (permalink / raw)
  To: netdev; +Cc: davem, leedom, anish, nirranjan, kumaras, Hariprasad Shenai

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 3c481b2..0514b74 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -50,13 +50,13 @@
 #include "cxgb4_uld.h"
 
 #define T4FW_VERSION_MAJOR 0x01
-#define T4FW_VERSION_MINOR 0x0B
-#define T4FW_VERSION_MICRO 0x1B
+#define T4FW_VERSION_MINOR 0x0C
+#define T4FW_VERSION_MICRO 0x19
 #define T4FW_VERSION_BUILD 0x00
 
 #define T5FW_VERSION_MAJOR 0x01
-#define T5FW_VERSION_MINOR 0x0B
-#define T5FW_VERSION_MICRO 0x1B
+#define T5FW_VERSION_MINOR 0x0C
+#define T5FW_VERSION_MICRO 0x19
 #define T5FW_VERSION_BUILD 0x00
 
 #define CH_WARN(adap, fmt, ...) dev_warn(adap->pdev_dev, fmt, ## __VA_ARGS__)
-- 
1.7.1

^ permalink raw reply related

* RE: [PATCH net] gso: do GSO for local skb with size bigger than MTU
From: Du, Fan @ 2014-12-03  6:53 UTC (permalink / raw)
  To: David Miller; +Cc: netdev@vger.kernel.org, fw@strlen.de, Du, Fan
In-Reply-To: <20141202.211449.2074426768363313524.davem@davemloft.net>



>-----Original Message-----
>From: David Miller [mailto:davem@davemloft.net]
>Sent: Wednesday, December 3, 2014 1:15 PM
>To: Du, Fan
>Cc: netdev@vger.kernel.org; fw@strlen.de
>Subject: Re: [PATCH net] gso: do GSO for local skb with size bigger than MTU
>
>From: "Du, Fan" <fan.du@intel.com>
>Date: Wed, 3 Dec 2014 04:50:21 +0000
>
>> Do you have any better idea to achieve what you said besides this
>> patch approach without both fragmentation and ICMP message at the same
>> time to cater for all kinds tunnel tech?
>
>I am not obligated to figure out for you how to design a correctly implemented
>patch.
>
>But I am obligated to keep a bad change from going into the tree and that is what I
>am doing.

"bad" is not depending whether you say it or not, but what the real world needs and what
proper solution could be provided at the time being.

^ permalink raw reply

* RE: [PATCH net-next] r8152: reduce memory copy for rx
From: Hayes Wang @ 2014-12-03  7:05 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev@vger.kernel.org, nic_swsd, linux-kernel@vger.kernel.org,
	linux-usb@vger.kernel.org
In-Reply-To: <1417586858.5303.118.camel@edumazet-glaptop2.roam.corp.google.com>

Eric Dumazet [mailto:eric.dumazet@gmail.com] 
> Sent: Wednesday, December 03, 2014 2:08 PM
[...]
> Better performance for what workload exactly ?

I tested it using Chariot with 4 Tx and 4 Rx.
It shows about a 4% improvement.

> cloning in rx path has many drawbacks, with skb->truesize 
> being usually wrong.

Excuse me. I find that skb_clone() copies the
truesize from the original skb. Do you mean the value
may not be suitable for the cloned skb?

Could another method do the same thing? Or do you
think keeping the original one is better?
 
Best Regards,
Hayes

^ permalink raw reply

* Re: [PATCH net-next] r8152: reduce memory copy for rx
From: Eric Dumazet @ 2014-12-03  7:15 UTC (permalink / raw)
  To: Hayes Wang
  Cc: netdev@vger.kernel.org, nic_swsd, linux-kernel@vger.kernel.org,
	linux-usb@vger.kernel.org
In-Reply-To: <0835B3720019904CB8F7AA43166CEEB2ED360A@RTITMBSV03.realtek.com.tw>

On Wed, 2014-12-03 at 07:05 +0000, Hayes Wang wrote:
> Eric Dumazet [mailto:eric.dumazet@gmail.com] 
> > Sent: Wednesday, December 03, 2014 2:08 PM
> [...]
> > Better performance for what workload exactly ?
> 
> I test it by using Chariot with 4 Tx and 4 Rx.
> It has about 4% improvement.
> 

Have you tried using more concurrent RX flows, in a possibly lossy
environment (so that TCP is forced to queue packets in out of order
queue) ?

> > cloning in rx path has many drawbacks, with skb->truesize 
> > being usually wrong.
> 
> Excuse me. I find the skb_clone() would copy the
> truesize from original skb. Do you mean the value
> may not be suitable for the cloned skb?

With cloning, (skb->len / skb->truesize) will eventually be very very
small, forcing TCP stack to perform collapses (copies !!!) under
pressure.

> 
> Could other method do the same thing? Or, do you
> think keeping the original one is better?


skb cloning prevents GRO and TCP coalescing from working.

netfilter might also be forced to copy whole frame in case a mangle is
needed (eg with NAT ...)

I would rather try to implement GRO, and/or using fragments instead of
pure linear skbs.

(skb->head would be around 128 or 256 bytes, and you attach to skb the
frame as a page fragment)
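
The len/truesize concern above can be illustrated with a little
arithmetic. The sketch below is a userspace model, not kernel code: the
function name and the buffer sizes used in the note after it are
hypothetical, and it assumes that every clone is charged the truesize of
the whole shared rx buffer, as skb_clone() copies truesize from the
original skb.

```c
#include <assert.h>

/* Percentage of the memory charged to an skb (truesize) that is payload. */
static unsigned int payload_percent(unsigned int len, unsigned int truesize)
{
	/* truesize covers the payload plus overhead, so it is never smaller */
	assert(truesize > 0 && truesize >= len);
	return (unsigned int)((100ULL * len) / truesize);
}
```

For a 1500 byte frame copied into its own 2048 byte buffer the ratio is
about 73%. The same frame cloned out of a 16384 byte aggregation buffer is
about 9%, and a 64 byte frame from that buffer rounds down to 0%. Low
ratios like these are what push the TCP stack towards collapsing under
memory pressure.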

^ permalink raw reply

* [PATCH v3 net-next 0/2] Increase the limit of tuntap queues
From: Pankaj Gupta @ 2014-12-03  7:19 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: davem, jasowang, mst, dgibson, vfalico, edumazet, vyasevic, hkchu,
	wuzhy, xemul, therbert, bhutchings, xii, stephen, jiri,
	sergei.shtylyov, Pankaj Gupta

Networking under KVM works best if we allocate a per-vCPU rx and tx
queue in a virtual NIC. This requires a per-vCPU queue on the host side.
Modern physical NICs have multiqueue support for a large number of queues.
To scale a vNIC to run as many queues in parallel as the maximum number of
vCPUs, we need to increase the number of queues supported in tuntap.

Changes from v2:
PATCH 3: David Miller       - flex array adds an extra level of indirection
                              for a preallocated array (dropped, as the flow
                              array is allocated using kzalloc with fallback
                              to vzalloc).

Changes from v1:
PATCH 2: David Miller       - sysctl changes to limit the number of queues
                              are not required for unprivileged users
                              (dropped).

Changes from RFC:
PATCH 1: Sergei Shtylyov    - Add an empty line after declarations.
PATCH 2: Jiri Pirko         - Do not introduce new module parameters.
         Michael S. Tsirkin - We can use sysctl for limiting the max number
                              of queues.

This series increases the number of tuntap queues. The original work is by
'jasowang@redhat.com'. I am taking the patch series at
'https://lkml.org/lkml/2013/6/19/29' as a reference. As per the discussion
in that patch series:

There were two reasons which prevented us from increasing the number of tun
queues:

- The netdev_queue array in netdevice was allocated through kmalloc, which
  may cause a high order memory allocation when we have several queues.
  E.g. sizeof(netdev_queue) is 320, which means a high order allocation
  happens when the device has more than 16 queues.

- We store the hash buckets in tun_struct, which results in a very large
  tun_struct; this high order memory allocation fails easily when memory is
  fragmented.
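
The allocation-order argument above can be checked with a short
calculation. The sketch below is a userspace model with hypothetical
names; it assumes 4 KiB pages and takes the element size as a parameter,
so the 320 byte figure is only the value quoted in the text.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Smallest order such that (MODEL_PAGE_SIZE << order) covers size bytes. */
static unsigned int model_alloc_order(size_t size)
{
	unsigned int order = 0;
	size_t span = MODEL_PAGE_SIZE;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Order needed for an array of count queue structures of elem_size bytes. */
static unsigned int queue_array_order(unsigned int count, size_t elem_size)
{
	assert(count >= 1 && elem_size > 0);
	return model_alloc_order((size_t)count * elem_size);
}
```

With these assumed numbers, 12 queues of 320 bytes still fit in one page,
13 or more need an order-1 allocation, and 256 queues need order 5
(128 KiB of contiguous memory). The exact threshold depends on the real
structure size and page size of the build.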

The patch 60877a32bce00041528576e6b8df5abe9251fa73 increases the number of tx
queues. Memory allocation falls back to vzalloc() when kmalloc() fails.

This series tries to address the following issues:

- Increase the number of netdev_queue queues for rx, similarly to what is
  done for tx queues, by falling back to vzalloc() when memory allocation
  with kmalloc() fails.

- Increase the number of queues to 256; the maximum number is equal to the
  maximum number of vCPUs allowed in a guest.

I have done some regression testing with a sample program which creates
tun/tap for a single queue / multiqueue device. I have also done testing
with multiple parallel Netperf sessions from guest to host for different
combinations of queues and CPUs. It seems to be working fine without much
increase in CPU load with the increase in the number of queues, though I was
limited to 4 physical CPUs.


For this test vhost threads are pinned to separate CPUs. Below are the results:
Host kernel: 3.18.rc4, Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz, 4 CPUS
NIC : Ethernet controller: Intel Corporation 82579LM Gigabit Network


Patch Applied  %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle  throughput
Single Queue
Before :all    7.94    0.01    1.79    3.00    0.26    0.15    0.00    3.21    0.00   83.64  64924.94
After  :all    2.15    0.00    0.82    2.21    0.08    0.13    0.00    0.83    0.00   93.79  68799.88

2 Queues
Before :all    6.75    0.06    1.91    3.93    0.23    0.21    0.00    3.84    0.00   83.07  69569.30
After  :all    2.12    0.00    0.92    2.51    0.08    0.15    0.00    1.19    0.00   93.02  71386.79

4 Queues
Before :all    6.09    0.05    1.88    3.83    0.22    0.22    0.00    3.74    0.00   83.98  76170.60
After  :all    2.12    0.00    1.01    2.72    0.09    0.16    0.00    1.47    0.00   92.43  75492.34

8 Queues
Before :all    5.80    0.05    1.91    3.97    0.21    0.23    0.00    3.88    0.00   83.96  70843.88
After  :all    2.06    0.00    1.06    2.77    0.09    0.17    0.00    1.66    0.00   92.19  74486.31

16 Queues
After  :all    2.04    0.00    1.13    2.90    0.10    0.18    0.00    2.02    0.00   91.63  73227.45

Patches Summary:
  net: allow large number of rx queues
  tuntap: Increase the number of queues in tun

 drivers/net/tun.c |    9 +++++----
 net/core/dev.c    |   19 +++++++++++++------
 2 files changed, 18 insertions(+), 10 deletions(-)

^ permalink raw reply

* [PATCH v3 net-next 1/2] net: allow large number of rx queues
From: Pankaj Gupta @ 2014-12-03  7:19 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: davem, jasowang, mst, dgibson, vfalico, edumazet, vyasevic, hkchu,
	wuzhy, xemul, therbert, bhutchings, xii, stephen, jiri,
	sergei.shtylyov, Pankaj Gupta
In-Reply-To: <1417591177-7985-1-git-send-email-pagupta@redhat.com>

netif_alloc_rx_queues() uses kcalloc() to allocate memory
for the "struct netdev_rx_queue *_rx" array.
If we are doing a large rx queue allocation, kcalloc() might
fail, so this patch falls back to vzalloc().
A similar implementation is done for tx queue allocation in
netif_alloc_netdev_queues().

We avoid failure of high order memory allocation
with the help of vzalloc(). This allows us to do large
rx and tx queue allocations, which in turn helps us to
increase the number of queues in tun.

As vmalloc() adds overhead on a critical network path,
the __GFP_REPEAT flag is used with kzalloc() to do this fallback
only when really needed.

Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Gibson <dgibson@redhat.com>
---
 net/core/dev.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index e916ba8..abe9560 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6059,17 +6059,25 @@ void netif_stacked_transfer_operstate(const struct net_device *rootdev,
 EXPORT_SYMBOL(netif_stacked_transfer_operstate);
 
 #ifdef CONFIG_SYSFS
+static void netif_free_rx_queues(struct net_device *dev)
+{
+	kvfree(dev->_rx);
+}
+
 static int netif_alloc_rx_queues(struct net_device *dev)
 {
 	unsigned int i, count = dev->num_rx_queues;
 	struct netdev_rx_queue *rx;
+	size_t sz = count * sizeof(*rx);
 
 	BUG_ON(count < 1);
 
-	rx = kcalloc(count, sizeof(struct netdev_rx_queue), GFP_KERNEL);
-	if (!rx)
-		return -ENOMEM;
-
+	rx = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
+	if (!rx) {
+		rx = vzalloc(sz);
+		if (!rx)
+			return -ENOMEM;
+	}
 	dev->_rx = rx;
 
 	for (i = 0; i < count; i++)
@@ -6698,9 +6706,8 @@ void free_netdev(struct net_device *dev)
 
 	netif_free_tx_queues(dev);
 #ifdef CONFIG_SYSFS
-	kfree(dev->_rx);
+	netif_free_rx_queues(dev);
 #endif
-
 	kfree(rcu_dereference_protected(dev->ingress_queue, 1));
 
 	/* Flush device addresses */
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH v3 net-next 2/2] tuntap: Increase the number of queues in tun.
From: Pankaj Gupta @ 2014-12-03  7:19 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: davem, jasowang, mst, dgibson, vfalico, edumazet, vyasevic, hkchu,
	wuzhy, xemul, therbert, bhutchings, xii, stephen, jiri,
	sergei.shtylyov, Pankaj Gupta
In-Reply-To: <1417591177-7985-1-git-send-email-pagupta@redhat.com>

Networking under kvm works best if we allocate a per-vCPU RX and TX
queue in a virtual NIC. This requires a per-vCPU queue on the host side.

It is now safe to increase the maximum number of queues.
The preceding patch, 'net: allow large number of rx queues',
made sure this won't cause failures due to high order memory
allocations. Increase it to 256: this is the max number of vCPUs
KVM supports.

Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Reviewed-by: David Gibson <dgibson@redhat.com>
---
 drivers/net/tun.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index e3fa65a..a19dc5f8 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -113,10 +113,11 @@ struct tap_filter {
 	unsigned char	addr[FLT_EXACT_COUNT][ETH_ALEN];
 };
 
-/* DEFAULT_MAX_NUM_RSS_QUEUES were chosen to let the rx/tx queues allocated for
- * the netdevice to be fit in one page. So we can make sure the success of
- * memory allocation. TODO: increase the limit. */
-#define MAX_TAP_QUEUES DEFAULT_MAX_NUM_RSS_QUEUES
+/* MAX_TAP_QUEUES 256 is chosen to allow rx/tx queues to be equal
+ * to max number of vCPUS in guest. Also, we are making sure here
+ * queue memory allocation do not fail.
+ */
+#define MAX_TAP_QUEUES 256
 #define MAX_TAP_FLOWS  4096
 
 #define TUN_FLOW_EXPIRE (3 * HZ)
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next v3] rtnetlink: delay RTM_DELLINK notification until after ndo_uninit()
From: Mahesh Bandewar @ 2014-12-03  7:24 UTC (permalink / raw)
  To: netdev
  Cc: David Miller, Eric Dumazet, Roopa Prabhu, Toshiaki Makita,
	Mahesh Bandewar

Commit 56bfa7ee7c ("unregister_netdevice : move RTM_DELLINK to
until after ndo_uninit") tried to do this earlier, but in doing so
it created a problem. Unfortunately, the delayed rtmsg_ifinfo() also
delayed the call to fill_info(). This translated into asking the driver
to remove its private state and then querying that private state, which
could have catastrophic consequences.

This change breaks rtmsg_ifinfo() into two parts: the first takes a
precise snapshot of the device by calling fill_info() before
ndo_uninit() is called, and the second sends the notification using
the collected snapshot.
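
The ordering described above can be sketched in plain user-space C. This is
an illustration of the pattern only, not the kernel code: struct fake_dev,
struct fake_msg and all fake_*() functions are hypothetical stand-ins for
the net_device, the skb, fill_info(), ndo_uninit() and the notification
send.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a device with driver-private state. */
struct fake_dev {
	char name[16];
	int *priv_port;		/* released by fake_uninit() */
};

/* Hypothetical stand-in for the notification message. */
struct fake_msg {
	char text[64];
};

/* Step 1: build the message while the private state is still valid. */
static struct fake_msg *fake_build_msg(const struct fake_dev *dev)
{
	struct fake_msg *msg;

	assert(dev->priv_port != NULL);
	msg = malloc(sizeof(*msg));
	if (!msg)
		return NULL;
	snprintf(msg->text, sizeof(msg->text), "DELLINK %s port=%d",
		 dev->name, *dev->priv_port);
	return msg;
}

/* Teardown: releases private state, as a driver's uninit hook might. */
static void fake_uninit(struct fake_dev *dev)
{
	free(dev->priv_port);
	dev->priv_port = NULL;
}

/* Step 2: deliver the already-built message; reads no private state. */
static void fake_send_msg(struct fake_msg *msg, char *out, size_t outlen)
{
	snprintf(out, outlen, "%s", msg->text);
	free(msg);
}

/* Unregister path: snapshot first, tear down, then notify. */
static void fake_unregister(struct fake_dev *dev, char *out, size_t outlen)
{
	struct fake_msg *msg = fake_build_msg(dev);

	fake_uninit(dev);
	if (msg)
		fake_send_msg(msg, out, outlen);
}
```

The point of the split is that the send step never dereferences
driver-private data, so it is safe to run after teardown. Building the
message after fake_uninit() would instead read freed state.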

The problem was noticed when the last link is deleted from an ipvlan
device: the device has already freed the port, and the subsequent
.fill_info() call tries to get information from that port.

kernel: [  255.139429] ------------[ cut here ]------------
kernel: [  255.139439] WARNING: CPU: 12 PID: 11173 at net/core/rtnetlink.c:2238 rtmsg_ifinfo+0x100/0x110()
kernel: [  255.139493] Modules linked in: ipvlan bonding w1_therm ds2482 wire cdc_acm ehci_pci ehci_hcd i2c_dev i2c_i801 i2c_core msr cpuid bnx2x ptp pps_core mdio libcrc32c
kernel: [  255.139513] CPU: 12 PID: 11173 Comm: ip Not tainted 3.18.0-smp-DEV #167
kernel: [  255.139514] Hardware name: Intel RML,PCH/Ibis_QC_18, BIOS 1.0.10 05/15/2012
kernel: [  255.139515]  0000000000000009 ffff880851b6b828 ffffffff815d87f4 00000000000000e0
kernel: [  255.139516]  0000000000000000 ffff880851b6b868 ffffffff8109c29c 0000000000000000
kernel: [  255.139518]  00000000ffffffa6 00000000000000d0 ffffffff81aaf580 0000000000000011
kernel: [  255.139520] Call Trace:
kernel: [  255.139527]  [<ffffffff815d87f4>] dump_stack+0x46/0x58
kernel: [  255.139531]  [<ffffffff8109c29c>] warn_slowpath_common+0x8c/0xc0
kernel: [  255.139540]  [<ffffffff8109c2ea>] warn_slowpath_null+0x1a/0x20
kernel: [  255.139544]  [<ffffffff8150d570>] rtmsg_ifinfo+0x100/0x110
kernel: [  255.139547]  [<ffffffff814f78b5>] rollback_registered_many+0x1d5/0x2d0
kernel: [  255.139549]  [<ffffffff814f79cf>] unregister_netdevice_many+0x1f/0xb0
kernel: [  255.139551]  [<ffffffff8150acab>] rtnl_dellink+0xbb/0x110
kernel: [  255.139553]  [<ffffffff8150da90>] rtnetlink_rcv_msg+0xa0/0x240
kernel: [  255.139557]  [<ffffffff81329283>] ? rhashtable_lookup_compare+0x43/0x80
kernel: [  255.139558]  [<ffffffff8150d9f0>] ? __rtnl_unlock+0x20/0x20
kernel: [  255.139562]  [<ffffffff8152cb11>] netlink_rcv_skb+0xb1/0xc0
kernel: [  255.139563]  [<ffffffff8150a495>] rtnetlink_rcv+0x25/0x40
kernel: [  255.139565]  [<ffffffff8152c398>] netlink_unicast+0x178/0x230
kernel: [  255.139567]  [<ffffffff8152c75f>] netlink_sendmsg+0x30f/0x420
kernel: [  255.139571]  [<ffffffff814e0b0c>] sock_sendmsg+0x9c/0xd0
kernel: [  255.139575]  [<ffffffff811d1d7f>] ? rw_copy_check_uvector+0x6f/0x130
kernel: [  255.139577]  [<ffffffff814e11c9>] ? copy_msghdr_from_user+0x139/0x1b0
kernel: [  255.139578]  [<ffffffff814e1774>] ___sys_sendmsg+0x304/0x310
kernel: [  255.139581]  [<ffffffff81198723>] ? handle_mm_fault+0xca3/0xde0
kernel: [  255.139585]  [<ffffffff811ebc4c>] ? destroy_inode+0x3c/0x70
kernel: [  255.139589]  [<ffffffff8108e6ec>] ? __do_page_fault+0x20c/0x500
kernel: [  255.139597]  [<ffffffff811e8336>] ? dput+0xb6/0x190
kernel: [  255.139606]  [<ffffffff811f05f6>] ? mntput+0x26/0x40
kernel: [  255.139611]  [<ffffffff811d2b94>] ? __fput+0x174/0x1e0
kernel: [  255.139613]  [<ffffffff814e2129>] __sys_sendmsg+0x49/0x90
kernel: [  255.139615]  [<ffffffff814e2182>] SyS_sendmsg+0x12/0x20
kernel: [  255.139617]  [<ffffffff815df092>] system_call_fastpath+0x12/0x17
kernel: [  255.139619] ---[ end trace 5e6703e87d984f6b ]---

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Reported-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Cc: David S. Miller <davem@davemloft.net>
---
v1:
	Initial version
v2:
	Keep the rtmsg_ifinfo() return type as it is, but break the function
	into two, minimizing the changes needed elsewhere
v3:
	Corrected an error in the code.
---
 include/linux/rtnetlink.h |  5 +++++
 net/core/dev.c            | 12 +++++++++---
 net/core/rtnetlink.c      | 27 +++++++++++++++++++++++----
 3 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 6cacbce1a06c..19dc0bce9c2b 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -17,6 +17,11 @@ extern int rtnl_put_cacheinfo(struct sk_buff *skb, struct dst_entry *dst,
 			      u32 id, long expires, u32 error);
 
 void rtmsg_ifinfo(int type, struct net_device *dev, unsigned change, gfp_t flags);
+struct sk_buff *rtmsg_ifinfo_build_skb(int type, struct net_device *dev,
+				       unsigned change, gfp_t flags);
+void rtmsg_ifinfo_send(struct sk_buff *skb, struct net_device *dev,
+		       gfp_t flags);
+
 
 /* RTNL is used as a global lock for all changes to network configuration  */
 extern void rtnl_lock(void);
diff --git a/net/core/dev.c b/net/core/dev.c
index ac4836241a96..98f6563b68b6 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5925,6 +5925,8 @@ static void rollback_registered_many(struct list_head *head)
 	synchronize_net();
 
 	list_for_each_entry(dev, head, unreg_list) {
+		struct sk_buff *skb = NULL;
+
 		/* Shutdown queueing discipline. */
 		dev_shutdown(dev);
 
@@ -5934,6 +5936,11 @@ static void rollback_registered_many(struct list_head *head)
 		*/
 		call_netdevice_notifiers(NETDEV_UNREGISTER, dev);
 
+		if (!dev->rtnl_link_ops ||
+		    dev->rtnl_link_state == RTNL_LINK_INITIALIZED)
+			skb = rtmsg_ifinfo_build_skb(RTM_DELLINK, dev, ~0U,
+						     GFP_KERNEL);
+
 		/*
 		 *	Flush the unicast and multicast chains
 		 */
@@ -5943,9 +5950,8 @@ static void rollback_registered_many(struct list_head *head)
 		if (dev->netdev_ops->ndo_uninit)
 			dev->netdev_ops->ndo_uninit(dev);
 
-		if (!dev->rtnl_link_ops ||
-		    dev->rtnl_link_state == RTNL_LINK_INITIALIZED)
-			rtmsg_ifinfo(RTM_DELLINK, dev, ~0U, GFP_KERNEL);
+		if (skb)
+			rtmsg_ifinfo_send(skb, dev, GFP_KERNEL);
 
 		/* Notifier chain MUST detach us all upper devices. */
 		WARN_ON(netdev_has_any_upper_dev(dev));
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index b9b7dfaf202b..88787c10b8a6 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2220,8 +2220,8 @@ static int rtnl_dump_all(struct sk_buff *skb, struct netlink_callback *cb)
 	return skb->len;
 }
 
-void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change,
-		  gfp_t flags)
+struct sk_buff *rtmsg_ifinfo_build_skb(int type, struct net_device *dev,
+				       unsigned int change, gfp_t flags)
 {
 	struct net *net = dev_net(dev);
 	struct sk_buff *skb;
@@ -2239,11 +2239,30 @@ void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change,
 		kfree_skb(skb);
 		goto errout;
 	}
-	rtnl_notify(skb, net, 0, RTNLGRP_LINK, NULL, flags);
-	return;
+	return skb;
 errout:
 	if (err < 0)
 		rtnl_set_sk_err(net, RTNLGRP_LINK, err);
+	return NULL;
+}
+EXPORT_SYMBOL(rtmsg_ifinfo_build_skb);
+
+void rtmsg_ifinfo_send(struct sk_buff *skb, struct net_device *dev, gfp_t flags)
+{
+	struct net *net = dev_net(dev);
+
+	rtnl_notify(skb, net, 0, RTNLGRP_LINK, NULL, flags);
+}
+EXPORT_SYMBOL(rtmsg_ifinfo_send);
+
+void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change,
+		  gfp_t flags)
+{
+	struct sk_buff *skb;
+
+	skb = rtmsg_ifinfo_build_skb(type, dev, change, flags);
+	if (skb)
+		rtmsg_ifinfo_send(skb, dev, flags);
 }
 EXPORT_SYMBOL(rtmsg_ifinfo);
 
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related

* Re: [PATCH] net: less interrupt masking in NAPI
From: Yang Yingliang @ 2014-12-03  7:31 UTC (permalink / raw)
  To: David Miller, eric.dumazet; +Cc: netdev, willemb
In-Reply-To: <20141103.122538.387451917276174830.davem@davemloft.net>

On 2014/11/4 1:25, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Sun, 02 Nov 2014 06:19:33 -0800
> 
>> From: Eric Dumazet <edumazet@google.com>
>>
>> net_rx_action() can mask irqs a single time to transfer sd->poll_list
>> into a private list, for a very short duration.
>>
>> Then, napi_complete() can avoid masking irqs again,
>> and net_rx_action() only needs to mask irq again in slow path.
>>
>> This patch removes two pairs of irq mask/unmask per typical NAPI run,
>> more if multiple napi were triggered.
>>
>> Note this also allows giving control back to the caller (do_softirq())
>> more often, so that other softirq handlers can be called a bit earlier,
>> or ksoftirqd can be woken up earlier under pressure.
>>
>> This was developed while testing an alternative to RX interrupt
>> mitigation to reduce latencies while keeping or improving GRO
>> aggregation on fast NIC.
>>
>> Idea is to test napi->gro_list at the end of a napi->poll() and
>> reschedule one NAPI poll, but after servicing a full round of
>> softirqs (timers, TX, rcu, ...). This will be allowed only if softirq
>> is currently serviced by idle task or ksoftirqd, and resched not needed.
>>
>> Signed-off-by: Eric Dumazet <edumazet@google.com>
> 
> Also applied, thanks Eric.
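
For readers following the quoted description, the "mask once, move the
whole list to a private list" idea can be sketched in user-space C. This is
a simplified illustration under stated assumptions, not the net_rx_action()
implementation: struct fake_softnet, struct poll_item, take_all() and
run_polls() are hypothetical names, and a counter stands in for the irq
mask/unmask pair.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pending poll entry. */
struct poll_item {
	struct poll_item *next;
	int work;
};

/* Hypothetical per-CPU state: a shared pending list plus a counter
 * standing in for the number of "irqs masked" sections entered. */
struct fake_softnet {
	struct poll_item *pending;
	unsigned int mask_sections;
};

/* Detach the entire shared list in one short critical section. */
static struct poll_item *take_all(struct fake_softnet *sd)
{
	struct poll_item *list;

	sd->mask_sections++;	/* stand-in for masking irqs */
	list = sd->pending;
	sd->pending = NULL;
	/* stand-in for unmasking irqs */
	return list;
}

/* Walk the private list without re-entering the critical section. */
static int run_polls(struct fake_softnet *sd)
{
	struct poll_item *item = take_all(sd);
	int done = 0;

	for (; item; item = item->next)
		done += item->work;
	return done;
}
```

In this sketch the critical section is entered once regardless of how many
entries are pending, which mirrors the saving described in the quoted
changelog.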

This patch resolves my performance problem.
Can it be queued for stable?

Regards,
Yang

^ permalink raw reply

