Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH V3 4/6] vhost_net: determine whether or not to use zerocopy at one time
From: Jason Wang @ 2013-09-29  9:36 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <5243B859.3070302@redhat.com>

On 09/26/2013 12:30 PM, Jason Wang wrote:
> On 09/23/2013 03:16 PM, Michael S. Tsirkin wrote:
>> > On Thu, Sep 05, 2013 at 10:54:44AM +0800, Jason Wang wrote:
>>>> >> > On 09/04/2013 07:59 PM, Michael S. Tsirkin wrote:
>>>>>> >>> > > On Mon, Sep 02, 2013 at 04:40:59PM +0800, Jason Wang wrote:
>>>>>>>> >>>> > >> Currently, even if the packet length is smaller than VHOST_GOODCOPY_LEN, if
>>>>>>>> >>>> > >> upend_idx != done_idx we still set zcopy_used to true and rollback this choice
>>>>>>>> >>>> > >> later. This could be avoided by determining zerocopy once by checking all
>>>>>>>> >>>> > >> conditions at one time before.
>>>>>>>> >>>> > >>
>>>>>>>> >>>> > >> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>>>> >>>> > >> ---
>>>>>>>> >>>> > >>  drivers/vhost/net.c |   47 ++++++++++++++++++++---------------------------
>>>>>>>> >>>> > >>  1 files changed, 20 insertions(+), 27 deletions(-)
>>>>>>>> >>>> > >>
>>>>>>>> >>>> > >> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>>>>>>>> >>>> > >> index 8a6dd0d..3f89dea 100644
>>>>>>>> >>>> > >> --- a/drivers/vhost/net.c
>>>>>>>> >>>> > >> +++ b/drivers/vhost/net.c
>>>>>>>> >>>> > >> @@ -404,43 +404,36 @@ static void handle_tx(struct vhost_net *net)
>>>>>>>> >>>> > >>  			       iov_length(nvq->hdr, s), hdr_size);
>>>>>>>> >>>> > >>  			break;
>>>>>>>> >>>> > >>  		}
>>>>>>>> >>>> > >> -		zcopy_used = zcopy && (len >= VHOST_GOODCOPY_LEN ||
>>>>>>>> >>>> > >> -				       nvq->upend_idx != nvq->done_idx);
>>>>>>>> >>>> > >> +
>>>>>>>> >>>> > >> +		zcopy_used = zcopy && len >= VHOST_GOODCOPY_LEN
>>>>>>>> >>>> > >> +				   && (nvq->upend_idx + 1) % UIO_MAXIOV !=
>>>>>>>> >>>> > >> +				      nvq->done_idx
>>>>>> >>> > > Thinking about this, this looks strange.
>>>>>> >>> > > The original idea was that once we start doing zcopy, we keep
>>>>>> >>> > > using the heads ring even for short packets until no zcopy is outstanding.
>>>> >> > 
>>>> >> > What's the reason for keep using the heads ring?
>> > To keep completions in order.
> Ok, I will do some test to see the impact.

Since the our of order completion will happen when switching between
zero copy and non zero copy. I test this by using two sessions of
netperf in burst mode, one with 1 byte TCP_RR another with 512 bytes of
TCP_RR. There's no difference with the patch applied.

^ permalink raw reply

* RE: [PATCH net-next] net: skb_is_gso_v6() requires skb_is_gso()
From: Dmitry Kravkov @ 2013-09-29  9:03 UTC (permalink / raw)
  To: Eric Dumazet, David Miller; +Cc: netdev, Eilon Greenstein
In-Reply-To: <1380442892.3596.22.camel@edumazet-glaptop.roam.corp.google.com>

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of Eric Dumazet
> Sent: Sunday, September 29, 2013 11:22 AM
> To: David Miller
> Cc: netdev; Eilon Greenstein
> Subject: [PATCH net-next] net: skb_is_gso_v6() requires skb_is_gso()
> 
> From: Eric Dumazet <edumazet@google.com>
> 
> bnx2x makes a dangerous use of skb_is_gso_v6().
> 
> It should first make sure skb is a gso packet
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Eilon Greenstein <eilong@broadcom.com>

Thanks, Eric!

Acked-by: Dmitry Kravkov <dmitry@broadcom.com>

^ permalink raw reply

* RE: ADMINISTRATOR HELP DESK
From: Greene, Tanya @ 2013-09-29  8:26 UTC (permalink / raw)
  To: Greene, Tanya
In-Reply-To: <0FF456CBAA9B324987CEA675439D7EF801151DF4AE@EXCH10-MB3.paterson.k12.nj.us>


Your Mailbox Has Exceeded It Storage Limit As Set By Your Administrator from the server, And You Will Not Be  Able To Receive New Mails until You Re-Validate It click Re-validate<http://postmaster-supportit.coffeecup.com/forms/HELP%20DESK/> .Thank you courtesy © 2013 by Intellectual Reserve, Inc. All rights reserved

^ permalink raw reply

* [PATCH net-next] net: skb_is_gso_v6() requires skb_is_gso()
From: Eric Dumazet @ 2013-09-29  8:21 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Eilon Greenstein

From: Eric Dumazet <edumazet@google.com>

bnx2x makes a dangerous use of skb_is_gso_v6().

It should first make sure skb is a gso packet

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |   18 +++++++-------
 include/linux/skbuff.h                          |    1 
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 61726af..0c7fb1e 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -3256,14 +3256,16 @@ static u32 bnx2x_xmit_type(struct bnx2x *bp, struct sk_buff *skb)
 	if (prot == IPPROTO_TCP)
 		rc |= XMIT_CSUM_TCP;
 
-	if (skb_is_gso_v6(skb)) {
-		rc |= (XMIT_GSO_V6 | XMIT_CSUM_TCP);
-		if (rc & XMIT_CSUM_ENC)
-			rc |= XMIT_GSO_ENC_V6;
-	} else if (skb_is_gso(skb)) {
-		rc |= (XMIT_GSO_V4 | XMIT_CSUM_TCP);
-		if (rc & XMIT_CSUM_ENC)
-			rc |= XMIT_GSO_ENC_V4;
+	if (skb_is_gso(skb)) {
+		if (skb_is_gso_v6(skb)) {
+			rc |= (XMIT_GSO_V6 | XMIT_CSUM_TCP);
+			if (rc & XMIT_CSUM_ENC)
+				rc |= XMIT_GSO_ENC_V6;
+		} else {
+			rc |= (XMIT_GSO_V4 | XMIT_CSUM_TCP);
+			if (rc & XMIT_CSUM_ENC)
+				rc |= XMIT_GSO_ENC_V4;
+		}
 	}
 
 	return rc;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2ddb48d..d48fde30 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2783,6 +2783,7 @@ static inline bool skb_is_gso(const struct sk_buff *skb)
 	return skb_shinfo(skb)->gso_size;
 }
 
+/* Note: Should be called only if skb_is_gso(skb) is true */
 static inline bool skb_is_gso_v6(const struct sk_buff *skb)
 {
 	return skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6;

^ permalink raw reply related

* [PATCH net-next] net: add missing sk_max_pacing_rate doc
From: Eric Dumazet @ 2013-09-29  8:12 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20130928.153610.2280100769720851440.davem@davemloft.net>

From: Eric Dumazet <edumazet@google.com>

Warning(include/net/sock.h:411): No description found for parameter
'sk_max_pacing_rate'

Lets please "make htmldocs" and kbuild bot.

Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/sock.h |    1 +
 1 file changed, 1 insertion(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index 240aa3f..cf91c8e 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -233,6 +233,7 @@ struct cg_proto;
   *	@sk_ll_usec: usecs to busypoll when there is no data
   *	@sk_allocation: allocation mode
   *	@sk_pacing_rate: Pacing rate (if supported by transport/packet scheduler)
+  *	@sk_max_pacing_rate: Maximum pacing rate (%SO_MAX_PACING_RATE)
   *	@sk_sndbuf: size of send buffer in bytes
   *	@sk_flags: %SO_LINGER (l_onoff), %SO_BROADCAST, %SO_KEEPALIVE,
   *		   %SO_OOBINLINE settings, %SO_TIMESTAMPING settings

^ permalink raw reply related

* Re: [PATCH] ipv6: gre: correct calculation of max_headroom
From: Eric Dumazet @ 2013-09-29  8:05 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, xeb
In-Reply-To: <20130929034050.GB8602@order.stressinduktion.org>

On Sun, 2013-09-29 at 05:40 +0200, Hannes Frederic Sowa wrote:
> gre_hlen already accounts for sizeof(struct ipv6_hdr) + gre header,
> so initialize max_headroom to zero. Otherwise the
> 
> 	if (encap_limit >= 0) {
> 		max_headroom += 8;
> 		mtu -= 8;
> 	}
> 
> increments an uninitialized variable before max_headroom was reset.
> 
> Found with coverity: 728539
> 
> Cc: Dmitry Kozlov <xeb@mail.ru>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* Re: [PATCH] ipv6: sit: ensure prefixlen <= 128 before calling ipv6_addr_prefix
From: Hannes Frederic Sowa @ 2013-09-29  3:53 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20130929031535.GA8602@order.stressinduktion.org>

On Sun, Sep 29, 2013 at 05:15:35AM +0200, Hannes Frederic Sowa wrote:
> Found with coverity: CID 139492
> 
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Disregard, I was fooled by the report. This is not necessary.

^ permalink raw reply

* [PATCH] ipv6: gre: correct calculation of max_headroom
From: Hannes Frederic Sowa @ 2013-09-29  3:40 UTC (permalink / raw)
  To: netdev; +Cc: xeb

gre_hlen already accounts for sizeof(struct ipv6_hdr) + gre header,
so initialize max_headroom to zero. Otherwise the

	if (encap_limit >= 0) {
		max_headroom += 8;
		mtu -= 8;
	}

increments an uninitialized variable before max_headroom was reset.

Found with coverity: 728539

Cc: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv6/ip6_gre.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 6b26e9f..7bb5446 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -618,7 +618,7 @@ static netdev_tx_t ip6gre_xmit2(struct sk_buff *skb,
 	struct ip6_tnl *tunnel = netdev_priv(dev);
 	struct net_device *tdev;    /* Device to other host */
 	struct ipv6hdr  *ipv6h;     /* Our new IP header */
-	unsigned int max_headroom;  /* The extra header space needed */
+	unsigned int max_headroom = 0; /* The extra header space needed */
 	int    gre_hlen;
 	struct ipv6_tel_txoption opt;
 	int    mtu;
@@ -693,7 +693,7 @@ static netdev_tx_t ip6gre_xmit2(struct sk_buff *skb,
 
 	skb_scrub_packet(skb, !net_eq(tunnel->net, dev_net(dev)));
 
-	max_headroom = LL_RESERVED_SPACE(tdev) + gre_hlen + dst->header_len;
+	max_headroom += LL_RESERVED_SPACE(tdev) + gre_hlen + dst->header_len;
 
 	if (skb_headroom(skb) < max_headroom || skb_shared(skb) ||
 	    (skb_cloned(skb) && !skb_clone_writable(skb, 0))) {
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH] {iproute2, xfrm}: Use memcpy to suppress gcc phony buffer overflow warning
From: Sohny Thomas @ 2013-09-29  3:21 UTC (permalink / raw)
  To: Fan Du; +Cc: David Laight, stephen, netdev
In-Reply-To: <52478710.702@windriver.com>

On Sunday 29 September 2013 07:19 AM, Fan Du wrote:
>
>
> On 2013年09月28日 21:24, Sohny Thomas wrote:
>> On Friday 27 September 2013 01:56 PM, David Laight wrote:
>>>> ip xfrm state add causes a SIGABRT due to a strncpy_chk .
>>>> This happens since strncpy doesn't account for the '\0' .
>>>> I have fixed this using sizeof instead of strlen .
>>>>
>>>> There is a redhat bug which documents this issue
>>>>
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=982761
>>>>
>>>> Signed-off-by: Sohny Thomas <sohthoma@in.ibm.com>
>>>>
>>>> --------------
>>>>
>>>> diff --git a/ip/xfrm_state.c b/ip/xfrm_state.c
>>>> index 389942c..7dd8799 100644
>>>> --- a/ip/xfrm_state.c
>>>> +++ b/ip/xfrm_state.c
>>>> @@ -117,7 +117,7 @@ static int xfrm_algo_parse(struct xfrm_algo *alg,
>>>> enum xfrm_attr_type_t type,
>>>> char *name, char *key, char *buf, int max)
>>>> {
>>>> int len;
>>>> - int slen = strlen(key);
>>>> + int slen = sizeof(key);
>>>
>>> you definitely don't want sizeof(key) - that is either 4 or 8.
>> oh damn my bad.
>> I think i will go with strlen(key) + 1.
>>
>> or i will pass slen+1 to strncpy .
>
> You must be kidding about this slen+1 or strlen(key) + 1 , right?
strncpy fails with an abort at strncpy_chk.
So I was assuming it was because of the '\0' , so the +1
Thanks for the below patch
>
>
>
>  From 3895b3d88bcb873988089392079645f2c9665e95 Mon Sep 17 00:00:00 2001
> From: Fan Du <fan.du@windriver.com>
> Date: Sun, 29 Sep 2013 09:38:12 +0800
> Subject: [PATCH] {iproute2, xfrm}: Use memcpy to suppress gcc phony buffer
>   overflow warning.
>
> This bug is reported from below link:
> https://bugzilla.redhat.com/show_bug.cgi?id=982761
>
> An simplified command from its original reproducing method in bugzilla:
> ip xfrm state add src 10.0.0.2 dst 10.0.0.1 proto ah spi 0x12345678 auth
> md5 12
> will cause below spew from gcc:
>
> *** buffer overflow detected ***: ./ip terminated
> ======= Backtrace: =========
> /lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7f894ae2b817]
> /lib/x86_64-linux-gnu/libc.so.6(+0x109710)[0x7f894ae2a710]
> /lib/x86_64-linux-gnu/libc.so.6(+0x1089f6)[0x7f894ae299f6]
> ./ip[0x42119d]
> ./ip(do_xfrm_state+0x231)[0x4217f1]
> ./ip[0x405d05]
> ./ip(main+0x2b4)[0x405934]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7f894ad4276d]
> ./ip[0x405bb9]
> ======= Memory map: ========
> 00400000-0043c000 r-xp 00000000 08:01 1997183
> /workbench/iproute2/ip/ip
> 0063b000-0063c000 r--p 0003b000 08:01 1997183
> /workbench/iproute2/ip/ip
> 0063c000-00640000 rw-p 0003c000 08:01 1997183
> /workbench/iproute2/ip/ip
> 00640000-00643000 rw-p 00000000 00:00 0
> 017c7000-017e8000 rw-p 00000000 00:00 0
> [heap]
> 7f894ab0b000-7f894ab20000 r-xp 00000000 08:01 6032986
> /lib/x86_64-linux-gnu/libgcc_s.so.1
> 7f894ab20000-7f894ad1f000 ---p 00015000 08:01 6032986
> /lib/x86_64-linux-gnu/libgcc_s.so.1
> 7f894ad1f000-7f894ad20000 r--p 00014000 08:01 6032986
> /lib/x86_64-linux-gnu/libgcc_s.so.1
> 7f894ad20000-7f894ad21000 rw-p 00015000 08:01 6032986
> /lib/x86_64-linux-gnu/libgcc_s.so.1
> 7f894ad21000-7f894aed6000 r-xp 00000000 08:01 6033273
> /lib/x86_64-linux-gnu/libc-2.15.so
> 7f894aed6000-7f894b0d5000 ---p 001b5000 08:01 6033273
> /lib/x86_64-linux-gnu/libc-2.15.so
> 7f894b0d5000-7f894b0d9000 r--p 001b4000 08:01 6033273
> /lib/x86_64-linux-gnu/libc-2.15.so
> 7f894b0d9000-7f894b0db000 rw-p 001b8000 08:01 6033273
> /lib/x86_64-linux-gnu/libc-2.15.so
> 7f894b0db000-7f894b0e0000 rw-p 00000000 00:00 0
> 7f894b0e0000-7f894b0e2000 r-xp 00000000 08:01 6033272
> /lib/x86_64-linux-gnu/libdl-2.15.so
> 7f894b0e2000-7f894b2e2000 ---p 00002000 08:01 6033272
> /lib/x86_64-linux-gnu/libdl-2.15.so
> 7f894b2e2000-7f894b2e3000 r--p 00002000 08:01 6033272
> /lib/x86_64-linux-gnu/libdl-2.15.so
> 7f894b2e3000-7f894b2e4000 rw-p 00003000 08:01 6033272
> /lib/x86_64-linux-gnu/libdl-2.15.so
> 7f894b2e4000-7f894b306000 r-xp 00000000 08:01 6033287
> /lib/x86_64-linux-gnu/ld-2.15.so
> 7f894b4e3000-7f894b4e6000 rw-p 00000000 00:00 0
> 7f894b503000-7f894b506000 rw-p 00000000 00:00 0
> 7f894b506000-7f894b507000 r--p 00022000 08:01 6033287
> /lib/x86_64-linux-gnu/ld-2.15.so
> 7f894b507000-7f894b509000 rw-p 00023000 08:01 6033287
> /lib/x86_64-linux-gnu/ld-2.15.so
> 7fff9a3c2000-7fff9a3e3000 rw-p 00000000 00:00 0
> [stack]
> 7fff9a3ff000-7fff9a400000 r-xp 00000000 00:00 0
> [vdso]
> ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0
> [vsyscall]
>
> This is a false positive warning as the destination pointer "buf"
> pointers to
> an ZERO length array, which actually will occupy alg.buf mostly.
> Fix this by using memcpy.
>
> struct xfrm_algo {
>          char            alg_name[64];
>          unsigned int    alg_key_len;    /* in bits */
>          char            alg_key[0];
> };
>
> struct {
>          union {
>                  struct xfrm_algo alg;
>                  struct xfrm_algo_aead aead;
>                  struct xfrm_algo_auth auth;
>          } u;
>          char buf[XFRM_ALGO_KEY_BUF_SIZE];
> } alg = {};
>
> buf = alg.u.alg.alg_key;
> ---
>   ip/xfrm_state.c |    2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ip/xfrm_state.c b/ip/xfrm_state.c
> index 0d98e78..5cc87d3 100644
> --- a/ip/xfrm_state.c
> +++ b/ip/xfrm_state.c
> @@ -159,7 +159,7 @@ static int xfrm_algo_parse(struct xfrm_algo *alg,
> enum xfrm_attr_type_t type,
>               if (len > max)
>                   invarg("\"ALGO-KEY\" makes buffer overflow\n", key);
>
> -            strncpy(buf, key, len);
> +            memcpy(buf, key, len);
>           }
>       }
>

^ permalink raw reply

* [PATCH] ipv6: sit: ensure prefixlen <= 128 before calling ipv6_addr_prefix
From: Hannes Frederic Sowa @ 2013-09-29  3:15 UTC (permalink / raw)
  To: netdev

Found with coverity: CID 139492

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv6/sit.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 7ee5cb9..4011178 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -1004,6 +1004,9 @@ static int ipip6_tunnel_update_6rd(struct ip_tunnel *t,
 	    ip6rd->prefixlen + (32 - ip6rd->relay_prefixlen) > 64)
 		return -EINVAL;
 
+	if (ip6rd->prefixlen > 128)
+		return -EINVAL;
+
 	ipv6_addr_prefix(&prefix, &ip6rd->prefix, ip6rd->prefixlen);
 	if (!ipv6_addr_equal(&prefix, &ip6rd->prefix))
 		return -EINVAL;
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH net-next] virtio-net: switch to use XPS to choose txq
From: Jason Wang @ 2013-09-29  3:14 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, linux-kernel, virtualization
In-Reply-To: <20130927143529.GA30007@redhat.com>

On 09/27/2013 10:35 PM, Michael S. Tsirkin wrote:
> On Fri, Sep 27, 2013 at 01:57:24PM +0800, Jason Wang wrote:
>> We used to use a percpu structure vq_index to record the cpu to queue
>> mapping, this is suboptimal since it duplicates the work of XPS and
>> loses all other XPS functionality such as allowing use to configure
>> their own transmission steering strategy.
>>
>> So this patch switches to use XPS and suggest a default mapping when
>> the number of cpus is equal to the number of queues. With XPS support,
>> there's no need for keeping per-cpu vq_index and .ndo_select_queue(),
>> so they were removed also.
>>
>> Cc: Rusty Russell <rusty@rustcorp.com.au>
>> Cc: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> More lines deleted that added is good :)
> But how does the result perform?
> About the same?
>

Yes, the same.
>> ---
>>  drivers/net/virtio_net.c |   55 +++++++--------------------------------------
>>  1 files changed, 9 insertions(+), 46 deletions(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index defec2b..4102c1b 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -127,9 +127,6 @@ struct virtnet_info {
>>  	/* Does the affinity hint is set for virtqueues? */
>>  	bool affinity_hint_set;
>>  
>> -	/* Per-cpu variable to show the mapping from CPU to virtqueue */
>> -	int __percpu *vq_index;
>> -
>>  	/* CPU hot plug notifier */
>>  	struct notifier_block nb;
>>  };
>> @@ -1063,7 +1060,6 @@ static int virtnet_vlan_rx_kill_vid(struct net_device *dev,
>>  static void virtnet_clean_affinity(struct virtnet_info *vi, long hcpu)
>>  {
>>  	int i;
>> -	int cpu;
>>  
>>  	if (vi->affinity_hint_set) {
>>  		for (i = 0; i < vi->max_queue_pairs; i++) {
>> @@ -1073,20 +1069,11 @@ static void virtnet_clean_affinity(struct virtnet_info *vi, long hcpu)
>>  
>>  		vi->affinity_hint_set = false;
>>  	}
>> -
>> -	i = 0;
>> -	for_each_online_cpu(cpu) {
>> -		if (cpu == hcpu) {
>> -			*per_cpu_ptr(vi->vq_index, cpu) = -1;
>> -		} else {
>> -			*per_cpu_ptr(vi->vq_index, cpu) =
>> -				++i % vi->curr_queue_pairs;
>> -		}
>> -	}
>>  }
>>  
>>  static void virtnet_set_affinity(struct virtnet_info *vi)
>>  {
>> +	cpumask_var_t cpumask;
>>  	int i;
>>  	int cpu;
>>  
>> @@ -1100,15 +1087,21 @@ static void virtnet_set_affinity(struct virtnet_info *vi)
>>  		return;
>>  	}
>>  
>> +	if (!alloc_cpumask_var(&cpumask, GFP_KERNEL))
>> +		return;
>> +
>>  	i = 0;
>>  	for_each_online_cpu(cpu) {
>>  		virtqueue_set_affinity(vi->rq[i].vq, cpu);
>>  		virtqueue_set_affinity(vi->sq[i].vq, cpu);
>> -		*per_cpu_ptr(vi->vq_index, cpu) = i;
>> +		cpumask_clear(cpumask);
>> +		cpumask_set_cpu(cpu, cpumask);
>> +		netif_set_xps_queue(vi->dev, cpumask, i);
>>  		i++;
>>  	}
>>  
>>  	vi->affinity_hint_set = true;
>> +	free_cpumask_var(cpumask);
>>  }
>>  
>>  static int virtnet_cpu_callback(struct notifier_block *nfb,
>> @@ -1217,28 +1210,6 @@ static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
>>  	return 0;
>>  }
>>  
>> -/* To avoid contending a lock hold by a vcpu who would exit to host, select the
>> - * txq based on the processor id.
>> - */
>> -static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
>> -{
>> -	int txq;
>> -	struct virtnet_info *vi = netdev_priv(dev);
>> -
>> -	if (skb_rx_queue_recorded(skb)) {
>> -		txq = skb_get_rx_queue(skb);
>> -	} else {
>> -		txq = *__this_cpu_ptr(vi->vq_index);
>> -		if (txq == -1)
>> -			txq = 0;
>> -	}
>> -
>> -	while (unlikely(txq >= dev->real_num_tx_queues))
>> -		txq -= dev->real_num_tx_queues;
>> -
>> -	return txq;
>> -}
>> -
>>  static const struct net_device_ops virtnet_netdev = {
>>  	.ndo_open            = virtnet_open,
>>  	.ndo_stop   	     = virtnet_close,
>> @@ -1250,7 +1221,6 @@ static const struct net_device_ops virtnet_netdev = {
>>  	.ndo_get_stats64     = virtnet_stats,
>>  	.ndo_vlan_rx_add_vid = virtnet_vlan_rx_add_vid,
>>  	.ndo_vlan_rx_kill_vid = virtnet_vlan_rx_kill_vid,
>> -	.ndo_select_queue     = virtnet_select_queue,
>>  #ifdef CONFIG_NET_POLL_CONTROLLER
>>  	.ndo_poll_controller = virtnet_netpoll,
>>  #endif
>> @@ -1559,10 +1529,6 @@ static int virtnet_probe(struct virtio_device *vdev)
>>  	if (vi->stats == NULL)
>>  		goto free;
>>  
>> -	vi->vq_index = alloc_percpu(int);
>> -	if (vi->vq_index == NULL)
>> -		goto free_stats;
>> -
>>  	mutex_init(&vi->config_lock);
>>  	vi->config_enable = true;
>>  	INIT_WORK(&vi->config_work, virtnet_config_changed_work);
>> @@ -1589,7 +1555,7 @@ static int virtnet_probe(struct virtio_device *vdev)
>>  	/* Allocate/initialize the rx/tx queues, and invoke find_vqs */
>>  	err = init_vqs(vi);
>>  	if (err)
>> -		goto free_index;
>> +		goto free_stats;
>>  
>>  	netif_set_real_num_tx_queues(dev, 1);
>>  	netif_set_real_num_rx_queues(dev, 1);
>> @@ -1640,8 +1606,6 @@ free_recv_bufs:
>>  free_vqs:
>>  	cancel_delayed_work_sync(&vi->refill);
>>  	virtnet_del_vqs(vi);
>> -free_index:
>> -	free_percpu(vi->vq_index);
>>  free_stats:
>>  	free_percpu(vi->stats);
>>  free:
>> @@ -1678,7 +1642,6 @@ static void virtnet_remove(struct virtio_device *vdev)
>>  
>>  	flush_work(&vi->config_work);
>>  
>> -	free_percpu(vi->vq_index);
>>  	free_percpu(vi->stats);
>>  	free_netdev(vi->dev);
>>  }
>> -- 
>> 1.7.1

^ permalink raw reply

* Re: [PATCH v2.40 0/7] MPLS actions and matches
From: Joe Stringer @ 2013-09-29  2:35 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, Ravi K,
	netdev-u79uwXL29TY76Z2rM5mHXA, Isaku Yamahata
In-Reply-To: <1380241116-7661-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 9109 bytes --]

Hi Simon,

I've dropped some comments on a couple of the userspace patches, but I
probably don't need to comment on everything as Ben's given some pretty
clear feedback. The crux would be dropping #3 in favour of something a bit
tidier. Feel free to prod my brain if need be :-).



On Fri, Sep 27, 2013 at 12:18 PM, Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org> wrote:

> Hi,
>
> This series implements MPLS actions and matches based on work by
> Ravi K, Leo Alterman, Yamahata-san and Joe Stringer.
>
> This series provides two changes
>
> * Patches 1 - 5
>
>   Provide user-space support for the VLAN/MPLS tag insertion order
>   up to and including OpenFlow 1.2, and the different ordering
>   specified from OpenFlow 1.3. In a nutshell the datapath always
>   uses the OpenFlow 1.3 ordering, which is to always insert tags
>   immediately after the L2 header, regardless of the presence of other
>   tags. And ovs-vswtichd provides compatibility for the behaviour up
>   to OpenFlow 1.2, which is that MPLS tags should follow VLAN tags
>   if present.
>
>   Ben, these are for you to review.
>
> * Patches 6 and 7
>
>   Adding basic MPLS action and match support to the kernel datapath
>
>   Jesse, these are for you to review.
>
>
> Differences between v2.40 and v2.39:
>
> * Rebase for:
>   + New dev_queue_xmit compat code
>   + Updated put_vlan()
>   + Removal of mpls_depth field from struct flow
> * As suggested by Jesse Gross
>   + Remove bogus mac_len update from push_mpls()
>   + Slightly simplify push_mpls() by using eth_hdr()
>   + Remove dubious condition !eth_p_mpls(inner_protocol) on
>     an skb being considered to be MPLS in netdev_send()
>   + Only use compatibility code for MPLS GSO segmentation on kernels
>     older than 3.11
>   + Revamp setting of inner_protocol
>     1. Do not unconditionally set inner_protocol to the value of
>        skb->protocol in ovs_execute_actions().
>     2. Initialise inner_protocol it to zero only if compatibility code is
> in
>        use. In the case where compatibility code is not in use it will
> either
>        be zero due since the allocation of the skb or some other value set
>        by some other user.
>     3. Conditionally set the inner_protocol in push_mpls() to the value of
>        skb->protocol when entering push_mpls(). The condition is that
>        inner_protocol is zero and the value of skb->protocol is not an MPLS
>        ethernet type.
>     - This new scheme:
>       + Pushes logic to set inner_protocol closer to the case where it is
>         needed.
>       + Avoids over-writing values set by other users.
> * As suggested by Pravin Shelar
>   + Only set and restore skb->protocol in rpl___skb_gso_segment() in the
>     case of MPLS
>   + Add inner_protocol field to struct ovs_gso_cb instead of ovs_skb_cb.
>     This moves compatibility code closer to where it is used
>     and creates fewer differences with mainline.
> * Update comment on mac_len updates in datapath/actions.c
> * Remove HAVE_INNER_PROCOTOL and instead just check
>   against kernel version 3.11 directly.
>   HAVE_INNER_PROCOTOL is a hang-over from work done prior
>   to the merge of inner_protocol into the kernel.
> * Remove dubious condition !eth_p_mpls(inner_protocol) on
>   using inner_protocol as the type in rpl_skb_network_protocol()
> * Do not update type of features in rpl_dev_queue_xmit.
>   Though arguably correct this is not an inherent part of
>   the changes made by this patch.
> * Use skb_cow_head() in push_mpls()
>   + Call skb_cow_head(skb, MPLS_HLEN) instead of
>     make_writable(skb, skb->mac_len) to ensure that there is enough head
>     room to push an MPLS LSE regardless of whether the skb is cloned or
> not.
>   + This is consistent with the behaviour of rpl__vlan_put_tag().
>   + This is a fix for crashes reported when performing mpls_push
>     with headroom less than 4. This problem was introduced in v3.36.
> * Skip popping in mpls_pop if the skb is too short to contain an MPLS LSE
>
>
> Differences between v2.39 and v2.38:
>
> * Rebase for removal of vlan, checksum and skb->mark compat code
>   - This includes adding adding a new patch,
>     "[PATCH v2.39 6/7] datapath: Break out deacceleration portion of
>     vlan_push" to allow re-use of some existing code.
>
>
> Differences between v2.38 and v2.37:
>
> * Rebase for SCTP support
> * Refactor validate_tp_port() to iterate over eth_types rather
>   than open-coding the loop. With the addition of SCTP this logic
>   is now used three times.
>
>
> Differences between v2.37 and v2.36:
>
> * Rebase
>
>
> Differences between v2.36 and v2.35:
>
> * Rebase
>
> * Do not add set_ethertype() to datapath/actions.c.
>   As this patch has evolved this function had devolved into
>   to sets of functionality wrapped into a single function with
>   only one line of common code. Refactor things to simply
>   open-code setting the ether type in the two locations where
>   set_ethertype() was previously used. The aim here is to improve
>   readability.
>
> * Update setting skb->ethertype after mpls push and pop.
>   - In the case of push_mpls it should be set unconditionally
>     as in v2.35 the behaviour of this function to always push
>     an MPLS LSE before any VLAN tags.
>   - In the case of mpls_pop eth_p_mpls(skb->protocol) is a better
>     test than skb->protocol != htons(ETH_P_8021Q) as it will give the
>     correct behaviour in the presence of other VLAN ethernet types,
>     for example 0x88a8 which is used by 802.1ad. Moreover, it seems
>     correct to update the ethernet type if it was previously set
>     according to the top-most MPLS LSE.
>
> * Deaccelerate VLANs when pushing MPLS tags the
>   - Since v2.35 MPLS push will insert an MPLS LSE before any VLAN tags.
>     This means that if an accelerated tag is present it should be
>     deaccelerated to ensure it ends up in the correct position.
>
> * Update skb->mac_len in push_mpls() so that it will be correct
>   when used by a subsequent call to pop_mpls().
>
>   As things stand I do not believe this is strictly necessary as
>   ovs-vswitchd will not send a pop MPLS action after a push MPLS action.
>   However, I have added this in order to code more defensively as I believe
>   that if such a sequence did occur it would be rather unobvious why
>   it didn't work.
>
> * Do not add skb_cow_head() call in push_mpls().
>   It is unnecessary as there is a make_writable() call.
>   This change was also made in v2.30 but some how the
>   code regressed between then and v2.35.
>
>
> Differences between v2.35 and v2.34:
>
> * Add support for the tag ordering specified up until OpenFlow 1.2 and
>   the ordering specified from OpenFlow 1.3.
>
> * Correct error in datapath patch's handling of GSO in the presence
>   of MPLS and absence of VLANs.
>
>
> Pre-requisites.
>
> This series applies on top of "[PATCH v3] Remove mpls_depth field from
> flow"
>
> To aid review this series and its pre-requisite is available in git at:
>
> git://github.com/horms/openvswitch.git devel/mpls-v2.40
>
>
> Patch list and overall diffstat:
>
> Joe Stringer (5):
>   odp: Only pass vlan_tci to commit_vlan_action()
>   odp: Allow VLAN actions after MPLS actions
>   ofp-actions: Add OFPUTIL_OFPAT13_PUSH_MPLS
>   ofp-actions: Add separate OpenFlow 1.3 action parser
>   lib: Push MPLS tags in the OpenFlow 1.3 ordering
>
> Simon Horman (2):
>   datapath: Break out deacceleration portion of vlan_push
>   datapath: Add basic MPLS support to kernel
>
>  datapath/Modules.mk                             |   1 +
>  datapath/actions.c                              | 156 ++++++++-
>  datapath/datapath.c                             | 259 ++++++++++++--
>  datapath/datapath.h                             |   2 +
>  datapath/flow.c                                 |  58 ++-
>  datapath/flow.h                                 |  17 +-
>  datapath/linux/compat/gso.c                     | 117 ++++++-
>  datapath/linux/compat/gso.h                     |  53 +++
>  datapath/linux/compat/include/linux/netdevice.h |  14 +-
>  datapath/linux/compat/netdevice.c               |  28 --
>  datapath/mpls.h                                 |  15 +
>  include/linux/openvswitch.h                     |   7 +-
>  lib/flow.c                                      |   2 +-
>  lib/odp-util.c                                  |  21 +-
>  lib/odp-util.h                                  |   2 +-
>  lib/ofp-actions.c                               |  68 +++-
>  lib/ofp-parse.c                                 |   1 +
>  lib/ofp-util.c                                  |   3 +
>  lib/ofp-util.h                                  |   1 +
>  lib/packets.c                                   |  10 +-
>  lib/packets.h                                   |   2 +-
>  ofproto/ofproto-dpif-xlate.c                    |  98 ++++--
>  ofproto/ofproto-dpif-xlate.h                    |   5 +
>  tests/ofproto-dpif.at                           | 446
> ++++++++++++++++++++++++
>  24 files changed, 1246 insertions(+), 140 deletions(-)
>  create mode 100644 datapath/mpls.h
>
>

[-- Attachment #1.2: Type: text/html, Size: 10319 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply

* Re: [PATCH v2.40 3/7] ofp-actions: Add OFPUTIL_OFPAT13_PUSH_MPLS
From: Joe Stringer @ 2013-09-29  2:07 UTC (permalink / raw)
  To: Ben Pfaff
  Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, Ravi K,
	netdev-u79uwXL29TY76Z2rM5mHXA, Isaku Yamahata
In-Reply-To: <20130927193010.GB17506-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 928 bytes --]

I recall having a bit of discussion regarding this approach and whether to
change it, but I don't recall the issues around this.

Your suggestion sounds sane. What are your thoughts, Simon?


On Sat, Sep 28, 2013 at 7:30 AM, Ben Pfaff <blp-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote:

> On Fri, Sep 27, 2013 at 09:18:32AM +0900, Simon Horman wrote:
> > From: Joe Stringer <joe-Q1GJJQv1iO6lP80pJB477g@public.gmane.org>
> >
> > This patch adds a new compatibility enum for use with MPLS, so that the
> > differing behaviour between OpenFlow 1.2 and 1.3 can be implemented in
> > ofproto-dpif-xlate.
>
> It seems a little awkward to me to do this via a new OFPACT_, mostly
> because there isn't currently any distinction between OF1.1 and OF1.3 in
> terms of OFPACT_ definitions.  Did you consider adding a new field to
> struct ofpact_push_mpls that would say whether the label should be added
> before or after a VLAN tag?
>

[-- Attachment #1.2: Type: text/html, Size: 1424 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply

* Re: [PATCH v2.40 2/7] odp: Allow VLAN actions after MPLS actions
From: Joe Stringer @ 2013-09-29  1:51 UTC (permalink / raw)
  To: Ben Pfaff
  Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, Ravi K,
	netdev-u79uwXL29TY76Z2rM5mHXA, Isaku Yamahata
In-Reply-To: <20130927192108.GA17506-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 3126 bytes --]

On Sat, Sep 28, 2013 at 7:21 AM, Ben Pfaff <blp-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote:

> On Fri, Sep 27, 2013 at 09:18:31AM +0900, Simon Horman wrote:
> > From: Joe Stringer <joe-Q1GJJQv1iO6lP80pJB477g@public.gmane.org>
> >
> > OpenFlow 1.2 and 1.3 differ on their handling of MPLS actions in the
> > presence of VLAN tags. To allow correct behaviour to be committed in
> > each situation, this patch adds a second round of VLAN tag action
> > handling to commit_odp_actions(), which occurs after MPLS actions. This
> > is implemented with a new field in 'struct xlate_in' called 'vlan_tci'.
> >
> > When an push_mpls action is composed, the flow's current VLAN state is
> > stored into xin->vlan_tci, and flow->vlan_tci is set to 0 (pop_vlan). If
> > a VLAN tag is present, it is stripped; if not, then there is no change.
> > Any later modifications to the VLAN state is written to xin->vlan_tci.
> > When committing the actions, flow->vlan_tci is used before MPLS actions,
> > and xin->vlan_tci is used afterwards. This retains the current datapath
> > behaviour, but allows VLAN actions to be applied in a more flexible
> > manner.
> >
> > Signed-off-by: Joe Stringer <joe-Q1GJJQv1iO6lP80pJB477g@public.gmane.org>
> > Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
>
> The commit message talks about handling OpenFlow 1.2 and 1.3 both
> properly, but I don't see how the code distinguishes between the cases.
> Can you explain?  Maybe this is something added in a later patch, in
> which case it would be nice to mention that in the commit message.
>

It is added in patch #5. IIRC this patch should cause no difference in
behaviour, but set up infrastructure to be used later.

 There seems to be a typo in the comment in vlan_tci_restore() here:
> > +    /* If MPLS actions were executed after MPLS, copy the final
> vlan_tci out
> > +     * and restore the intermediate VLAN state. */
>

Right, that should probably be "...executed after VLAN actions...".

I was a little confused by the new local variable 'vlan_tci' in
> do_xlate_actions().  Partly this was because the fact that I didn't find
> it obvious that sometimes it points to different VLAN tags.  I think
> this would be easier to see if it were initially assigned just under the
> big comment rather than in an initializer, so that one would know to
> look at the comment.
>

Perhaps the big comment could be rearranged and put above the initialiser,
something like the following:-

/* VLAN actions are stored to '*vlan_tci'. This variable initially points
to 'xin->flow->vlan_tci', so that
 * VLAN actions are applied before any MPLS actions. When an MPLS action is
translated,
 * 'vlan_tci' is updated to point to 'xin->vlan_tci'. This causes later
VLAN actions to be applied after MPLS actions.
 * Each time through the loop, 'xin->vlan_tci' is updated to reflect the
final VLAN state of the flow. */

Then, the place where 'xin->vlan_tci' is updated to '*vlan_tci' could have
a simple comment to refer back:-

/* Update the final vlan state to match the current state. */

Simon, do you mind handling the change?

[-- Attachment #1.2: Type: text/html, Size: 4864 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply

* [PATCH] {iproute2, xfrm}: Use memcpy to suppress gcc phony buffer overflow warning
From: Fan Du @ 2013-09-29  1:49 UTC (permalink / raw)
  To: Sohny Thomas; +Cc: David Laight, stephen, netdev
In-Reply-To: <5246D88F.7090906@linux.vnet.ibm.com>



On 2013年09月28日 21:24, Sohny Thomas wrote:
> On Friday 27 September 2013 01:56 PM, David Laight wrote:
>>> ip xfrm state add causes a SIGABRT due to a strncpy_chk .
>>> This happens since strncpy doesn't account for the '\0' .
>>> I have fixed this using sizeof instead of strlen .
>>>
>>> There is a redhat bug which documents this issue
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=982761
>>>
>>> Signed-off-by: Sohny Thomas <sohthoma@in.ibm.com>
>>>
>>> --------------
>>>
>>> diff --git a/ip/xfrm_state.c b/ip/xfrm_state.c
>>> index 389942c..7dd8799 100644
>>> --- a/ip/xfrm_state.c
>>> +++ b/ip/xfrm_state.c
>>> @@ -117,7 +117,7 @@ static int xfrm_algo_parse(struct xfrm_algo *alg,
>>> enum xfrm_attr_type_t type,
>>> char *name, char *key, char *buf, int max)
>>> {
>>> int len;
>>> - int slen = strlen(key);
>>> + int slen = sizeof(key);
>>
>> you definitely don't want sizeof(key) - that is either 4 or 8.
> oh damn my bad.
> I think i will go with strlen(key) + 1.
>
> or i will pass slen+1 to strncpy .

You must be kidding about this slen+1 or strlen(key) + 1 , right?



 From 3895b3d88bcb873988089392079645f2c9665e95 Mon Sep 17 00:00:00 2001
From: Fan Du <fan.du@windriver.com>
Date: Sun, 29 Sep 2013 09:38:12 +0800
Subject: [PATCH] {iproute2, xfrm}: Use memcpy to suppress gcc phony buffer
  overflow warning.

This bug is reported from below link:
https://bugzilla.redhat.com/show_bug.cgi?id=982761

An simplified command from its original reproducing method in bugzilla:
ip xfrm state add src 10.0.0.2 dst 10.0.0.1 proto ah spi 0x12345678 auth md5 12
will cause below spew from gcc:

*** buffer overflow detected ***: ./ip terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7f894ae2b817]
/lib/x86_64-linux-gnu/libc.so.6(+0x109710)[0x7f894ae2a710]
/lib/x86_64-linux-gnu/libc.so.6(+0x1089f6)[0x7f894ae299f6]
./ip[0x42119d]
./ip(do_xfrm_state+0x231)[0x4217f1]
./ip[0x405d05]
./ip(main+0x2b4)[0x405934]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7f894ad4276d]
./ip[0x405bb9]
======= Memory map: ========
00400000-0043c000 r-xp 00000000 08:01 1997183                            /workbench/iproute2/ip/ip
0063b000-0063c000 r--p 0003b000 08:01 1997183                            /workbench/iproute2/ip/ip
0063c000-00640000 rw-p 0003c000 08:01 1997183                            /workbench/iproute2/ip/ip
00640000-00643000 rw-p 00000000 00:00 0
017c7000-017e8000 rw-p 00000000 00:00 0                                  [heap]
7f894ab0b000-7f894ab20000 r-xp 00000000 08:01 6032986                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7f894ab20000-7f894ad1f000 ---p 00015000 08:01 6032986                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7f894ad1f000-7f894ad20000 r--p 00014000 08:01 6032986                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7f894ad20000-7f894ad21000 rw-p 00015000 08:01 6032986                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7f894ad21000-7f894aed6000 r-xp 00000000 08:01 6033273                    /lib/x86_64-linux-gnu/libc-2.15.so
7f894aed6000-7f894b0d5000 ---p 001b5000 08:01 6033273                    /lib/x86_64-linux-gnu/libc-2.15.so
7f894b0d5000-7f894b0d9000 r--p 001b4000 08:01 6033273                    /lib/x86_64-linux-gnu/libc-2.15.so
7f894b0d9000-7f894b0db000 rw-p 001b8000 08:01 6033273                    /lib/x86_64-linux-gnu/libc-2.15.so
7f894b0db000-7f894b0e0000 rw-p 00000000 00:00 0
7f894b0e0000-7f894b0e2000 r-xp 00000000 08:01 6033272                    /lib/x86_64-linux-gnu/libdl-2.15.so
7f894b0e2000-7f894b2e2000 ---p 00002000 08:01 6033272                    /lib/x86_64-linux-gnu/libdl-2.15.so
7f894b2e2000-7f894b2e3000 r--p 00002000 08:01 6033272                    /lib/x86_64-linux-gnu/libdl-2.15.so
7f894b2e3000-7f894b2e4000 rw-p 00003000 08:01 6033272                    /lib/x86_64-linux-gnu/libdl-2.15.so
7f894b2e4000-7f894b306000 r-xp 00000000 08:01 6033287                    /lib/x86_64-linux-gnu/ld-2.15.so
7f894b4e3000-7f894b4e6000 rw-p 00000000 00:00 0
7f894b503000-7f894b506000 rw-p 00000000 00:00 0
7f894b506000-7f894b507000 r--p 00022000 08:01 6033287                    /lib/x86_64-linux-gnu/ld-2.15.so
7f894b507000-7f894b509000 rw-p 00023000 08:01 6033287                    /lib/x86_64-linux-gnu/ld-2.15.so
7fff9a3c2000-7fff9a3e3000 rw-p 00000000 00:00 0                          [stack]
7fff9a3ff000-7fff9a400000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

This is a false positive warning as the destination pointer "buf" pointers to
an ZERO length array, which actually will occupy alg.buf mostly.
Fix this by using memcpy.

struct xfrm_algo {
         char            alg_name[64];
         unsigned int    alg_key_len;    /* in bits */
         char            alg_key[0];
};

struct {
         union {
                 struct xfrm_algo alg;
                 struct xfrm_algo_aead aead;
                 struct xfrm_algo_auth auth;
         } u;
         char buf[XFRM_ALGO_KEY_BUF_SIZE];
} alg = {};

buf = alg.u.alg.alg_key;
---
  ip/xfrm_state.c |    2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ip/xfrm_state.c b/ip/xfrm_state.c
index 0d98e78..5cc87d3 100644
--- a/ip/xfrm_state.c
+++ b/ip/xfrm_state.c
@@ -159,7 +159,7 @@ static int xfrm_algo_parse(struct xfrm_algo *alg, enum xfrm_attr_type_t type,
  			if (len > max)
  				invarg("\"ALGO-KEY\" makes buffer overflow\n", key);

-			strncpy(buf, key, len);
+			memcpy(buf, key, len);
  		}
  	}

-- 
1.7.9.5




> Regards,
> Sohny
>>
>> David
>>
>>
>>
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

-- 
浮沉随浪只记今朝笑

--fan

^ permalink raw reply related

* Re: [PATCH v5] IPv6 NAT: Do not drop DNATed 6to4/6rd packets
From: Joe Perches @ 2013-09-29  0:24 UTC (permalink / raw)
  To: David Miller; +Cc: hannes, catab, netdev, yoshfuji
In-Reply-To: <20130928.155750.1130089685321379918.davem@davemloft.net>

On Sat, 2013-09-28 at 15:57 -0400, David Miller wrote:
> Applied, but Catalin please strictly refer to changes in the following
> precise format:
> 
>         commit $SHA1_ID ("Commit message header line text")
> 
> Because SHA1_IDs are ambiguous, especially when the change in question
> is backported into various -stable branches.

There are now enough commits that using only
8 byte SHA1 IDs generates some collisions so
please use 12 bytes or so of the SHA1.

^ permalink raw reply

* Re: [PATCH v2] declance: Remove `incompatible pointer type' warnings
From: Maciej W. Rozycki @ 2013-09-29  0:09 UTC (permalink / raw)
  To: David Miller; +Cc: sergei.shtylyov, netdev
In-Reply-To: <20130928.153316.1766304011894422920.davem@davemloft.net>

On Sat, 28 Sep 2013, David Miller wrote:

> >  Thanks, by Sergei's request please use this version instead that has the 
> > reference to the original commit updated (no change to the patch itself).
> 
> I can't undo the original commit once it has been installed in my tree.

 No worries, and good to know, thanks.

  Maciej

^ permalink raw reply

* Re: [PATCH v2] bgmac: add support for Byte Queue Limits
From: Eric Dumazet @ 2013-09-28 23:22 UTC (permalink / raw)
  To: Hauke Mehrtens; +Cc: davem, zajec5, netdev
In-Reply-To: <1380409400.3596.9.camel@edumazet-glaptop.roam.corp.google.com>

On Sat, 2013-09-28 at 16:03 -0700, Eric Dumazet wrote:

> Unfortunately, skb->len is unsafe : hardware might already sent the
> packet and TX completion have freed it.
> 
> You must cache skb->len in a variable before allowing hardware to send
> the frame.
> 

More exactly you must call netdev_sent_queue() _before_ giving skb to
hardware, or netdev_completed_queue() might panic.

^ permalink raw reply

* Re: [PATCH v2] bgmac: add support for Byte Queue Limits
From: Eric Dumazet @ 2013-09-28 23:03 UTC (permalink / raw)
  To: Hauke Mehrtens; +Cc: davem, zajec5, netdev
In-Reply-To: <1380407726-20691-1-git-send-email-hauke@hauke-m.de>

On Sun, 2013-09-29 at 00:35 +0200, Hauke Mehrtens wrote:
> This makes it possible to use some more advanced queuing
> techniques with this driver.
> 
> When multi queue support will be added some changes to Byte Queue
> handling is needed.
> 
> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
> ---
>  drivers/net/ethernet/broadcom/bgmac.c |   10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c
> index 249468f..e5519f1 100644
> --- a/drivers/net/ethernet/broadcom/bgmac.c
> +++ b/drivers/net/ethernet/broadcom/bgmac.c
> @@ -164,6 +164,8 @@ static netdev_tx_t bgmac_dma_tx_add(struct bgmac *bgmac,
>  	if (--free_slots == 1)
>  		netif_stop_queue(net_dev);
>  
> +	netdev_sent_queue(net_dev, skb->len);
> +
>  	return NETDEV_TX_OK;
>  

Unfortunately, skb->len is unsafe : hardware might already sent the
packet and TX completion have freed it.

You must cache skb->len in a variable before allowing hardware to send
the frame.

^ permalink raw reply

* [PATCH v2] bgmac: add support for Byte Queue Limits
From: Hauke Mehrtens @ 2013-09-28 22:35 UTC (permalink / raw)
  To: davem; +Cc: zajec5, eric.dumazet, netdev, Hauke Mehrtens

This makes it possible to use some more advanced queuing
techniques with this driver.

When multi queue support will be added some changes to Byte Queue
handling is needed.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---
 drivers/net/ethernet/broadcom/bgmac.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c
index 249468f..e5519f1 100644
--- a/drivers/net/ethernet/broadcom/bgmac.c
+++ b/drivers/net/ethernet/broadcom/bgmac.c
@@ -164,6 +164,8 @@ static netdev_tx_t bgmac_dma_tx_add(struct bgmac *bgmac,
 	if (--free_slots == 1)
 		netif_stop_queue(net_dev);
 
+	netdev_sent_queue(net_dev, skb->len);
+
 	return NETDEV_TX_OK;
 
 err_stop_drop:
@@ -178,6 +180,7 @@ static void bgmac_dma_tx_free(struct bgmac *bgmac, struct bgmac_dma_ring *ring)
 	struct device *dma_dev = bgmac->core->dma_dev;
 	int empty_slot;
 	bool freed = false;
+	unsigned bytes_compl = 0, pkts_compl = 0;
 
 	/* The last slot that hardware didn't consume yet */
 	empty_slot = bgmac_read(bgmac, ring->mmio_base + BGMAC_DMA_TX_STATUS);
@@ -195,6 +198,9 @@ static void bgmac_dma_tx_free(struct bgmac *bgmac, struct bgmac_dma_ring *ring)
 					 slot->skb->len, DMA_TO_DEVICE);
 			slot->dma_addr = 0;
 
+			bytes_compl += slot->skb->len;
+			pkts_compl++;
+
 			/* Free memory! :) */
 			dev_kfree_skb(slot->skb);
 			slot->skb = NULL;
@@ -208,6 +214,8 @@ static void bgmac_dma_tx_free(struct bgmac *bgmac, struct bgmac_dma_ring *ring)
 		freed = true;
 	}
 
+	netdev_completed_queue(bgmac->net_dev, pkts_compl, bytes_compl);
+
 	if (freed && netif_queue_stopped(bgmac->net_dev))
 		netif_wake_queue(bgmac->net_dev);
 }
@@ -988,6 +996,8 @@ static void bgmac_chip_reset(struct bgmac *bgmac)
 	bgmac_miiconfig(bgmac);
 	bgmac_phy_init(bgmac);
 
+	netdev_reset_queue(bgmac->net_dev);
+
 	bgmac->int_status = 0;
 }
 
-- 
1.7.10.4

^ permalink raw reply related

* Re: [PATCH v2 net-next] net: introduce SO_MAX_PACING_RATE
From: David Miller @ 2013-09-28 22:36 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, sesse, mtk.manpages
In-Reply-To: <1380036052.3165.71.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 24 Sep 2013 08:20:52 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> As mentioned in commit afe4fd062416b ("pkt_sched: fq: Fair Queue packet
> scheduler"), this patch adds a new socket option.
> 
> SO_MAX_PACING_RATE offers the application the ability to cap the
> rate computed by transport layer. Value is in bytes per second.
> 
> u32 val = 1000000;
> setsockopt(sockfd, SOL_SOCKET, SO_MAX_PACING_RATE, &val, sizeof(val));
> 
> To be effectively paced, a flow must use FQ packet scheduler.
> 
> Note that a packet scheduler takes into account the headers for its
> computations. The effective payload rate depends on MSS and retransmits
> if any.
> 
> I chose to make this pacing rate a SOL_SOCKET option instead of a
> TCP one because this can be used by other protocols.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: [PATCH v2] declance: Remove `incompatible pointer type' warnings
From: David Miller @ 2013-09-28 22:33 UTC (permalink / raw)
  To: macro; +Cc: sergei.shtylyov, netdev
In-Reply-To: <alpine.LFD.2.03.1309222113300.16797@linux-mips.org>

From: "Maciej W. Rozycki" <macro@linux-mips.org>
Date: Sun, 22 Sep 2013 21:19:01 +0100 (BST)

> Revert damage caused by 43d620c82985b19008d87a437b4cf83f356264f7 
> [drivers/net: Remove casts of void *]:
> 
> .../declance.c: In function 'cp_to_buf':
> .../declance.c:347: warning: assignment from incompatible pointer type
> .../declance.c:348: warning: assignment from incompatible pointer type
> .../declance.c: In function 'cp_from_buf':
> .../declance.c:406: warning: assignment from incompatible pointer type
> .../declance.c:407: warning: assignment from incompatible pointer type
> 
> Also add a `const' qualifier where applicable and adjust formatting.
> 
> Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
> ---
>> Applied to net-next
> 
>  Thanks, by Sergei's request please use this version instead that has the 
> reference to the original commit updated (no change to the patch itself).

I can't undo the original commit once it has been installed in my tree.

^ permalink raw reply

* Re: [PATCH net-next 0/9] bonding: remove bond_next_slave()
From: David Miller @ 2013-09-28 22:29 UTC (permalink / raw)
  To: vfalico; +Cc: netdev, nikolay, bhutchings, fubar, andy
In-Reply-To: <1380291125-5671-1-git-send-email-vfalico@redhat.com>

From: Veaceslav Falico <vfalico@redhat.com>
Date: Fri, 27 Sep 2013 16:11:56 +0200

> (on top of "[PATCH net-next 0/2] bonding: fix 3ad slave (de)init" - the
> patchset is essential)
> 
> Hi,
> 
> As Ben Hutchings and Nikolay Aleksandrov correctly noted -
> bond_next_slave() already is not O(1), but rather O(n), so it's only useful
> for one-off operations and shouldn't be used widely - for example, for list
> traversal, which will take O(n^2) time, which will be disastrous for any
> hot path with a large number of slaves.
> 
> Currently, bond_next_slave() is used several times in 802.3ad and for
> seq_read - bond_info_seq_next().
> 
> The 802.3ad part uses it only in constructs like:
> 
> for (X = __get_first_X(); X; X = __get_next_X()) {
> 
> where __get_next_X() uses bond_next_slave().
> 
> This for can (and should) actually be replaced by the standard
> 
> bond_for_each_slave(bond, slave, iter) {
> 	X = __get_X_by_slave(slave);
> 
> it's faster, easier to read, debug and maintain. Also, removes a lot of
> code lines.
> 
> After replacing it, the only user of bond_next_slave() is
> bond_info_seq_next() - which can, actually, implement it by itself, and not
> call another function.
> 
> So, that way, we can completely remove the bond_next_slave(), cleanup and
> optimize a bit.
> 
> p.s. the 802.3ad code is horrible, both style-wise and from the logical
> point of view - so I've decided to *not* change anything except that what
> this patch is intended to provide. The cleanup and some refactoring should
> be done in another patchset, which I've began to work on already.
> 
> Thank you!
> 
> CC: Jay Vosburgh <fubar@us.ibm.com>
> CC: Andy Gospodarek <andy@greyhouse.net>
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

Also applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next 0/2] bonding: fix 3ad slave (de)init
From: David Miller @ 2013-09-28 22:28 UTC (permalink / raw)
  To: vfalico; +Cc: netdev, fubar, andy
In-Reply-To: <1380287458-3488-1-git-send-email-vfalico@redhat.com>

From: Veaceslav Falico <vfalico@redhat.com>
Date: Fri, 27 Sep 2013 15:10:56 +0200

> After 1f718f0f4f97145f4072d2d72dcf85069ca7226d ("bonding: populate
> neighbour's private on enslave") the (de)linking of slaves in
> bond_enslave/bond_release_one happens in the correct places - after we've
> completely initialized the slave (for bond_enslave) and before we've even
> began to de-init the slave (for bond_release_one, respectively).
> 
> This was done to prevent any RCU readers to see the half-initialized slave
> (because the RCU readers aren't blocked by bond->lock or rtnl_lock
> usually).
> 
> However, 802.3ad logic, in several places, relied on the fact that the
> slave is still linked to the bond.
> 
> Fix it by correctly handling these cases - we shouldn't rely that the slave
> is linked before fully initialized and, respectively, that the slave is
> still linked while it's being removed.
> 
> CC: Jay Vosburgh <fubar@us.ibm.com>
> CC: Andy Gospodarek <andy@greyhouse.net>
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

Series applied.

^ permalink raw reply

* Re: [PATCH] bgmac: add support for Byte Queue Limits
From: Eric Dumazet @ 2013-09-28 22:18 UTC (permalink / raw)
  To: Hauke Mehrtens; +Cc: davem, zajec5, netdev
In-Reply-To: <1380403393-12113-1-git-send-email-hauke@hauke-m.de>

On Sat, 2013-09-28 at 23:23 +0200, Hauke Mehrtens wrote:
> This makes it possible to use some more advanced queuing
> techniques with this driver.

> @@ -1198,6 +1206,8 @@ static netdev_tx_t bgmac_start_xmit(struct sk_buff *skb,
>  	struct bgmac *bgmac = netdev_priv(net_dev);
>  	struct bgmac_dma_ring *ring;
>  
> +	netdev_sent_queue(net_dev, skb->len);
> +
>  	/* No QOS support yet */
>  	ring = &bgmac->tx_ring[0];
>  	return bgmac_dma_tx_add(bgmac, ring, skb);


Well, no.

You must call netdev_sent_queue() only if packet is really queued.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox