Netdev List
* Re: [PATCH net] net: qmi_wwan: add ZTE MF60
From: David Miller @ 2012-07-05 10:16 UTC (permalink / raw)
  To: bjorn; +Cc: netdev
In-Reply-To: <1341251597-21124-1-git-send-email-bjorn@mork.no>

From: Bjørn Mork <bjorn@mork.no>
Date: Mon,  2 Jul 2012 19:53:17 +0200

> +static const struct driver_info	qmi_wwan_force_int2 = {

Please get rid of that strange tab character and resubmit.

* Re: [PATCH 0/5] rtcache remove respin
From: David Miller @ 2012-07-05 10:15 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1341225841.5269.69.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 02 Jul 2012 12:44:01 +0200

> If we still want __refcnt being on cache line boundary, we might find a
> better way to accomplish this.

Back to this issue again.

Eric, if you take a look at net-next right now, I left a dummy padding
in dst_entry where the neighbour pointer used to be.

Can you come up with some way to make use of that new space?

* Re: [PATCH net-next 2/2] drivers/ieee802154/at231rf230: remove unused return status
From: David Miller @ 2012-07-05 10:13 UTC (permalink / raw)
  To: alex.bluesman.smirnov; +Cc: netdev, dbaryshkov
In-Reply-To: <1341209912-6030-3-git-send-email-alex.bluesman.smirnov@gmail.com>

From: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Date: Mon,  2 Jul 2012 10:18:32 +0400

> Remove excessive variable used for the return status.
> 
> Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>

Applied.

* Re: [PATCH net-next] 6lowpan: revert 'reuse eth_mac_addr()'
From: David Miller @ 2012-07-05 10:13 UTC (permalink / raw)
  To: alex.bluesman.smirnov; +Cc: netdev, danny.kukawka, dbaryshkov
In-Reply-To: <1341208726-2793-1-git-send-email-alex.bluesman.smirnov@gmail.com>

From: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Date: Mon,  2 Jul 2012 09:58:46 +0400

> This reverts commit cdf49c283e2e105da86ca575ad35b453f5ff24ea, which
> replaced the lowpan '.ndo_set_mac_address' method with the Ethernet one.
> 
> According to the IEEE 802.15.4 standard, a device has an 8-byte address,
> so this hook loses the last 2 bytes, which may cause compatibility problems
> with other IEEE 802.15.4 standard implementations.
> 
> Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>

Applied.
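
For readers following along, the truncation described in the commit message can be sketched in plain userspace C. This is only an illustration: the two setter functions are hypothetical stand-ins, not the kernel hooks themselves. The only facts assumed are that an Ethernet-style setter copies ETH_ALEN (6) bytes while an IEEE 802.15.4 extended address is 8 bytes long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN            6   /* length of an Ethernet MAC address */
#define IEEE802154_ADDR_LEN 8   /* length of an 802.15.4 extended address */

/* Hypothetical stand-in for a hook that copies an Ethernet-sized address. */
static void set_addr_eth_style(uint8_t *dev_addr, const uint8_t *new_addr)
{
	memcpy(dev_addr, new_addr, ETH_ALEN);
}

/* Hypothetical stand-in for a hook that copies the full 8-byte address. */
static void set_addr_802154_style(uint8_t *dev_addr, const uint8_t *new_addr)
{
	memcpy(dev_addr, new_addr, IEEE802154_ADDR_LEN);
}

/* Count how many of the 8 address bytes differ from the requested ones. */
static int count_mismatched_bytes(const uint8_t *dev_addr, const uint8_t *wanted)
{
	int i, bad = 0;

	assert(dev_addr != NULL && wanted != NULL);
	for (i = 0; i < IEEE802154_ADDR_LEN; i++)
		if (dev_addr[i] != wanted[i])
			bad++;
	return bad;
}
```

With a zeroed device address and a request whose bytes are all non-zero, the Ethernet-style setter leaves the last two bytes wrong, while the 8-byte setter matches fully.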

* Re: [PATCH net-next 2/2] dccp: remove unnecessary codes in ipv6.c
From: David Miller @ 2012-07-05 10:13 UTC (permalink / raw)
  To: eric.dumazet; +Cc: roy.qing.li, netdev
In-Reply-To: <1341220130.5269.38.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 02 Jul 2012 11:08:50 +0200

> On Mon, 2012-07-02 at 11:19 +0800, roy.qing.li@gmail.com wrote:
>> From: RongQing.Li <roy.qing.li@gmail.com>
>> 
>> opt always equals np->opts, so it is meaningless to define opt, and
>> check if opt does not equal np->opts and then try to free opt.
>> 
>> Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
>> ---
> 
> Acked-by: Eric Dumazet <edumazet@google.com>

Also applied, thank you.

* Re: [PATCH net-next 1/2] ipv6: remove unnecessary codes in tcp_ipv6.c
From: David Miller @ 2012-07-05 10:13 UTC (permalink / raw)
  To: eric.dumazet; +Cc: roy.qing.li, netdev
In-Reply-To: <1341220067.5269.37.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 02 Jul 2012 11:07:47 +0200

> On Mon, 2012-07-02 at 11:18 +0800, roy.qing.li@gmail.com wrote:
>> From: RongQing.Li <roy.qing.li@gmail.com>
>> 
>> opt always equals np->opts, so it is meaningless to define opt, and
>> check if opt does not equal np->opts and then try to free opt.
>> 
>> Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
>> ---
>>  net/ipv6/tcp_ipv6.c |   16 +++-------------
>>  1 files changed, 3 insertions(+), 13 deletions(-)
> 
> Acked-by: Eric Dumazet <edumazet@google.com>

Ok I now understand better why these changes are correct,
applied.

* Re: [PATCH net-next 00/11] default maximal number of RSS queues in mq drivers
From: David Miller @ 2012-07-05 10:07 UTC (permalink / raw)
  To: yuvalmin
  Cc: netdev, eilong, divy, ogerlitz, jdmason, anirban.chakraborty,
	jitendra.kalsaria, ron.mercer, jeffrey.t.kirsher, mason, gallatin,
	sathya.perla, subbu.seetharaman, ajit.khaparde, mcarlson, mchan,
	eric.dumazet, bhutchings
In-Reply-To: <1341148740-7375-1-git-send-email-yuvalmin@broadcom.com>

From: "Yuval Mintz" <yuvalmin@broadcom.com>
Date: Sun, 1 Jul 2012 16:18:49 +0300

> Different vendors support different numbers of RSS queues by default. Today,
> there exists an ethtool API through which users can change the number of
> channels their driver supports; this enables us to pursue the goal of using
> a default number of RSS queues in various multi-queue drivers.
> 
> This patch intends to achieve the above default by upper-limiting the number
> of interrupts multi-queue drivers request (by default, not via the new API)
> in correlation with the number of CPUs on the machine.

Applied to net-next, thanks a lot.
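
As a rough sketch of the idea in the cover letter, a driver could bound its default queue count by both a fixed cap and the CPU count. The cap value (8) and the helper name below are assumptions made for illustration; the cover letter quoted here does not state the actual limit or the helper the series adds.

```c
#include <assert.h>

/* Assumed cap, for illustration only; not taken from the thread. */
#define EXAMPLE_MAX_DEFAULT_RSS_QUEUES 8

/*
 * Hypothetical helper: choose a default number of RSS queues that never
 * exceeds the cap, the number of online CPUs, or what the hardware offers.
 */
static int default_rss_queues(int online_cpus, int hw_max_queues)
{
	int n;

	assert(online_cpus > 0 && hw_max_queues > 0);

	n = online_cpus;
	if (n > EXAMPLE_MAX_DEFAULT_RSS_QUEUES)
		n = EXAMPLE_MAX_DEFAULT_RSS_QUEUES;
	if (n > hw_max_queues)
		n = hw_max_queues;
	return n;
}
```

Under these assumptions a 64-CPU machine would request 8 queues by default instead of 64, and a user who wants more could still raise the count through the ethtool channels API mentioned above.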

* Re: [PATCH net-next v2] ipv4: defer fib_compute_spec_dst() call
From: David Miller @ 2012-07-05 10:03 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1341477009.2583.3406.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 05 Jul 2012 10:30:09 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> ip_options_compile() can avoid calling fib_compute_spec_dst()
> by default, and perform the call only if needed.
> 
> David suggested adding a helper to make the call only once.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks a lot Eric.
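
The "compute only if needed, and at most once" pattern from the patch can be sketched outside the kernel as follows. Here expensive_lookup() is a hypothetical stand-in for fib_compute_spec_dst(), and ADDR_ANY plays the role of htonl(INADDR_ANY) as the "not computed yet" sentinel; the call counter exists only to make the behaviour observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_ANY 0u          /* sentinel: value has not been computed yet */

static int expensive_calls;  /* counts how often the costly lookup runs */

/* Hypothetical stand-in for the costly lookup; returns an arbitrary address. */
static uint32_t expensive_lookup(void)
{
	expensive_calls++;
	return 0xC0A80001u;
}

/* Fill *cached on first use only; later calls see a non-sentinel value. */
static void fill_once(uint32_t *cached)
{
	assert(cached != NULL);
	if (*cached == ADDR_ANY)
		*cached = expensive_lookup();
}

/* Simulates a parser that meets n options, each needing the address. */
static uint32_t parse_options(int n)
{
	uint32_t spec_dst = ADDR_ANY;
	int i;

	for (i = 0; i < n; i++)
		fill_once(&spec_dst);
	return spec_dst;
}
```

One property of using the value itself as the sentinel: if the lookup ever legitimately returned the sentinel, it would be recomputed on the next use. In this sketch that cannot happen because the stand-in returns a fixed non-zero value.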

* Re: TCP transmit performance regression
From: David Miller @ 2012-07-05 10:02 UTC (permalink / raw)
  To: eric.dumazet; +Cc: tom.leiming, netdev
In-Reply-To: <1341481760.2583.3579.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 05 Jul 2012 11:49:20 +0200

> -			ax_skb->data = packet;

That's really scary.

* Re: [PATCH next-next] ppp: change default for incoming protocol filter to NPMODE_DROP
From: David Miller @ 2012-07-05 10:00 UTC (permalink / raw)
  To: bcrl; +Cc: netdev, linux-ppp
In-Reply-To: <20120704013258.GA26225@kvack.org>

From: Benjamin LaHaise <bcrl@kvack.org>
Date: Tue, 3 Jul 2012 21:32:58 -0400

> By default, the ppp_generic code initializes the npmode array that filters
> incoming packets to accept packets for all protocols.  This behaviour is
> incorrect, as it results in packets for protocols that an older version
> of a PPP implementation may not be aware of being incorrectly accepted.
> This behaviour is visible, for example, when sending IPv6 packets across a
> ppp link where pppd has only been configured to use IPv4.
> 
> This change should be safe since pppd will correctly set the protocols it
> negotiates to NPMODE_PASS as the appropriate protocols transition to an Up
> state.
> 
> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>

As far as I can tell, this has been this way for a very long time.

Therefore it is the application's responsibility to adjust the filters
to suit its needs, and we really can't make such adjustments to this
behavior.

* Re: [PATCH 0/19] Disconnect neigh from dst_entry
From: David Miller @ 2012-07-05  9:55 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20120703.024543.1597240990462633709.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Tue, 03 Jul 2012 02:45:43 -0700 (PDT)

> This finally severs neighbour table entries from dst_entry enough that
> we no longer depend upon them outside of the individual protocols.

I'm pushing this now to net-next, with three minor changes.

1) I fubar'd the neigh lookup in the sch_teql changes, I needed to
   add the following code block to __teql_resolve():

        if (dst->dev != dev) {
                struct neighbour *mn;

                mn = __neigh_lookup_errno(n->tbl, n->primary_key, dev);
                neigh_release(n);
                if (IS_ERR(mn))
                        return PTR_ERR(mn);
                n = mn;
        }

2) I adjusted the comment in the neigh backlog handler of
   neigh_update() to read as follows:


	/* Why not just use 'neigh' as-is?  The problem is that
	 * things such as shaper, eql, and sch_teql can end up
	 * using alternative, different, neigh objects to output
	 * the packet in the output path.  So what we need to do
	 * here is re-lookup the top-level neigh in the path so
	 * we can reinject the packet there.
	 */

3) The redirect network event needs to also pass in the path
   destination address so that we can have it available for
   all callers of t3_l2t_get().

* Re: TCP transmit performance regression
From: Eric Dumazet @ 2012-07-05  9:49 UTC (permalink / raw)
  To: Ming Lei; +Cc: Network Development, David Miller
In-Reply-To: <CACVXFVPTXB7t=zwkm+HTgDaF3bA02bzff_52S+UAr51PfpvpCg@mail.gmail.com>

On Thu, 2012-07-05 at 16:42 +0800, Ming Lei wrote:
> On Thu, Jul 5, 2012 at 4:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Thu, 2012-07-05 at 16:27 +0800, Ming Lei wrote:
> >
> >> After some investigation, the problem is caused by enabling
> >> DEBUG_SLAB, so it is not a regression.
> >>
> >
> > Strange, unless your machine is a _very_ slow one maybe ?
> 
> It is a beagle-xm board, and its cpu is ARMv7, 1GHz.

OK, the driver seems buggy; please try the following patch (on both sides
if possible)

 drivers/net/usb/smsc95xx.c |   11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index b1112e7..0a4ae35 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -1084,26 +1084,23 @@ static int smsc95xx_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
 			if (skb->len == size) {
 				if (dev->net->features & NETIF_F_RXCSUM)
 					smsc95xx_rx_csum_offload(skb);
-				skb_trim(skb, skb->len - 4); /* remove fcs */
+				__skb_trim(skb, skb->len - 4); /* remove fcs */
 				skb->truesize = size + sizeof(struct sk_buff);
 
 				return 1;
 			}
 
-			ax_skb = skb_clone(skb, GFP_ATOMIC);
+			ax_skb = netdev_alloc_skb_ip_align(dev->net, size);
 			if (unlikely(!ax_skb)) {
 				netdev_warn(dev->net, "Error allocating skb\n");
 				return 0;
 			}
 
-			ax_skb->len = size;
-			ax_skb->data = packet;
-			skb_set_tail_pointer(ax_skb, size);
+			memcpy(skb_put(ax_skb, size), packet, size);
 
 			if (dev->net->features & NETIF_F_RXCSUM)
 				smsc95xx_rx_csum_offload(ax_skb);
-			skb_trim(ax_skb, ax_skb->len - 4); /* remove fcs */
-			ax_skb->truesize = size + sizeof(struct sk_buff);
+			__skb_trim(ax_skb, ax_skb->len - 4); /* remove fcs */
 
 			usbnet_skb_return(dev, ax_skb);
 		}

* [PATCH v2] cgroup: fix panic in netprio_cgroup
From: Gao feng @ 2012-07-05  9:28 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel, nhorman, tj, lizefan, eric.dumazet,
	Gao feng

We set max_prioidx to the index of the first zero bit in prioidx_map in
get_prioidx().

So when we delete a netprio cgroup with a low index and then add a new
netprio cgroup, max_prioidx is set to that low index.

When we then set net_prio.ifpriomap for a cgroup with a higher index,
write_priomap() calls update_netdev_tables() to allocate memory of size
sizeof(struct netprio_map) + sizeof(u32) * (max_prioidx + 1), so the
array that map->priomap points to has max_prioidx + 1 entries, which is
fewer than we actually need.

Fix this by adding a check in get_prioidx(): only set max_prioidx when
it is lower than the new prioidx.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/core/netprio_cgroup.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 5b8aa2f..aa907ed 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -49,8 +49,9 @@ static int get_prioidx(u32 *prio)
 		return -ENOSPC;
 	}
 	set_bit(prioidx, prioidx_map);
+	if (atomic_read(&max_prioidx) < prioidx)
+		atomic_set(&max_prioidx, prioidx);
 	spin_unlock_irqrestore(&prioidx_map_lock, flags);
-	atomic_set(&max_prioidx, prioidx);
 	*prio = prioidx;
 	return 0;
 }
-- 
1.7.7.6

* Re: [PATCH] cgroup: fix panic in netprio_cgroup
From: Gao feng @ 2012-07-05  9:15 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel, nhorman, tj, lizefan
In-Reply-To: <20120705.015841.2231353345763821829.davem@davemloft.net>

On 2012-07-05 16:58, David Miller wrote:
> 
> Why did you post this twice?

Sorry for the confusion; there is something wrong with my git send-email
config. I sent the first patch but couldn't find it on the mailing list,
so I sent it again.


> 
> Is there a difference between the first patch and the second
> one you posted?  If so, what is that difference?

There is no difference between them.
Sorry again.

Thanks.

* Re: [PATCH] cgroup: fix panic in netprio_cgroup
From: Gao feng @ 2012-07-05  9:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, netdev, linux-kernel, nhorman, tj, lizefan
In-Reply-To: <1341477809.2583.3437.camel@edumazet-glaptop>

On 2012-07-05 16:43, Eric Dumazet wrote:
> On Thu, 2012-07-05 at 16:31 +0800, Gao feng wrote:
>> We set max_prioidx to the index of the first zero bit in prioidx_map in
>> get_prioidx().
>>
>> So when we delete a netprio cgroup with a low index and then add a new
>> netprio cgroup, max_prioidx is set to that low index.
>>
>> When we then set net_prio.ifpriomap for a cgroup with a higher index,
>> write_priomap() calls update_netdev_tables() to allocate memory of size
>> sizeof(struct netprio_map) + sizeof(u32) * (max_prioidx + 1), so the
>> array that map->priomap points to has max_prioidx + 1 entries, which is
>> fewer than we actually need.
>>
>> Fix this by adding a check in get_prioidx(): only set max_prioidx when
>> it is lower than the new prioidx.
>>
>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>> ---
>>  net/core/netprio_cgroup.c |    3 ++-
>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
>> index 5b8aa2f..586f7d9 100644
>> --- a/net/core/netprio_cgroup.c
>> +++ b/net/core/netprio_cgroup.c
>> @@ -50,7 +50,8 @@ static int get_prioidx(u32 *prio)
>>  	}
>>  	set_bit(prioidx, prioidx_map);
>>  	spin_unlock_irqrestore(&prioidx_map_lock, flags);
>> -	atomic_set(&max_prioidx, prioidx);
>> +	if (atomic_read(&max_prioidx) < prioidx)
>> +		atomic_set(&max_prioidx, prioidx);
>>  	*prio = prioidx;
>>  	return 0;
>>  }
> 
> This is still racy.
> 
> Please do this before the 
> spin_unlock_irqrestore(&prioidx_map_lock, flags);
> 

Thanks Eric, you are right.
I will fix it and resend.

* Re: [PATCH] cgroup: fix panic in netprio_cgroup
From: David Miller @ 2012-07-05  8:58 UTC (permalink / raw)
  To: gaofeng; +Cc: netdev, linux-kernel, nhorman, tj, lizefan
In-Reply-To: <1341477102-16988-1-git-send-email-gaofeng@cn.fujitsu.com>


Why did you post this twice?

Is there a difference between the first patch and the second
one you posted?  If so, what is that difference?

* Re: [PATCH] cgroup: fix panic in netprio_cgroup
From: Eric Dumazet @ 2012-07-05  8:43 UTC (permalink / raw)
  To: Gao feng; +Cc: davem, netdev, linux-kernel, nhorman, tj, lizefan
In-Reply-To: <1341477102-16988-1-git-send-email-gaofeng@cn.fujitsu.com>

On Thu, 2012-07-05 at 16:31 +0800, Gao feng wrote:
> We set max_prioidx to the index of the first zero bit in prioidx_map in
> get_prioidx().
> 
> So when we delete a netprio cgroup with a low index and then add a new
> netprio cgroup, max_prioidx is set to that low index.
> 
> When we then set net_prio.ifpriomap for a cgroup with a higher index,
> write_priomap() calls update_netdev_tables() to allocate memory of size
> sizeof(struct netprio_map) + sizeof(u32) * (max_prioidx + 1), so the
> array that map->priomap points to has max_prioidx + 1 entries, which is
> fewer than we actually need.
> 
> Fix this by adding a check in get_prioidx(): only set max_prioidx when
> it is lower than the new prioidx.
> 
> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> ---
>  net/core/netprio_cgroup.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
> index 5b8aa2f..586f7d9 100644
> --- a/net/core/netprio_cgroup.c
> +++ b/net/core/netprio_cgroup.c
> @@ -50,7 +50,8 @@ static int get_prioidx(u32 *prio)
>  	}
>  	set_bit(prioidx, prioidx_map);
>  	spin_unlock_irqrestore(&prioidx_map_lock, flags);
> -	atomic_set(&max_prioidx, prioidx);
> +	if (atomic_read(&max_prioidx) < prioidx)
> +		atomic_set(&max_prioidx, prioidx);
>  	*prio = prioidx;
>  	return 0;
>  }

This is still racy.

Please do this before the 
spin_unlock_irqrestore(&prioidx_map_lock, flags);
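
To make the sizing problem concrete, here is a small userspace model of a lowest-free-index allocator. It is a sketch under stated assumptions: struct id_pool, pool_get(), pool_put() and table_len() are hypothetical names, the bool array stands in for prioidx_map, and there is no real locking. The comment marks where the lock-protected region would be in the kernel, which is why the maximum should be updated before the unlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_IDS 32

struct id_pool {
	bool used[MAX_IDS];  /* stand-in for prioidx_map */
	int max_id;          /* stand-in for max_prioidx */
};

/*
 * Hand out the lowest free index, or -1 if the pool is full.
 * With monotonic_max == false the maximum is overwritten on every
 * allocation (the old behaviour); with true it only ever grows.
 * In the kernel, the bitmap update and the maximum update would both
 * sit inside the same spinlock-protected region.
 */
static int pool_get(struct id_pool *p, bool monotonic_max)
{
	int i;

	assert(p != NULL);
	for (i = 0; i < MAX_IDS; i++) {
		if (!p->used[i]) {
			p->used[i] = true;
			if (!monotonic_max || p->max_id < i)
				p->max_id = i;
			return i;
		}
	}
	return -1;
}

/* Release an index so that a later pool_get() can reuse it. */
static void pool_put(struct id_pool *p, int id)
{
	assert(p != NULL && id >= 0 && id < MAX_IDS);
	p->used[id] = false;
}

/* Number of entries a per-device table sized from max_id would hold. */
static int table_len(const struct id_pool *p)
{
	assert(p != NULL);
	return p->max_id + 1;
}
```

Allocating indices 0, 1 and 2, releasing 0 and allocating again returns 0. With the overwrite behaviour the table length then drops to 1 although index 2 is still in use; with the monotonic update it stays at 3.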

* Re: TCP transmit performance regression
From: Ming Lei @ 2012-07-05  8:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Network Development, David Miller
In-Reply-To: <1341477192.2583.3415.camel@edumazet-glaptop>

[-- Attachment #1: Type: text/plain, Size: 759 bytes --]

On Thu, Jul 5, 2012 at 4:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2012-07-05 at 16:27 +0800, Ming Lei wrote:
>
>> After some investigation, the problem is caused by enabling
>> DEBUG_SLAB, so it is not a regression.
>>
>
> Strange, unless your machine is a _very_ slow one maybe ?

It is a beagle-xm board, and its cpu is ARMv7, 1GHz.

>
>>
>> No improvement, it seems. I still don't know why the window size becomes so
>> small even in the good case (DEBUG_SLAB disabled); the small window size
>> causes almost every TCP data packet to be acked.
>
> You are probably missing the fact that window scaling is enabled.
>
> If you don't post a pcap, I am afraid we can't really help.

See attachment for the pcap trace.


Thanks,
-- 
Ming Lei

[-- Attachment #2: tcp.pcap --]
[-- Type: application/octet-stream, Size: 97922 bytes --]

* Re: TCP transmit performance regression
From: Eric Dumazet @ 2012-07-05  8:33 UTC (permalink / raw)
  To: Ming Lei; +Cc: Network Development, David Miller
In-Reply-To: <CACVXFVNxcdEYd-KmkUe9=8+x_9s-ZVuoM=FfZ=QXa7w_qRiTnw@mail.gmail.com>

On Thu, 2012-07-05 at 16:27 +0800, Ming Lei wrote:

> After some investigation, the problem is caused by enabling
> DEBUG_SLAB, so it is not a regression.
> 

Strange, unless your machine is a _very_ slow one maybe ?

> 
> No improvement, it seems. I still don't know why the window size becomes so
> small even in the good case (DEBUG_SLAB disabled); the small window size
> causes almost every TCP data packet to be acked.

You are probably missing the fact that window scaling is enabled.

If you don't post a pcap, I am afraid we can't really help.
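
For reference, the effect of window scaling on the raw header value can be sketched like this. The shift count is negotiated in the SYN exchange and is not given anywhere in this thread, so the value 7 used in the note after the code is purely an assumption for illustration; only the limit of 14 is taken from the TCP window scale option itself.

```c
#include <assert.h>
#include <stdint.h>

/* The window scale option limits the shift count to 14. */
#define MAX_WSCALE 14

/*
 * Convert the 16-bit window field from a TCP header into the window
 * actually being advertised, given the sender's negotiated shift count.
 */
static uint32_t effective_window(uint16_t raw_window, unsigned int wscale)
{
	assert(wscale <= MAX_WSCALE);
	return (uint32_t)raw_window << wscale;
}
```

With an assumed shift of 7, a raw window field of 229 would correspond to 229 << 7 = 29312 bytes rather than 229 bytes. A trace that does not include the SYN packets cannot show which shift applies.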

* [PATCH] cgroup: fix panic in netprio_cgroup
From: Gao feng @ 2012-07-05  8:31 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel, nhorman, tj, lizefan, Gao feng

We set max_prioidx to the index of the first zero bit in prioidx_map in
get_prioidx().

So when we delete a netprio cgroup with a low index and then add a new
netprio cgroup, max_prioidx is set to that low index.

When we then set net_prio.ifpriomap for a cgroup with a higher index,
write_priomap() calls update_netdev_tables() to allocate memory of size
sizeof(struct netprio_map) + sizeof(u32) * (max_prioidx + 1), so the
array that map->priomap points to has max_prioidx + 1 entries, which is
fewer than we actually need.

Fix this by adding a check in get_prioidx(): only set max_prioidx when
it is lower than the new prioidx.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/core/netprio_cgroup.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 5b8aa2f..586f7d9 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -50,7 +50,8 @@ static int get_prioidx(u32 *prio)
 	}
 	set_bit(prioidx, prioidx_map);
 	spin_unlock_irqrestore(&prioidx_map_lock, flags);
-	atomic_set(&max_prioidx, prioidx);
+	if (atomic_read(&max_prioidx) < prioidx)
+		atomic_set(&max_prioidx, prioidx);
 	*prio = prioidx;
 	return 0;
 }
-- 
1.7.7.6

* [PATCH net-next v2] ipv4: defer fib_compute_spec_dst() call
From: Eric Dumazet @ 2012-07-05  8:30 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

ip_options_compile() can avoid calling fib_compute_spec_dst()
by default, and perform the call only if needed.

David suggested adding a helper to make the call only once.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/ip_options.c |   15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index 1f02251..a19d647 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -242,6 +242,15 @@ void ip_options_fragment(struct sk_buff *skb)
 	opt->ts_needtime = 0;
 }
 
+/* helper used by ip_options_compile() to call fib_compute_spec_dst()
+ * at most one time.
+ */
+static void spec_dst_fill(__be32 *spec_dst, struct sk_buff *skb)
+{
+	if (*spec_dst == htonl(INADDR_ANY))
+		*spec_dst = fib_compute_spec_dst(skb);
+}
+
 /*
  * Verify options and fill pointers in struct options.
  * Caller should clear *opt, and set opt->data.
@@ -251,7 +260,7 @@ void ip_options_fragment(struct sk_buff *skb)
 int ip_options_compile(struct net *net,
 		       struct ip_options *opt, struct sk_buff *skb)
 {
-	__be32 spec_dst = (__force __be32) 0;
+	__be32 spec_dst = htonl(INADDR_ANY);
 	unsigned char *pp_ptr = NULL;
 	struct rtable *rt = NULL;
 	unsigned char *optptr;
@@ -260,8 +269,6 @@ int ip_options_compile(struct net *net,
 
 	if (skb != NULL) {
 		rt = skb_rtable(skb);
-		if (rt)
-			spec_dst = fib_compute_spec_dst(skb);
 		optptr = (unsigned char *)&(ip_hdr(skb)[1]);
 	} else
 		optptr = opt->__data;
@@ -334,6 +341,7 @@ int ip_options_compile(struct net *net,
 					goto error;
 				}
 				if (rt) {
+					spec_dst_fill(&spec_dst, skb);
 					memcpy(&optptr[optptr[2]-1], &spec_dst, 4);
 					opt->is_changed = 1;
 				}
@@ -376,6 +384,7 @@ int ip_options_compile(struct net *net,
 					}
 					opt->ts = optptr - iph;
 					if (rt)  {
+						spec_dst_fill(&spec_dst, skb);
 						memcpy(&optptr[optptr[2]-1], &spec_dst, 4);
 						timeptr = &optptr[optptr[2]+3];
 					}

* Re: TCP transmit performance regression
From: Ming Lei @ 2012-07-05  8:27 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Network Development, David Miller
In-Reply-To: <1341474192.2583.3299.camel@edumazet-glaptop>

On Thu, Jul 5, 2012 at 3:43 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2012-07-05 at 09:45 +0800, Ming Lei wrote:
>> Hi,
>>
>> I observed that on both 3.5-rc5 and 3.5-rc5-next, TCP transmit performance
>> degrades a lot, see my below simple test:
>>
>> 1, test box
>> NIC: 100M USB, normally can reach > 90Mbits/sec
>>
>
> What was the last "OK" kernel version ?

After some investigation, the problem is caused by enabling
DEBUG_SLAB, so it is not a regression.

>
> What NIC driver is it ?
>
>> 2, run below command on the box:
>> [root@root]#iperf -c 192.168.0.103 -w 131072 -t 10
>> ------------------------------------------------------------
>> Client connecting to 192.168.0.103, TCP port 5001
>> TCP window size:   256 KByte (WARNING: requested   128 KByte)
>> ------------------------------------------------------------
>> [  3] local 192.168.0.108 port 59315 connected with 192.168.0.103 port 5001
>> [ ID] Interval       Transfer     Bandwidth
>> [  3]  0.0-10.0 sec  40.4 MBytes  33.9 Mbits/sec
>>
>> note: 192.168.0.103 is another production machine running 'iperf -s -w 131072'
>>
>> 3, from traffic captured in wireshark, the window size of most of tcp packets
>> from the test box to 192.168.0.103 is set as 229, looks very weird and should
>> be the cause of performance regression.
>>
>
> Packets sent to 192.168.0.103 announce the window suitable for packets
> in the other direction, so they are not relevant to your problem.
>
> Could you do
>
> # tcpdump -i eth0 -s 100 -c 1000 -w tcp.pcap host 192.168.0.103 &
> # iperf -c 192.168.0.103 -w 131072 -t 10
>
> and post the tcp.pcap file ?
>
> By the way, if you remove -w 131072 (on both sides), I guess throughput
> will increase.

No improvement, it seems. I still don't know why the window size becomes so
small even in the good case (DEBUG_SLAB disabled); the small window size
causes almost every TCP data packet to be acked.


Thanks,
-- 
Ming Lei

* [patch] [AX.25]: small cleanup in ax25_addr_parse()
From: Dan Carpenter @ 2012-07-05  8:27 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: David S. Miller, linux-hams, netdev, kernel-janitors

The comments were wrong here because "AX25_MAX_DIGIS" is 8 but the
comments say 6.  Also I've changed the "7" to "AX25_ADDR_LEN".

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

diff --git a/net/ax25/ax25_addr.c b/net/ax25/ax25_addr.c
index 9162409..e7c9b0e 100644
--- a/net/ax25/ax25_addr.c
+++ b/net/ax25/ax25_addr.c
@@ -189,8 +189,10 @@ const unsigned char *ax25_addr_parse(const unsigned char *buf, int len,
 	digi->ndigi      = 0;
 
 	while (!(buf[-1] & AX25_EBIT)) {
-		if (d >= AX25_MAX_DIGIS)  return NULL;	/* Max of 6 digis */
-		if (len < 7) return NULL;	/* Short packet */
+		if (d >= AX25_MAX_DIGIS)
+			return NULL;
+		if (len < AX25_ADDR_LEN)
+			return NULL;
 
 		memcpy(&digi->calls[d], buf, AX25_ADDR_LEN);
 		digi->ndigi = d + 1;

* Re: [PATCH] ipv4: Create and use fib_compute_spec_dst() helper.
From: Eric Dumazet @ 2012-07-05  8:10 UTC (permalink / raw)
  To: David Miller; +Cc: ja, netdev
In-Reply-To: <20120705.005940.1078811938047681715.davem@davemloft.net>

On Thu, 2012-07-05 at 00:59 -0700, David Miller wrote:

> Yes, this is a great idea.  Actually in some obscure cases your
> change can cause us to compute it more than once I think.
> 
> I'd suggest we do something like create a helper function above this
> code in ip_options.c that checks whether spec_dst is INADDR_ANY or
> not, to guard computing it multiple times.
> 
> Could you put together a quick patch like that?

Sure I'll do that.

* Re: AF_BUS socket address family
From: Linus Walleij @ 2012-07-05  7:59 UTC (permalink / raw)
  To: Vincent Sanders
  Cc: netdev, linux-kernel, David S. Miller, Arve Hjønnevåg,
	Daniel Walker, John Stultz, Anton Vorontsov, Greg Kroah-Hartman
In-Reply-To: <1340988354-26981-1-git-send-email-vincent.sanders@collabora.co.uk>

2012/6/29 Vincent Sanders <vincent.sanders@collabora.co.uk>:

> AF_BUS is a message oriented inter process communication system.

We have a huge and important in-kernel IPC message passer
in drivers/staging/android/binder.c

It's deployed in some 400 million devices according to the latest reports.
John Stultz & Anton Vorontsov are trying to look after these Android
drivers a bit...

I and others discussed this in the past with the Android folks. Dianne
makes an excellent summary of how it works here:
https://lkml.org/lkml/2009/6/25/3

If we could all be convinced that this thing also fulfills the needs
of what binder does, this is a pretty solid case for it too. I can
sure see that some of the shortcuts that Android is taking with
binder try to address the same issue of high-speed IPC loopholes
through the kernel and some kind of security model.

Whether Android would actually use it (or wrap it) is a totally
different question, but what I think we need to know is whether it
*could*. And staging code has to move forward, maybe this
is the direction it should move?

Yours,
Linus Walleij
