* Re: TCP transmit performance regression
From: David Miller @ 2012-07-05 10:02 UTC (permalink / raw)
To: eric.dumazet; +Cc: tom.leiming, netdev
In-Reply-To: <1341481760.2583.3579.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 05 Jul 2012 11:49:20 +0200
> - ax_skb->data = packet;
That's really scary.
^ permalink raw reply
* Re: [PATCH next-next] ppp: change default for incoming protocol filter to NPMODE_DROP
From: David Miller @ 2012-07-05 10:00 UTC (permalink / raw)
To: bcrl; +Cc: netdev, linux-ppp
In-Reply-To: <20120704013258.GA26225@kvack.org>
From: Benjamin LaHaise <bcrl@kvack.org>
Date: Tue, 3 Jul 2012 21:32:58 -0400
> By default, the ppp_generic code initializes the npmode array that filters
> incoming packet to accept packets for all protocols. This behaviour is
> incorrect, as it results in packets for protocols that an older version
> of a PPP implementation may not be aware of to be incorrectly accepted.
> This behaviour is visible, for example, when sending IPv6 packets across a
> ppp link where pppd has only been configured to use IPv4.
>
> This change should be safe since pppd will correctly set the protocols it
> negotiates to NPMODE_PASS as the appropriate protocols transition to an Up
> state.
>
> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
As far as I can tell, this has been this way for a very long time.
Therefore it is the application's responsibility to adjust the filters
to suit their needs and we really can't make such adjustments to this
behavior.
^ permalink raw reply
* Re: [PATCH 0/19] Disconnect neigh from dst_entry
From: David Miller @ 2012-07-05 9:55 UTC (permalink / raw)
To: netdev
In-Reply-To: <20120703.024543.1597240990462633709.davem@davemloft.net>
From: David Miller <davem@davemloft.net>
Date: Tue, 03 Jul 2012 02:45:43 -0700 (PDT)
> This finally severs neighbour table entries from dst_entry enough that
> we no longer depend upon them outside of the individual protocols.
I'm pushing this now to net-next, with three minor changes.
1) I fubar'd the neigh lookup in the sch_teql changes, I needed to
add the following code block to __teql_resolve():
	if (dst->dev != dev) {
		struct neighbour *mn;

		mn = __neigh_lookup_errno(n->tbl, n->primary_key, dev);
		neigh_release(n);
		if (IS_ERR(mn))
			return PTR_ERR(mn);
		n = mn;
	}
2) I adjusted the comment in the neigh backlog handler of
neigh_update() to read as follows:
/* Why not just use 'neigh' as-is? The problem is that
* things such as shaper, eql, and sch_teql can end up
* using alternative, different, neigh objects to output
* the packet in the output path. So what we need to do
* here is re-lookup the top-level neigh in the path so
* we can reinject the packet there.
*/
3) The redirect network event needs to also pass in the path
destination address so that we can have it available for
all callers of t3_l2t_get().
^ permalink raw reply
* Re: TCP transmit performance regression
From: Eric Dumazet @ 2012-07-05 9:49 UTC (permalink / raw)
To: Ming Lei; +Cc: Network Development, David Miller
In-Reply-To: <CACVXFVPTXB7t=zwkm+HTgDaF3bA02bzff_52S+UAr51PfpvpCg@mail.gmail.com>
On Thu, 2012-07-05 at 16:42 +0800, Ming Lei wrote:
> On Thu, Jul 5, 2012 at 4:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Thu, 2012-07-05 at 16:27 +0800, Ming Lei wrote:
> >
> >> After some investigation, the problem is caused by enabling
> >> DEBUG_SLAB, so it is not a regression.
> >>
> >
> > Strange, unless your machine is a _very_ slow one maybe ?
>
> It is a beagle-xm board, and its cpu is ARMv7, 1GHz.
OK, the driver seems buggy; please try the following patch (on both sides if
possible).
drivers/net/usb/smsc95xx.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index b1112e7..0a4ae35 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -1084,26 +1084,23 @@ static int smsc95xx_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
if (skb->len == size) {
if (dev->net->features & NETIF_F_RXCSUM)
smsc95xx_rx_csum_offload(skb);
- skb_trim(skb, skb->len - 4); /* remove fcs */
+ __skb_trim(skb, skb->len - 4); /* remove fcs */
skb->truesize = size + sizeof(struct sk_buff);
return 1;
}
- ax_skb = skb_clone(skb, GFP_ATOMIC);
+ ax_skb = netdev_alloc_skb_ip_align(dev->net, size);
if (unlikely(!ax_skb)) {
netdev_warn(dev->net, "Error allocating skb\n");
return 0;
}
- ax_skb->len = size;
- ax_skb->data = packet;
- skb_set_tail_pointer(ax_skb, size);
+ memcpy(skb_put(ax_skb, size), packet, size);
if (dev->net->features & NETIF_F_RXCSUM)
smsc95xx_rx_csum_offload(ax_skb);
- skb_trim(ax_skb, ax_skb->len - 4); /* remove fcs */
- ax_skb->truesize = size + sizeof(struct sk_buff);
+ __skb_trim(ax_skb, ax_skb->len - 4); /* remove fcs */
usbnet_skb_return(dev, ax_skb);
}
^ permalink raw reply related
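The patch above stops re-pointing a cloned skb at the middle of a shared buffer and instead copies each packet into its own allocation. A minimal userspace sketch of that copy-then-trim idea follows. It assumes a hypothetical struct frame and hypothetical copy_frame()/free_frame() helpers (none of these exist in the driver); only the 4-byte FCS trim is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FCS_LEN 4	/* trailing frame check sequence, as trimmed in the patch */

/* Hypothetical stand-in for an skb: it owns its own storage. */
struct frame {
	size_t len;
	unsigned char *data;
};

/*
 * Copy one packet of `size` bytes out of a shared receive buffer into
 * freshly allocated storage, then drop the trailing FCS. Returns NULL on
 * allocation failure or if the packet is too short to hold an FCS.
 */
struct frame *copy_frame(const unsigned char *packet, size_t size)
{
	struct frame *f;

	assert(packet != NULL);
	if (size < FCS_LEN)
		return NULL;

	f = malloc(sizeof(*f));
	if (!f)
		return NULL;

	f->data = malloc(size);
	if (!f->data) {
		free(f);
		return NULL;
	}

	memcpy(f->data, packet, size);	/* private copy, no aliasing */
	f->len = size - FCS_LEN;	/* remove fcs */
	return f;
}

/* Release a frame returned by copy_frame(); NULL is accepted. */
void free_frame(struct frame *f)
{
	if (f) {
		free(f->data);
		free(f);
	}
}
```

Because the copy owns its storage, later writes to the receive buffer cannot change a frame that was already handed up, which is the property the hand-edited clone did not have.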
* [PATCH v2] cgroup: fix panic in netprio_cgroup
From: Gao feng @ 2012-07-05 9:28 UTC (permalink / raw)
To: davem; +Cc: netdev, linux-kernel, nhorman, tj, lizefan, eric.dumazet,
Gao feng
We set max_prioidx to the first zero bit index of prioidx_map in
get_prioidx().
So when we delete a low-index netprio cgroup and then add a new
netprio cgroup, max_prioidx is set to that low index.
When we then set the high-index cgroup's net_prio.ifpriomap,
write_priomap() calls update_netdev_tables() to allocate memory whose
size is sizeof(struct netprio_map) + sizeof(u32) * (max_prioidx + 1),
so the array that map->priomap points to has max_prioidx + 1 entries,
which is fewer than we actually need.
Fix this by adding a check in get_prioidx(): only set max_prioidx when
it is lower than the new prioidx.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
net/core/netprio_cgroup.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 5b8aa2f..aa907ed 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -49,8 +49,9 @@ static int get_prioidx(u32 *prio)
 		return -ENOSPC;
 	}
 	set_bit(prioidx, prioidx_map);
+	if (atomic_read(&max_prioidx) < prioidx)
+		atomic_set(&max_prioidx, prioidx);
 	spin_unlock_irqrestore(&prioidx_map_lock, flags);
-	atomic_set(&max_prioidx, prioidx);
 	*prio = prioidx;
 	return 0;
 }
--
1.7.7.6
^ permalink raw reply related
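The sizing problem described in this patch can be reproduced in a few lines of userspace C. The sketch below is a simplified, single-threaded model, not the kernel code: the 32-entry bitmap, put_prioidx() and priomap_entries() are assumptions made for illustration, and the real code additionally holds prioidx_map_lock around the update.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IDX 32	/* assumed bitmap width for this model */

static unsigned int prioidx_map;	/* bit i set => index i is in use */
static unsigned int max_prioidx;	/* highest index handed out so far */

/* Allocate the lowest free index; returns -1 when the map is full. */
int get_prioidx(void)
{
	unsigned int i;

	for (i = 0; i < MAX_IDX; i++) {
		if (!(prioidx_map & (1u << i))) {
			prioidx_map |= 1u << i;
			/* Only ever raise the maximum, never lower it. */
			if (max_prioidx < i)
				max_prioidx = i;
			return (int)i;
		}
	}
	return -1;
}

/* Release an index so that a later get_prioidx() can reuse it. */
void put_prioidx(unsigned int idx)
{
	assert(idx < MAX_IDX);
	prioidx_map &= ~(1u << idx);
}

/* Number of priomap entries a table sized from max_prioidx would hold. */
size_t priomap_entries(void)
{
	return (size_t)max_prioidx + 1;
}
```

Allocating indices 0, 1 and 2, freeing 0 and allocating again returns 0, yet priomap_entries() still reports 3. With an unconditional assignment it would have dropped to 1 while index 2 was still live.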
* Re: [PATCH] cgroup: fix panic in netprio_cgroup
From: Gao feng @ 2012-07-05 9:15 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-kernel, nhorman, tj, lizefan
In-Reply-To: <20120705.015841.2231353345763821829.davem@davemloft.net>
On 2012-07-05 16:58, David Miller wrote:
>
> Why did you post this twice?
Sorry for the confusion; there is something wrong with my git send-email config.
I sent the first patch but could not find it on the mailing list, so I
sent it again.
>
> Is there a difference between the first patch and the second
> one you posted? If so, what is that difference?
There is no difference between them.
Sorry again.
Thanks.
^ permalink raw reply
* Re: [PATCH] cgroup: fix panic in netprio_cgroup
From: Gao feng @ 2012-07-05 9:10 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem, netdev, linux-kernel, nhorman, tj, lizefan
In-Reply-To: <1341477809.2583.3437.camel@edumazet-glaptop>
On 2012-07-05 16:43, Eric Dumazet wrote:
> On Thu, 2012-07-05 at 16:31 +0800, Gao feng wrote:
>> we set max_prioidx to the first zero bit index of prioidx_map in
>> function get_prioidx.
>>
>> So when we delete the low index netprio cgroup and adding a new
>> netprio cgroup again,the max_prioidx will be set to the low index.
>>
>> when we set the high index cgroup's net_prio.ifpriomap,the function
>> write_priomap will call update_netdev_tables to alloc memory which
>> size is sizeof(struct netprio_map) + sizeof(u32) * (max_prioidx + 1),
>> so the size of array that map->priomap point to is max_prioidx +1,
>> which is low than what we actually need.
>>
>> fix this by adding check in get_prioidx,only set max_prioidx when
>> max_prioidx low than the new prioidx.
>>
>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>> ---
>> net/core/netprio_cgroup.c | 3 ++-
>> 1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
>> index 5b8aa2f..586f7d9 100644
>> --- a/net/core/netprio_cgroup.c
>> +++ b/net/core/netprio_cgroup.c
>> @@ -50,7 +50,8 @@ static int get_prioidx(u32 *prio)
>> }
>> set_bit(prioidx, prioidx_map);
>> spin_unlock_irqrestore(&prioidx_map_lock, flags);
>> - atomic_set(&max_prioidx, prioidx);
>> + if (atomic_read(&max_prioidx) < prioidx)
>> + atomic_set(&max_prioidx, prioidx);
>> *prio = prioidx;
>> return 0;
>> }
>
> This is still racy.
>
> Please do this before the
> spin_unlock_irqrestore(&prioidx_map_lock, flags);
>
Thanks Eric, you are right.
I will fix it and resend.
^ permalink raw reply
* Re: [PATCH] cgroup: fix panic in netprio_cgroup
From: David Miller @ 2012-07-05 8:58 UTC (permalink / raw)
To: gaofeng; +Cc: netdev, linux-kernel, nhorman, tj, lizefan
In-Reply-To: <1341477102-16988-1-git-send-email-gaofeng@cn.fujitsu.com>
Why did you post this twice?
Is there a difference between the first patch and the second
one you posted? If so, what is that difference?
^ permalink raw reply
* Re: [PATCH] cgroup: fix panic in netprio_cgroup
From: Eric Dumazet @ 2012-07-05 8:43 UTC (permalink / raw)
To: Gao feng; +Cc: davem, netdev, linux-kernel, nhorman, tj, lizefan
In-Reply-To: <1341477102-16988-1-git-send-email-gaofeng@cn.fujitsu.com>
On Thu, 2012-07-05 at 16:31 +0800, Gao feng wrote:
> we set max_prioidx to the first zero bit index of prioidx_map in
> function get_prioidx.
>
> So when we delete the low index netprio cgroup and adding a new
> netprio cgroup again,the max_prioidx will be set to the low index.
>
> when we set the high index cgroup's net_prio.ifpriomap,the function
> write_priomap will call update_netdev_tables to alloc memory which
> size is sizeof(struct netprio_map) + sizeof(u32) * (max_prioidx + 1),
> so the size of array that map->priomap point to is max_prioidx +1,
> which is low than what we actually need.
>
> fix this by adding check in get_prioidx,only set max_prioidx when
> max_prioidx low than the new prioidx.
>
> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> ---
> net/core/netprio_cgroup.c | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
> index 5b8aa2f..586f7d9 100644
> --- a/net/core/netprio_cgroup.c
> +++ b/net/core/netprio_cgroup.c
> @@ -50,7 +50,8 @@ static int get_prioidx(u32 *prio)
> }
> set_bit(prioidx, prioidx_map);
> spin_unlock_irqrestore(&prioidx_map_lock, flags);
> - atomic_set(&max_prioidx, prioidx);
> + if (atomic_read(&max_prioidx) < prioidx)
> + atomic_set(&max_prioidx, prioidx);
> *prio = prioidx;
> return 0;
> }
This is still racy.
Please do this before the
spin_unlock_irqrestore(&prioidx_map_lock, flags);
^ permalink raw reply
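Why the check-then-set is still racy outside the lock can be shown by replaying one bad interleaving by hand. The sketch below is a deterministic, single-threaded model: struct caller, step_read(), step_write() and the two replay functions are hypothetical names, and the "locked" variant simply runs each caller's read and write back to back, which is what holding prioidx_map_lock would enforce.

```c
#include <assert.h>
#include <stddef.h>

/* One caller's private state between its read and its write. */
struct caller {
	unsigned int prioidx;	/* index this caller just allocated */
	unsigned int seen_max;	/* value of max_prioidx it read */
};

static unsigned int max_prioidx;

/* Step 1 of the update: sample the shared maximum. */
void step_read(struct caller *c)
{
	assert(c != NULL);
	c->seen_max = max_prioidx;
}

/* Step 2 of the update: act on the (possibly stale) sample. */
void step_write(struct caller *c)
{
	assert(c != NULL);
	if (c->seen_max < c->prioidx)
		max_prioidx = c->prioidx;
}

/* Both callers read before either writes: the unlocked window. */
unsigned int replay_unlocked(unsigned int start, unsigned int idx_a,
			     unsigned int idx_b)
{
	struct caller a = { idx_a, 0 }, b = { idx_b, 0 };

	max_prioidx = start;
	step_read(&a);
	step_read(&b);
	step_write(&b);
	step_write(&a);		/* overwrites b's larger value */
	return max_prioidx;
}

/* Each caller's read and write are kept together, as under the lock. */
unsigned int replay_locked(unsigned int start, unsigned int idx_a,
			   unsigned int idx_b)
{
	struct caller a = { idx_a, 0 }, b = { idx_b, 0 };

	max_prioidx = start;
	step_read(&a);
	step_write(&a);
	step_read(&b);
	step_write(&b);
	return max_prioidx;
}
```

Starting from a maximum of 3 with callers holding indices 5 and 7, the unlocked replay ends at 5 even though index 7 is in use, while the locked replay ends at 7 in either order.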
* Re: TCP transmit performance regression
From: Ming Lei @ 2012-07-05 8:42 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Network Development, David Miller
In-Reply-To: <1341477192.2583.3415.camel@edumazet-glaptop>
[-- Attachment #1: Type: text/plain, Size: 759 bytes --]
On Thu, Jul 5, 2012 at 4:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2012-07-05 at 16:27 +0800, Ming Lei wrote:
>
>> After some investigation, the problem is caused by enabling
>> DEBUG_SLAB, so it is not a regression.
>>
>
> Strange, unless your machine is a _very_ slow one maybe ?
It is a beagle-xm board, and its cpu is ARMv7, 1GHz.
>
>>
>> Looks no improvement. I still don't know why the window size becomes so
>> small even in good situation(disabling DEBUG_SLAB), and the small
>> window size will cause almost every tcp data packet acked.
>
> You are probably missing the fact that window scaling is enabled.
>
> If you dont post a pcap, I am afraid we cant really help.
See attachment for the pcap trace.
Thanks,
--
Ming Lei
[-- Attachment #2: tcp.pcap --]
[-- Type: application/octet-stream, Size: 97922 bytes --]
^ permalink raw reply
* Re: TCP transmit performance regression
From: Eric Dumazet @ 2012-07-05 8:33 UTC (permalink / raw)
To: Ming Lei; +Cc: Network Development, David Miller
In-Reply-To: <CACVXFVNxcdEYd-KmkUe9=8+x_9s-ZVuoM=FfZ=QXa7w_qRiTnw@mail.gmail.com>
On Thu, 2012-07-05 at 16:27 +0800, Ming Lei wrote:
> After some investigation, the problem is caused by enabling
> DEBUG_SLAB, so it is not a regression.
>
Strange, unless your machine is a _very_ slow one maybe ?
>
> Looks no improvement. I still don't know why the window size becomes so
> small even in good situation(disabling DEBUG_SLAB), and the small
> window size will cause almost every tcp data packet acked.
You are probably missing the fact that window scaling is enabled.
If you dont post a pcap, I am afraid we cant really help.
^ permalink raw reply
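The point about window scaling can be made concrete: the 16-bit window field in a TCP header is shifted left by the scale factor negotiated in the SYN exchange before it is used. The sketch below is illustrative only. effective_window() is a hypothetical helper, and the shift of 7 used in the note after it is an assumed example value, since the thread does not state the negotiated scale.

```c
#include <assert.h>
#include <stdint.h>

#define TCP_MAX_WSCALE 14	/* largest shift the window scale option allows */

/*
 * Convert the raw 16-bit window field from a TCP header into the window
 * in bytes, given the shift count negotiated during the handshake.
 */
uint32_t effective_window(uint16_t raw_window, unsigned int wscale)
{
	assert(wscale <= TCP_MAX_WSCALE);
	return (uint32_t)raw_window << wscale;
}
```

Under the assumed shift of 7, a raw field of 229 corresponds to 29312 bytes rather than 229 bytes, so a small number in a capture does not by itself indicate a small window.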
* [PATCH] cgroup: fix panic in netprio_cgroup
From: Gao feng @ 2012-07-05 8:31 UTC (permalink / raw)
To: davem; +Cc: netdev, linux-kernel, nhorman, tj, lizefan, Gao feng
We set max_prioidx to the first zero bit index of prioidx_map in
get_prioidx().
So when we delete a low-index netprio cgroup and then add a new
netprio cgroup, max_prioidx is set to that low index.
When we then set the high-index cgroup's net_prio.ifpriomap,
write_priomap() calls update_netdev_tables() to allocate memory whose
size is sizeof(struct netprio_map) + sizeof(u32) * (max_prioidx + 1),
so the array that map->priomap points to has max_prioidx + 1 entries,
which is fewer than we actually need.
Fix this by adding a check in get_prioidx(): only set max_prioidx when
it is lower than the new prioidx.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
net/core/netprio_cgroup.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 5b8aa2f..586f7d9 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -50,7 +50,8 @@ static int get_prioidx(u32 *prio)
}
set_bit(prioidx, prioidx_map);
spin_unlock_irqrestore(&prioidx_map_lock, flags);
- atomic_set(&max_prioidx, prioidx);
+ if (atomic_read(&max_prioidx) < prioidx)
+ atomic_set(&max_prioidx, prioidx);
*prio = prioidx;
return 0;
}
--
1.7.7.6
^ permalink raw reply related
* [PATCH net-next v2] ipv4: defer fib_compute_spec_dst() call
From: Eric Dumazet @ 2012-07-05 8:30 UTC (permalink / raw)
To: David Miller; +Cc: netdev
From: Eric Dumazet <edumazet@google.com>
ip_options_compile() can avoid calling fib_compute_spec_dst()
by default, and perform the call only if needed.
David suggested adding a helper so that the call is made only once.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/ip_options.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index 1f02251..a19d647 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -242,6 +242,15 @@ void ip_options_fragment(struct sk_buff *skb)
 	opt->ts_needtime = 0;
 }
 
+/* helper used by ip_options_compile() to call fib_compute_spec_dst()
+ * at most one time.
+ */
+static void spec_dst_fill(__be32 *spec_dst, struct sk_buff *skb)
+{
+	if (*spec_dst == htonl(INADDR_ANY))
+		*spec_dst = fib_compute_spec_dst(skb);
+}
+
 /*
  * Verify options and fill pointers in struct options.
  * Caller should clear *opt, and set opt->data.
@@ -251,7 +260,7 @@ void ip_options_fragment(struct sk_buff *skb)
int ip_options_compile(struct net *net,
struct ip_options *opt, struct sk_buff *skb)
{
- __be32 spec_dst = (__force __be32) 0;
+ __be32 spec_dst = htonl(INADDR_ANY);
unsigned char *pp_ptr = NULL;
struct rtable *rt = NULL;
unsigned char *optptr;
@@ -260,8 +269,6 @@ int ip_options_compile(struct net *net,
if (skb != NULL) {
rt = skb_rtable(skb);
- if (rt)
- spec_dst = fib_compute_spec_dst(skb);
optptr = (unsigned char *)&(ip_hdr(skb)[1]);
} else
optptr = opt->__data;
@@ -334,6 +341,7 @@ int ip_options_compile(struct net *net,
goto error;
}
if (rt) {
+ spec_dst_fill(&spec_dst, skb);
memcpy(&optptr[optptr[2]-1], &spec_dst, 4);
opt->is_changed = 1;
}
@@ -376,6 +384,7 @@ int ip_options_compile(struct net *net,
}
opt->ts = optptr - iph;
if (rt) {
+ spec_dst_fill(&spec_dst, skb);
memcpy(&optptr[optptr[2]-1], &spec_dst, 4);
timeptr = &optptr[optptr[2]+3];
}
^ permalink raw reply related
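The helper in this patch is an instance of lazy, compute-at-most-once initialization using a sentinel value. A minimal userspace sketch of the same pattern follows. compute_spec_dst(), its placeholder return value and the call counter are assumptions standing in for fib_compute_spec_dst(), and ADDR_ANY stands in for htonl(INADDR_ANY).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_ANY 0u	/* stand-in for htonl(INADDR_ANY), which is zero */

static int compute_calls;	/* counts how often the lookup really runs */

/* Hypothetical expensive lookup standing in for fib_compute_spec_dst(). */
uint32_t compute_spec_dst(void)
{
	compute_calls++;
	return 0xC0A80067u;	/* placeholder non-zero address */
}

/* Fill *spec_dst on first use; later calls see a non-sentinel and skip. */
void spec_dst_fill(uint32_t *spec_dst)
{
	assert(spec_dst != NULL);
	if (*spec_dst == ADDR_ANY)
		*spec_dst = compute_spec_dst();
}

/* Accessor so callers can observe how many lookups were performed. */
int spec_dst_compute_calls(void)
{
	return compute_calls;
}
```

Calling spec_dst_fill() twice on the same variable performs one lookup. One limit of the sentinel approach, in this sketch at least: if the lookup itself returns the sentinel value, the next call will compute it again.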
* Re: TCP transmit performance regression
From: Ming Lei @ 2012-07-05 8:27 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Network Development, David Miller
In-Reply-To: <1341474192.2583.3299.camel@edumazet-glaptop>
On Thu, Jul 5, 2012 at 3:43 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2012-07-05 at 09:45 +0800, Ming Lei wrote:
>> Hi,
>>
>> I observed that on both 3.5-rc5 and 3.5-rc5-next, TCP transmit performance
>> degrades a lot, see my below simple test:
>>
>> 1, test box
>> NIC: 100M USB, normally can reach > 90Mbits/sec
>>
>
> What was the last "OK" kernel version ?
After some investigation, the problem is caused by enabling
DEBUG_SLAB, so it is not a regression.
>
> What NIC driver is it ?
>
>> 2, run below command on the box:
>> [root@root]#iperf -c 192.168.0.103 -w 131072 -t 10
>> ------------------------------------------------------------
>> Client connecting to 192.168.0.103, TCP port 5001
>> TCP window size: 256 KByte (WARNING: requested 128 KByte)
>> ------------------------------------------------------------
>> [ 3] local 192.168.0.108 port 59315 connected with 192.168.0.103 port 5001
>> [ ID] Interval Transfer Bandwidth
>> [ 3] 0.0-10.0 sec 40.4 MBytes 33.9 Mbits/sec
>>
>> note: 192.168.0.103 is another production machine running 'iperf -s -w 131072'
>>
>> 3, from traffic captured in wireshark, the window size of most of tcp packets
>> from the test box to 192.168.0.103 is set as 229, looks very weird and should
>> be the cause of performance regression.
>>
>
> Packets sent to 192.168.0.103 announce the window suitable for packets
> in the other way, so not relevant to your problem.
>
> Could you do
>
> # tcpdump -i eth0 -s 100 -c 1000 -w tcp.pcap host 192.168.0.103 &
> # iperf -c 192.168.0.103 -w 131072 -t 10
>
> and post the tcp.pcap file ?
>
> By the way, if you remove -w 131072 (on both sides), I guess throughput
> will increase.
Looks like no improvement. I still don't know why the window size becomes so
small even in the good case (with DEBUG_SLAB disabled), and the small
window size will cause almost every TCP data packet to be acked.
Thanks,
--
Ming Lei
^ permalink raw reply
* [patch] [AX.25]: small cleanup in ax25_addr_parse()
From: Dan Carpenter @ 2012-07-05 8:27 UTC (permalink / raw)
To: Ralf Baechle; +Cc: David S. Miller, linux-hams, netdev, kernel-janitors
The comments were wrong here because "AX25_MAX_DIGIS" is 8 but the
comments say 6. Also I've changed the "7" to "AX25_ADDR_LEN".
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
diff --git a/net/ax25/ax25_addr.c b/net/ax25/ax25_addr.c
index 9162409..e7c9b0e 100644
--- a/net/ax25/ax25_addr.c
+++ b/net/ax25/ax25_addr.c
@@ -189,8 +189,10 @@ const unsigned char *ax25_addr_parse(const unsigned char *buf, int len,
digi->ndigi = 0;
while (!(buf[-1] & AX25_EBIT)) {
- if (d >= AX25_MAX_DIGIS) return NULL; /* Max of 6 digis */
- if (len < 7) return NULL; /* Short packet */
+ if (d >= AX25_MAX_DIGIS)
+ return NULL;
+ if (len < AX25_ADDR_LEN)
+ return NULL;
memcpy(&digi->calls[d], buf, AX25_ADDR_LEN);
digi->ndigi = d + 1;
^ permalink raw reply related
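The two bounds checks touched by this cleanup guard a loop that walks a list of fixed-size addresses until an end-of-list bit is seen. The sketch below is a simplified userspace model of such a loop, not the real ax25_addr_parse(): parse_digis() and its signature are hypothetical, the values 7 and 8 come from the patch description, and the value 0x01 for AX25_EBIT is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AX25_ADDR_LEN	7	/* bytes per address, per the patch */
#define AX25_MAX_DIGIS	8	/* digipeater limit, per the patch */
#define AX25_EBIT	0x01	/* assumed value of the end-of-list bit */

/*
 * Copy digipeater addresses from buf into out until an address whose last
 * byte has AX25_EBIT set. prev_last is the last byte of the address that
 * precedes buf. Returns the number of digipeaters, or -1 if the list is
 * too long or the buffer ends early.
 */
int parse_digis(const unsigned char *buf, int len, unsigned char prev_last,
		unsigned char out[][AX25_ADDR_LEN])
{
	unsigned char last = prev_last;
	int d = 0;

	assert(buf != NULL && out != NULL);
	while (!(last & AX25_EBIT)) {
		if (d >= AX25_MAX_DIGIS)
			return -1;	/* too many digipeaters */
		if (len < AX25_ADDR_LEN)
			return -1;	/* short packet */

		memcpy(out[d], buf, AX25_ADDR_LEN);
		last = buf[AX25_ADDR_LEN - 1];
		buf += AX25_ADDR_LEN;
		len -= AX25_ADDR_LEN;
		d++;
	}
	return d;
}
```

Using the named constants in both checks keeps the limits in one place, so a comment cannot drift out of date the way the "Max of 6 digis" remark did.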
* Re: [PATCH] ipv4: Create and use fib_compute_spec_dst() helper.
From: Eric Dumazet @ 2012-07-05 8:10 UTC (permalink / raw)
To: David Miller; +Cc: ja, netdev
In-Reply-To: <20120705.005940.1078811938047681715.davem@davemloft.net>
On Thu, 2012-07-05 at 00:59 -0700, David Miller wrote:
> Yes, this is a great idea. Actually in some obscure cases your
> change can cause us to compute it more than once I think.
>
> I'd suggest we do something like create a helper function above this
> code in ip_options.c that checks whether spec_dst is INADDR_ANY or
> not, to guard computing it multiple times.
>
> Could you put together a quick patch like that?
Sure I'll do that.
^ permalink raw reply
* Re: AF_BUS socket address family
From: Linus Walleij @ 2012-07-05 7:59 UTC (permalink / raw)
To: Vincent Sanders
Cc: netdev, linux-kernel, David S. Miller, Arve Hjønnevåg,
Daniel Walker, John Stultz, Anton Vorontsov, Greg Kroah-Hartman
In-Reply-To: <1340988354-26981-1-git-send-email-vincent.sanders@collabora.co.uk>
2012/6/29 Vincent Sanders <vincent.sanders@collabora.co.uk>:
> AF_BUS is a message oriented inter process communication system.
We have a huge and important in-kernel IPC message passer
in drivers/staging/android/binder.c
It's deployed in some 400 million devices according to latest reports.
John Stultz & Anton Vorontsov are trying to look after these Android
drivers a bit...
I and others discussed this in the past with the Android folks. Dianne
makes an excellent summary of how it works here:
https://lkml.org/lkml/2009/6/25/3
If we could all be convinced that this thing also fulfills the needs
of what binder does, this is a pretty solid case for it too. I can
sure see that some of the shortcuts that Android is taking with
binder try to address the same issue of high-speed IPC loopholes
through the kernel and some kind of security model.
Whether Android would actually use it (or wrap it) is a totally
different question, but what I think we need to know is whether it
*could*. And staging code has to move forward, maybe this
is the direction it should move?
Yours,
Linus Walleij
^ permalink raw reply
* Re: [PATCH] ipv4: Create and use fib_compute_spec_dst() helper.
From: David Miller @ 2012-07-05 7:59 UTC (permalink / raw)
To: eric.dumazet; +Cc: ja, netdev
In-Reply-To: <1341474745.2583.3325.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 05 Jul 2012 09:52:25 +0200
> [PATCH] ipv4: defer fib_compute_spec_dst() call
>
> ip_options_compile() can avoid calling fib_compute_spec_dst()
> by default, and perform the call if needed.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Yes, this is a great idea. Actually in some obscure cases your
change can cause us to compute it more than once I think.
I'd suggest we do something like create a helper function above this
code in ip_options.c that checks whether spec_dst is INADDR_ANY or
not, to guard computing it multiple times.
Could you put together a quick patch like that?
^ permalink raw reply
* Re: [PATCH] ipv4: Create and use fib_compute_spec_dst() helper.
From: Eric Dumazet @ 2012-07-05 7:52 UTC (permalink / raw)
To: David Miller; +Cc: ja, netdev
In-Reply-To: <20120704.161335.1503971699878518173.davem@davemloft.net>
On Wed, 2012-07-04 at 16:13 -0700, David Miller wrote:
> ====================
> ipv4: Fix crashes in ip_options_compile().
>
> The spec_dst uses should be guarded by skb_rtable() being non-NULL
> not just the SKB being non-null.
>
> Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
Seems good to me thanks.
By the way, maybe we can defer fib_compute_spec_dst() call to the point
we really need it ?
[PATCH] ipv4: defer fib_compute_spec_dst() call
ip_options_compile() can avoid calling fib_compute_spec_dst()
by default, and perform the call if needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index 1f02251..54ab83f 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -260,8 +260,6 @@ int ip_options_compile(struct net *net,
if (skb != NULL) {
rt = skb_rtable(skb);
- if (rt)
- spec_dst = fib_compute_spec_dst(skb);
optptr = (unsigned char *)&(ip_hdr(skb)[1]);
} else
optptr = opt->__data;
@@ -334,6 +332,7 @@ int ip_options_compile(struct net *net,
goto error;
}
if (rt) {
+ spec_dst = fib_compute_spec_dst(skb);
memcpy(&optptr[optptr[2]-1], &spec_dst, 4);
opt->is_changed = 1;
}
@@ -376,6 +375,7 @@ int ip_options_compile(struct net *net,
}
opt->ts = optptr - iph;
if (rt) {
+ spec_dst = fib_compute_spec_dst(skb);
memcpy(&optptr[optptr[2]-1], &spec_dst, 4);
timeptr = &optptr[optptr[2]+3];
}
^ permalink raw reply related
* Re: TCP transmit performance regression
From: Eric Dumazet @ 2012-07-05 7:43 UTC (permalink / raw)
To: Ming Lei; +Cc: Network Development, David Miller
In-Reply-To: <CACVXFVNM-Db=_793SVfRj+nxGtNG0pRFrwc_F9TGbU0FfES63A@mail.gmail.com>
On Thu, 2012-07-05 at 09:45 +0800, Ming Lei wrote:
> Hi,
>
> I observed that on both 3.5-rc5 and 3.5-rc5-next, TCP transmit performance
> degrades a lot, see my below simple test:
>
> 1, test box
> NIC: 100M USB, normally can reach > 90Mbits/sec
>
What was the last "OK" kernel version ?
What NIC driver is it ?
> 2, run below command on the box:
> [root@root]#iperf -c 192.168.0.103 -w 131072 -t 10
> ------------------------------------------------------------
> Client connecting to 192.168.0.103, TCP port 5001
> TCP window size: 256 KByte (WARNING: requested 128 KByte)
> ------------------------------------------------------------
> [ 3] local 192.168.0.108 port 59315 connected with 192.168.0.103 port 5001
> [ ID] Interval Transfer Bandwidth
> [ 3] 0.0-10.0 sec 40.4 MBytes 33.9 Mbits/sec
>
> note: 192.168.0.103 is another production machine running 'iperf -s -w 131072'
>
> 3, from traffic captured in wireshark, the window size of most of tcp packets
> from the test box to 192.168.0.103 is set as 229, looks very weird and should
> be the cause of performance regression.
>
Packets sent to 192.168.0.103 announce the window suitable for packets
in the other way, so not relevant to your problem.
Could you do
# tcpdump -i eth0 -s 100 -c 1000 -w tcp.pcap host 192.168.0.103 &
# iperf -c 192.168.0.103 -w 131072 -t 10
and post the tcp.pcap file ?
By the way, if you remove -w 131072 (on both sides), I guess throughput
will increase.
^ permalink raw reply
* Re: [PATCH net 3/7] qlge: Garbage values shown in extra info during selftest.
From: David Miller @ 2012-07-05 7:23 UTC (permalink / raw)
To: jitendra.kalsaria; +Cc: netdev, ron.mercer, Dept_NX_Linux_NIC_Driver
In-Reply-To: <1341272514-5156-4-git-send-email-jitendra.kalsaria@qlogic.com>
Why are you posting an arbitrary patch from a patch series,
yet not the rest of that series?
This needs to be sent alongside the rest of the series.
^ permalink raw reply
* RE: BISECTED: Re: REGRESSION: 3.4.0->3.5.0-rc2 kernel WARNING on cable plug on Acer Aspire One, no network
From: Marek Szyprowski @ 2012-07-05 6:58 UTC (permalink / raw)
To: 'Alex Villacís Lasso', 'Francois Romieu',
netdev
In-Reply-To: <4FF514B2.4050000@palosanto.com>
Hello,
On Thursday, July 05, 2012 6:15 AM Alex Villacís Lasso wrote:
> On 04/07/12 02:02, Marek Szyprowski wrote:
> > Hello,
> >
> > On Tuesday, July 03, 2012 4:27 PM Alex Villacís Lasso wrote:
> >
> >> On 03/07/12 00:40, Marek Szyprowski wrote:
> >>> Hi Alex,
> >>>
> >>> On Tuesday, July 03, 2012 4:45 AM Alex Villacís Lasso wrote:
> >>>
> >>>> -------- Original Message --------
> >>>> Subject: BISECTED: Re: REGRESSION: 3.4.0->3.5.0-rc2 kernel WARNING on cable
> >>>> plug on Acer Aspire One, no network Date: Mon, 02 Jul 2012 21:33:41 -0500 From:
> >>>> Alex Villacís Lasso <a_villacis@palosanto.com> To: Francois Romieu
> >>>> <romieu@fr.zoreil.com> CC: netdev@vger.kernel.org
> >>>> On 01/07/12 08:50, Alex Villacís Lasso wrote:
> >>>>> On 11/06/12 16:38, Francois Romieu wrote:
> >>>>>> Alex Villacís Lasso <a_villacis@palosanto.com> :
> >>>>>> [...]
> >>>>>>> $ grep XID dmesg-3.5.0-rc2.txt
> >>>>>>> [ 15.873858] r8169 0000:02:00.0: eth0: RTL8102e at 0xf7c0e000,
> >>>>>>> 00:1e:68:e5:5d:b1, XID 04a00000 IRQ 44
> >>>>>> The 8102e has not been touched by that many suspect patches but I do
> >>>>>> not see where the problem is :o(
> >>>>>>
> >>>>>> Can you peel off the r8169 patches between 3.4.0 and 3.5-rc ?
> >>>>>>
> >>>>> Still present in 3.5-rc5. Bisection still in progress.
> >>>>>
> >>>>> --
> >>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
> >>>>> the body of a message to majordomo@vger.kernel.org
> >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>>> My full bisection points to this commit:
> >>>>
> >>>> commit 0a2b9a6ea93650b8a00f9fd5ee8fdd25671e2df6
> >>>> Author: Marek Szyprowski <m.szyprowski@samsung.com>
> >>>> Date: Thu Dec 29 13:09:51 2011 +0100
> >>>>
> >>>> X86: integrate CMA with DMA-mapping subsystem
> >>>>
> >>>> This patch adds support for CMA to dma-mapping subsystem for x86
> >>>> architecture that uses common pci-dma/pci-nommu implementation. This
> >>>> allows to test CMA on KVM/QEMU and a lot of common x86 boxes.
> >>>>
> >>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >>>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> >>>> CC: Michal Nazarewicz <mina86@mina86.com>
> >>>> Acked-by: Arnd Bergmann <arnd@arndb.de>
> >>>>
> >>>> Is this commit somehow messing with the network card DMA?
> >>> This commit in fact touches DMA-mapping subsystem and introduces a bug,
> >>> which has been finally fixed by commit c080e26edc3a2a3 merged to v3.5-rc3.
> >>> After applying it the DMA-mapping subsystem should work exactly the same way
> >>> as in v3.4. Could you please check if it fixes this issue?
> >>>
> >>> Best regards
> >> No. It still fails in 3.5-rc5, as mentioned before.
> > Hmm. I was a bit confused, because both the subject and git bisect log pointed to v3.5-rc2,
> > which had that bug. Maybe there is some other issue present in v3.5-rc5 not related to
> > my patches?
> >
> > Could you check with v3.5-rc5 if reverting patch c080e26edc3a2a3cdfa4c430c663ee1c3bbd8fae
> > and 0a2b9a6ea93650b8a00f9fd5ee8fdd25671e2df6 fixes the problems with rtl driver?
> >
> > Best regards
> Reverting the two patches indeed fixes the bug on -rc5.
That's really strange. Could you check if you have CMA disabled in the config? After preparing
a c080e26edc3a2a3cdfa4c430c663ee1c3bbd8fae fixup patch, I was really convinced that there are
no functional changes in x86 dma mapping code when CMA is disabled. I will provide some
patches to revert different parts of my changes, so we will find which line causes issues.
Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
^ permalink raw reply
* Re: BISECTED: Re: REGRESSION: 3.4.0->3.5.0-rc2 kernel WARNING on cable plug on Acer Aspire One, no network
From: Alex Villacís Lasso @ 2012-07-05 4:14 UTC (permalink / raw)
To: Marek Szyprowski, Francois Romieu, netdev
In-Reply-To: <000901cd59b2$f2a542e0$d7efc8a0$%szyprowski@samsung.com>
On 04/07/12 02:02, Marek Szyprowski wrote:
> Hello,
>
> On Tuesday, July 03, 2012 4:27 PM Alex Villacís Lasso wrote:
>
>> On 03/07/12 00:40, Marek Szyprowski wrote:
>>> Hi Alex,
>>>
>>> On Tuesday, July 03, 2012 4:45 AM Alex Villacís Lasso wrote:
>>>
>>>> -------- Original Message --------
>>>> Subject: BISECTED: Re: REGRESSION: 3.4.0->3.5.0-rc2 kernel WARNING on cable
>>>> plug on Acer Aspire One, no network Date: Mon, 02 Jul 2012 21:33:41 -0500 From:
>>>> Alex Villacís Lasso <a_villacis@palosanto.com> To: Francois Romieu
>>>> <romieu@fr.zoreil.com> CC: netdev@vger.kernel.org
>>>> On 01/07/12 08:50, Alex Villacís Lasso wrote:
>>>>> On 11/06/12 16:38, Francois Romieu wrote:
>>>>>> Alex Villacís Lasso <a_villacis@palosanto.com> :
>>>>>> [...]
>>>>>>> $ grep XID dmesg-3.5.0-rc2.txt
>>>>>>> [ 15.873858] r8169 0000:02:00.0: eth0: RTL8102e at 0xf7c0e000,
>>>>>>> 00:1e:68:e5:5d:b1, XID 04a00000 IRQ 44
>>>>>> The 8102e has not been touched by that many suspect patches but I do
>>>>>> not see where the problem is :o(
>>>>>>
>>>>>> Can you peel off the r8169 patches between 3.4.0 and 3.5-rc ?
>>>>>>
>>>>> Still present in 3.5-rc5. Bisection still in progress.
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>> My full bisection points to this commit:
>>>>
>>>> commit 0a2b9a6ea93650b8a00f9fd5ee8fdd25671e2df6
>>>> Author: Marek Szyprowski <m.szyprowski@samsung.com>
>>>> Date: Thu Dec 29 13:09:51 2011 +0100
>>>>
>>>> X86: integrate CMA with DMA-mapping subsystem
>>>>
>>>> This patch adds support for CMA to dma-mapping subsystem for x86
>>>> architecture that uses common pci-dma/pci-nommu implementation. This
>>>> allows to test CMA on KVM/QEMU and a lot of common x86 boxes.
>>>>
>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>>>> CC: Michal Nazarewicz <mina86@mina86.com>
>>>> Acked-by: Arnd Bergmann <arnd@arndb.de>
>>>>
>>>> Is this commit somehow messing with the network card DMA?
>>> This commit in fact touches DMA-mapping subsystem and introduces a bug,
>>> which has been finally fixed by commit c080e26edc3a2a3 merged to v3.5-rc3.
>>> After applying it the DMA-mapping subsystem should work exactly the same way
>>> as in v3.4. Could you please check if it fixes this issue?
>>>
>>> Best regards
>> No. It still fails in 3.5-rc5, as mentioned before.
> Hmm. I was a bit confused, because both the subject and git bisect log pointed to v3.5-rc2,
> which had that bug. Maybe there is some other issue present in v3.5-rc5 not related to
> my patches?
>
> Could you check with v3.5-rc5 if reverting patch c080e26edc3a2a3cdfa4c430c663ee1c3bbd8fae
> and 0a2b9a6ea93650b8a00f9fd5ee8fdd25671e2df6 fixes the problems with rtl driver?
>
> Best regards
Reverting the two patches indeed fixes the bug on -rc5.
^ permalink raw reply
* RE: [PATCH 1/2] be2net: Fix Endian
From: Somnath.Kotur @ 2012-07-05 4:00 UTC (permalink / raw)
To: roy.qing.li, netdev
In-Reply-To: <1341453942-4198-1-git-send-email-roy.qing.li@gmail.com>
> -----Original Message-----
> From: roy.qing.li@gmail.com [mailto:roy.qing.li@gmail.com]
> Sent: Thursday, July 05, 2012 7:36 AM
> To: netdev@vger.kernel.org
> Cc: Kotur, Somnath
> Subject: [PATCH 1/2] be2net: Fix Endian
>
> From: Li RongQing <roy.qing.li@gmail.com>
>
> ETH_P_IP is host Endian, skb->protocol is big Endian, when compare them,
> we should change ETH_P_IP from host endian to big endian, htons, not
> ntohs.
>
> CC: Somnath Kotur <somnath.kotur@emulex.com>
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Oops! Unintended...Thanks!
Acked-by: Somnath Kotur <somnath.kotur@emulex.com>
^ permalink raw reply
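The rule stated in the patch description, convert the host-order constant with htons() before comparing it against a big-endian field, can be sketched in portable userspace C. The names SKETCH_ETH_P_IP, is_ipv4() and ethertype_from_wire() are hypothetical. The value 0x0800 is the IPv4 EtherType, and proto_be plays the role of skb->protocol.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_P_IP 0x0800	/* IPv4 EtherType, in host byte order */

/* proto_be is big endian, like skb->protocol; convert the constant. */
int is_ipv4(uint16_t proto_be)
{
	return proto_be == htons(SKETCH_ETH_P_IP);
}

/* Load the 2-byte EtherType from wire bytes without changing byte order. */
uint16_t ethertype_from_wire(const unsigned char *p)
{
	uint16_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;	/* still big endian */
}
```

On common platforms htons() and ntohs() perform the same operation, so the old code most likely produced the same result at run time; the value of the fix is that the conversion now states the intended direction, which also helps static endianness checkers.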
* Re: [PATCH 4 2/4] NET ethernet introduce mac_platform helper
From: Joe Perches @ 2012-07-05 3:25 UTC (permalink / raw)
To: Andy Green
Cc: linux-omap, s-jan, arnd, patches, tony, netdev, linux-kernel,
rostedt, linux-arm-kernel
In-Reply-To: <4FF507FF.3000604@linaro.org>
On Thu, 2012-07-05 at 11:20 +0800, Andy Green wrote:
> On 05/07/12 11:12, the mail apparently from Joe Perches included:
[]
> >> diff --git a/net/ethernet/mac-platform.c b/net/ethernet/mac-platform.c
> > []
> >> +static int mac_platform_netdev_event(struct notifier_block *this,
> >> + unsigned long event, void *ptr)
> >
> > alignment to parenthesis please.
>
> OK. Although different places in the kernel seem to have different
> expectations about that.
net and drivers/net is pretty consistent.
Most of the exceptions are old code.
Some of those exceptions are being slowly updated too.
cheers, Joe
^ permalink raw reply