Netdev List
* Re: [PATCH] net: neighbour: fix neigh_dump_info()
From: David Miller @ 2012-06-07 20:03 UTC
  To: eric.dumazet; +Cc: denys, netdev, shemminger
In-Reply-To: <1339081115.5083.21.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 07 Jun 2012 16:58:35 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> Denys found out "ip neigh" output was truncated to
> about 54 neighbours.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>

Applied.


* Re: [PATCH] net: l2tp_eth: fix kernel panic on rmmod l2tp_eth
From: David Miller @ 2012-06-07 20:02 UTC
  To: eric.dumazet; +Cc: denys, netdev, jchapman
In-Reply-To: <1339063640.26966.113.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 07 Jun 2012 12:07:20 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> We must prevent module unloading if some devices are still attached to
> l2tp_eth driver.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
> Tested-by: Denys Fedoryshchenko <denys@visp.net.lb>

Applied.


* Re: [PATCH] fix kernel crash in the macvlan driver
From: Eric W. Biederman @ 2012-06-07 19:49 UTC
  To: Ani Sinha; +Cc: netdev, Francesco Ruggeri
In-Reply-To: <alpine.OSX.2.00.1206071137360.86561@animac.local>

Ani Sinha <ani@aristanetworks.com> writes:

> Hello folks :
>
> We noticed a consistently reproducible kernel crash in the macvlan driver
> when we were running our test script. I believe I have found the reason
> for the crash and a patch that fixes it. I am attaching the patch for your
> comments and opinions.

I don't completely follow the logic of your change.  Crashing in
macvlan_addr_busy does seem to indicate you are using a corrupted data
structure.

My compiled version of macvlan_addr_busy is much smaller than yours so I
can't guess based on your disassembly what is wrong.  But by reading the
code it must either be port->dev->dev_addr or the rcu
macvlan_hash_lookup.

Regardless, all I see your patch doing is moving the decrement of
port->count earlier and possibly allowing newlink in
MACVLAN_MODE_PASSTHRU to succeed a smidge earlier.

I might just be dense today but I can't possibly see how moving that
decrement would solve the crash you have reported below.

Eric


> commit cd28ce3cb624ddaaf97935c1f34d44bb13ffb786
> Author: Anirban Sinha <ani@aristanetworks.com>
> Date:   Thu Jun 7 11:21:02 2012 -0700
>
>     macvlan : The patch d5cd92448fded12c91f7574e49747c5f7d975a8d introduced reference
>     counting for macvlan_port. This patch fixes an issue where the reference
>     counts were being decremented incorrectly from macvlan_uninit() and not from
>     macvlan_dellink(). This was causing the kernel crash shown below :
>
>     BUG: unable to handle kernel paging request at 0000000100000000
>     IP: [<ffffffffa031f8e5>] macvlan_addr_busy+0x58/0x8d [macvlan]
>     PGD 3a2aa067 PUD 0
>     Oops: 0000 [#1] SMP
>     last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0C0A:00/power_supply/BAT1/energy_full
>     CPU 0
>     Modules linked in: macvlan rfcomm sco bnep l2cap sunrpc ipt_REJECT iptable_filter ip6t_REJECT xt_tcpudp
>
>     Pid: 2490, comm: ip Not tainted 2.6.38.8-705892.2012aniArora.7.fc14.x86_64 #1
>     RIP: 0010:[<ffffffffa031f8e5>]  [<ffffffffa031f8e5>] macvlan_addr_busy+0x58/0x8d [macvlan]
>     RSP: 0018:ffff880037d2b698  EFLAGS: 00010296
>     RAX: 0000000100000000 RBX: ffff88003bf54000 RCX: 0000000000000000
>     RDX: 0000111111111102 RSI: 0000000000000092 RDI: 0000000000000246
>     RBP: ffff880037d2b6a8 R08: 0000000000000040 R09: ffffffff81a73f18
>     R10: ffffffff81e03a20 R11: 0000000000000020 R12: ffff88003c8e33d0
>     R13: ffff880037ecd000 R14: 00000000fffffff0 R15: 0000000000000001
>     FS:  0000000000000000(0000) GS:ffff88003e200000(0063) knlGS:00000000f75d26c0
>     CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
>     CR2: 0000000100000000 CR3: 00000000377e7000 CR4: 00000000000006f0
>     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>     DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>     Process ip (pid: 2490, threadinfo ffff880037d2a000, task ffff88003d1206c0)
>     Stack:
>      ffff88003d031000 ffff88003d031680 ffff880037d2b6d8 ffffffffa031fbcc
>      ffff88003d031000 ffffffffa03211c0 ffff88003d031078 0000000000000000
>      ffff880037d2b708 ffffffff81345cf7 ffff88003d031000 0000000000001003
>     Call Trace:
>      [<ffffffffa031fbcc>] macvlan_open+0x7e/0x116 [macvlan]
>      [<ffffffff81345cf7>] __dev_open+0x90/0xc7
>      [<ffffffff81345f26>] __dev_change_flags+0xa8/0x12c
>      [<ffffffff81346021>] dev_change_flags+0x1c/0x52
>      [<ffffffff8134fce1>] do_setlink+0x2b4/0x67d
>      [<ffffffff813bc85f>] ? inet6_fill_link_af+0x1a/0x22
>      [<ffffffff8134f6bf>] ? rtnl_fill_ifinfo+0x99f/0xa7d
>      [<ffffffff81350d02>] rtnl_newlink+0x247/0x40d
>      [<ffffffff81350b7a>] ? rtnl_newlink+0xbf/0x40d
>      [<ffffffff81337ced>] ? sock_rmalloc+0x2e/0x90
>      [<ffffffff810dc46c>] ? arch_local_irq_save+0x16/0x1c
>      [<ffffffff810685fe>] ? arch_local_irq_save+0x18/0x1e
>      [<ffffffff813508dc>] rtnetlink_rcv_msg+0x1e6/0x1fc
>      [<ffffffff810af70a>] ? get_page_from_freelist+0x4dd/0x68d
>      [<ffffffff813506f6>] ? rtnetlink_rcv_msg+0x0/0x1fc
>      [<ffffffff81364239>] netlink_rcv_skb+0x40/0x8b
>      [<ffffffff813502c7>] rtnetlink_rcv+0x21/0x28
>      [<ffffffff81363d27>] netlink_unicast+0xec/0x155
>      [<ffffffff81364041>] netlink_sendmsg+0x2b1/0x2cf
>      [<ffffffff81332ec2>] ? __sock_recvmsg+0x75/0x84
>      [<ffffffff81332d70>] __sock_sendmsg+0x66/0x72
>      [<ffffffff81333521>] sock_sendmsg+0xa3/0xbc
>      [<ffffffff810b25da>] ? lru_cache_add_lru+0x3c/0x3e
>      [<ffffffff810cc1e3>] ? page_add_new_anon_rmap+0x5b/0x6d
>      [<ffffffff810c17cb>] ? set_pte_at+0x9/0xd
>      [<ffffffff810c2f31>] ? do_wp_page+0x496/0x541
>      [<ffffffff81333ec0>] ? move_addr_to_kernel+0x44/0x49
>      [<ffffffff813585a4>] ? verify_compat_iovec+0x6d/0xb9
>      [<ffffffff81334cac>] sys_sendmsg+0x230/0x2ae
>      [<ffffffff810c1965>] ? pmd_offset+0x14/0x3b
>      [<ffffffff810c4c0a>] ? handle_mm_fault+0x13a/0x14f
>      [<ffffffff81334756>] ? sys_sendto+0x13f/0x16c
>      [<ffffffff81334d76>] ? sys_recvmsg+0x4c/0x5b
>      [<ffffffff81358deb>] compat_sys_sendmsg+0xf/0x11
>      [<ffffffff81359005>] compat_sys_socketcall+0x14f/0x186
>      [<ffffffff8102c2b0>] sysenter_dispatch+0x7/0x2e
>     Code: 0e e1 48 8b 03 4c 89 e2 48 c7 c7 6c 10 32 a0 48 8b b0 80 02 00 00 31 c0 e8 f1 17 0e e1 48 8b 03 49 8b 14 24 48 8b 80 80 02 00 00 <48> 33 10 b8 01 00 00 00 48 c1 e2 10 74 22 48 c7 c7 93 10 32 a0
>
>     Signed-off-by: Anirban Sinha <ani@aristanetworks.com>
>
> diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
> index 66a9bfe..d880bc8 100644
> --- a/drivers/net/macvlan.c
> +++ b/drivers/net/macvlan.c
> @@ -481,7 +481,6 @@ static void macvlan_uninit(struct net_device *dev)
>
>  	free_percpu(vlan->pcpu_stats);
>
> -	port->count -= 1;
>  	if (!port->count)
>  		macvlan_port_destroy(port->dev);
>  }
> @@ -795,7 +794,9 @@ static int macvlan_newlink(struct net *src_net, struct net_device *dev,
>  void macvlan_dellink(struct net_device *dev, struct list_head *head)
>  {
>  	struct macvlan_dev *vlan = netdev_priv(dev);
> +	struct macvlan_port *port = vlan->port;
>
> +	port->count -= 1;
>  	list_del(&vlan->list);
>  	unregister_netdevice_queue(dev, head);
>  }


* Re: pull request: can-next 2012-06-07
From: David Miller @ 2012-06-07 19:02 UTC
  To: mkl; +Cc: netdev, linux-can
In-Reply-To: <4FD062D3.5010107@pengutronix.de>

From: Marc Kleine-Budde <mkl@pengutronix.de>
Date: Thu, 07 Jun 2012 10:14:11 +0200

> here are two patches for net-next by AnilKumar Ch; they add support for
> Bosch's d_can hardware to the existing c_can driver.

Pulled, thanks.


* [PATCH] fix kernel crash in the macvlan driver
From: Ani Sinha @ 2012-06-07 18:45 UTC
  To: netdev; +Cc: Eric Biederman, Francesco Ruggeri

Hello folks :

We noticed a consistently reproducible kernel crash in the macvlan driver
when we were running our test script. I believe I have found the reason
for the crash and a patch that fixes it. I am attaching the patch for your
comments and opinions.

Cheers,
Ani



commit cd28ce3cb624ddaaf97935c1f34d44bb13ffb786
Author: Anirban Sinha <ani@aristanetworks.com>
Date:   Thu Jun 7 11:21:02 2012 -0700

    macvlan : The patch d5cd92448fded12c91f7574e49747c5f7d975a8d introduced reference
    counting for macvlan_port. This patch fixes an issue where the reference
    counts were being decremented incorrectly from macvlan_uninit() and not from
    macvlan_dellink(). This was causing the kernel crash shown below :

    BUG: unable to handle kernel paging request at 0000000100000000
    IP: [<ffffffffa031f8e5>] macvlan_addr_busy+0x58/0x8d [macvlan]
    PGD 3a2aa067 PUD 0
    Oops: 0000 [#1] SMP
    last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0C0A:00/power_supply/BAT1/energy_full
    CPU 0
    Modules linked in: macvlan rfcomm sco bnep l2cap sunrpc ipt_REJECT iptable_filter ip6t_REJECT xt_tcpudp

    Pid: 2490, comm: ip Not tainted 2.6.38.8-705892.2012aniArora.7.fc14.x86_64 #1
    RIP: 0010:[<ffffffffa031f8e5>]  [<ffffffffa031f8e5>] macvlan_addr_busy+0x58/0x8d [macvlan]
    RSP: 0018:ffff880037d2b698  EFLAGS: 00010296
    RAX: 0000000100000000 RBX: ffff88003bf54000 RCX: 0000000000000000
    RDX: 0000111111111102 RSI: 0000000000000092 RDI: 0000000000000246
    RBP: ffff880037d2b6a8 R08: 0000000000000040 R09: ffffffff81a73f18
    R10: ffffffff81e03a20 R11: 0000000000000020 R12: ffff88003c8e33d0
    R13: ffff880037ecd000 R14: 00000000fffffff0 R15: 0000000000000001
    FS:  0000000000000000(0000) GS:ffff88003e200000(0063) knlGS:00000000f75d26c0
    CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
    CR2: 0000000100000000 CR3: 00000000377e7000 CR4: 00000000000006f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process ip (pid: 2490, threadinfo ffff880037d2a000, task ffff88003d1206c0)
    Stack:
     ffff88003d031000 ffff88003d031680 ffff880037d2b6d8 ffffffffa031fbcc
     ffff88003d031000 ffffffffa03211c0 ffff88003d031078 0000000000000000
     ffff880037d2b708 ffffffff81345cf7 ffff88003d031000 0000000000001003
    Call Trace:
     [<ffffffffa031fbcc>] macvlan_open+0x7e/0x116 [macvlan]
     [<ffffffff81345cf7>] __dev_open+0x90/0xc7
     [<ffffffff81345f26>] __dev_change_flags+0xa8/0x12c
     [<ffffffff81346021>] dev_change_flags+0x1c/0x52
     [<ffffffff8134fce1>] do_setlink+0x2b4/0x67d
     [<ffffffff813bc85f>] ? inet6_fill_link_af+0x1a/0x22
     [<ffffffff8134f6bf>] ? rtnl_fill_ifinfo+0x99f/0xa7d
     [<ffffffff81350d02>] rtnl_newlink+0x247/0x40d
     [<ffffffff81350b7a>] ? rtnl_newlink+0xbf/0x40d
     [<ffffffff81337ced>] ? sock_rmalloc+0x2e/0x90
     [<ffffffff810dc46c>] ? arch_local_irq_save+0x16/0x1c
     [<ffffffff810685fe>] ? arch_local_irq_save+0x18/0x1e
     [<ffffffff813508dc>] rtnetlink_rcv_msg+0x1e6/0x1fc
     [<ffffffff810af70a>] ? get_page_from_freelist+0x4dd/0x68d
     [<ffffffff813506f6>] ? rtnetlink_rcv_msg+0x0/0x1fc
     [<ffffffff81364239>] netlink_rcv_skb+0x40/0x8b
     [<ffffffff813502c7>] rtnetlink_rcv+0x21/0x28
     [<ffffffff81363d27>] netlink_unicast+0xec/0x155
     [<ffffffff81364041>] netlink_sendmsg+0x2b1/0x2cf
     [<ffffffff81332ec2>] ? __sock_recvmsg+0x75/0x84
     [<ffffffff81332d70>] __sock_sendmsg+0x66/0x72
     [<ffffffff81333521>] sock_sendmsg+0xa3/0xbc
     [<ffffffff810b25da>] ? lru_cache_add_lru+0x3c/0x3e
     [<ffffffff810cc1e3>] ? page_add_new_anon_rmap+0x5b/0x6d
     [<ffffffff810c17cb>] ? set_pte_at+0x9/0xd
     [<ffffffff810c2f31>] ? do_wp_page+0x496/0x541
     [<ffffffff81333ec0>] ? move_addr_to_kernel+0x44/0x49
     [<ffffffff813585a4>] ? verify_compat_iovec+0x6d/0xb9
     [<ffffffff81334cac>] sys_sendmsg+0x230/0x2ae
     [<ffffffff810c1965>] ? pmd_offset+0x14/0x3b
     [<ffffffff810c4c0a>] ? handle_mm_fault+0x13a/0x14f
     [<ffffffff81334756>] ? sys_sendto+0x13f/0x16c
     [<ffffffff81334d76>] ? sys_recvmsg+0x4c/0x5b
     [<ffffffff81358deb>] compat_sys_sendmsg+0xf/0x11
     [<ffffffff81359005>] compat_sys_socketcall+0x14f/0x186
     [<ffffffff8102c2b0>] sysenter_dispatch+0x7/0x2e
    Code: 0e e1 48 8b 03 4c 89 e2 48 c7 c7 6c 10 32 a0 48 8b b0 80 02 00 00 31 c0 e8 f1 17 0e e1 48 8b 03 49 8b 14 24 48 8b 80 80 02 00 00 <48> 33 10 b8 01 00 00 00 48 c1 e2 10 74 22 48 c7 c7 93 10 32 a0

    Signed-off-by: Anirban Sinha <ani@aristanetworks.com>

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 66a9bfe..d880bc8 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -481,7 +481,6 @@ static void macvlan_uninit(struct net_device *dev)

 	free_percpu(vlan->pcpu_stats);

-	port->count -= 1;
 	if (!port->count)
 		macvlan_port_destroy(port->dev);
 }
@@ -795,7 +794,9 @@ static int macvlan_newlink(struct net *src_net, struct net_device *dev,
 void macvlan_dellink(struct net_device *dev, struct list_head *head)
 {
 	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct macvlan_port *port = vlan->port;

+	port->count -= 1;
 	list_del(&vlan->list);
 	unregister_netdevice_queue(dev, head);
 }


* Re: tcp wifi upload performance and lots of ACKs
From: David Miller @ 2012-06-07 18:10 UTC
  To: rick.jones2; +Cc: David.Laight, greearb, dbaluta, netdev
In-Reply-To: <4FD0EA18.3090606@hp.com>

From: Rick Jones <rick.jones2@hp.com>
Date: Thu, 07 Jun 2012 10:51:20 -0700

> This may not work well when the sender has a congestion window growth
> heuristic different from what the ACK avoidance heuristic assumes.  If
> I recall correctly, the heuristic in HP-UX assumes the sender grows
> cwnd by the number of bytes/segments ACKnowledged.

Linux violates this assumption.


* Re: [PATCH (net.git) V2] stmmac: fix driver built w/ w/o both pci and platf modules
From: David Miller @ 2012-06-07 18:01 UTC
  To: peppe.cavallaro; +Cc: netdev, wfg
In-Reply-To: <1339073322-23093-1-git-send-email-peppe.cavallaro@st.com>

From: Giuseppe CAVALLARO <peppe.cavallaro@st.com>
Date: Thu,  7 Jun 2012 14:48:42 +0200

> +	pr_err("stmmac: do not register the PCI driver\n");

This is inappropriate, since this is called unconditionally.


* Socket send-buffer auto-sizing
From: Ben Greear @ 2012-06-07 17:59 UTC
  To: netdev

I'm continuing to test one-way tcp streams in 3.5.0-rc1 on
a wifi network.

When I do not specify a send buffer size, and thus use the kernel
defaults, max speed is about 77Mbps.

When I specify a 512KB send buffer, I get speeds up to 185Mbps.

When set to 1MB, I get about 198Mbps (and setting higher does not
increase the throughput after this).

This is without any 'delack' patches applied.

My question is:  Should the kernel auto-tuner work better?

I seem to recall comments from some years ago that applications
should no longer attempt to tune send/recv buffers because the kernel
was smart enough to get it at least mostly right.
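
For reference, pinning the send buffer from the application looks roughly
like the sketch below.  set_send_buffer() is a helper name made up for this
note, not an existing API.  On Linux an explicit SO_SNDBUF switches off
send-buffer autotuning for that socket, and the request is capped by
net.core.wmem_max, so the value read back can differ from the one asked for.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Request a fixed send buffer on a socket and report what the kernel
 * actually granted.  Returns 0 on success, -1 on error.
 * (Illustrative helper; the name is an assumption of this sketch.)
 */
static int set_send_buffer(int fd, int bytes, int *granted)
{
	socklen_t len = sizeof(*granted);

	assert(granted != NULL);
	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0)
		return -1;
	/* Read the value back: the kernel may cap or adjust the request. */
	if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, granted, &len) < 0)
		return -1;
	return 0;
}
```

A test run would call set_send_buffer(fd, 512 * 1024, &granted) on a fresh
TCP socket before connect(), then compare throughput against a socket left
at the defaults.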

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: tcp wifi upload performance and lots of ACKs
From: Rick Jones @ 2012-06-07 17:51 UTC
  To: David Laight; +Cc: Ben Greear, Daniel Baluta, netdev
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B6F3C@saturn3.aculab.com>

> Does this delaying of acks have a detrimental effect on the
> sending end?
> I've seen very bad interactions between delayed acks and
> (I believe) the 'slow start' code on connections with
> one-directional data, Nagle disabled and very low RTT.
>
> What I saw was the sender sending 4 data packets, then
> sitting waiting for an ack - in spite of accumulating
> several kB of data to send.
>
> Delaying acks further will only make this worse.

At least two stacks have a reasonable ACK avoidance heuristic.  Those 
would be HP-UX and Solaris (Mac OS 9 had one as well, IIRC).  The 
heuristics are rather similar because the two TCP stacks share a common 
ancestor.  I used to interact with HP-UX's regularly, so my statements will
be based on that, and on the assumption that Solaris is similar.

Both attempt to divine what the senders' congestion window happens to be 
and be certain to send an ACK before that is exhausted.  So, at the 
start of a connection, there will be the usual, more rapid 
ACKnowledgement.  As things happen "normally" then the number of 
segments per ACK increases until it hits a configurable limit.  There 
are conditions which will cut the limit in half on a given connection - 
one is the sending of an ACK via the standalone ACK timer (this is from 
memory, so may be a bit off).  There are probably a few other conditions 
that drop the limit by half.  The heuristic attempts to learn in each 
connection what the reasonable limit on ACK avoidance might be so there 
isn't a per-connection control, just the global controls via ndd.  As 
conditions causing the limit to be cut in half arise, the connection 
naturally and irrevocably falls back to the usual "ack-every-other" 
behaviour.

When there is little to no packet loss, and a rather regular stream of 
data, this works rather well indeed.  For example in a LAN or Data 
Center.  You can run netperf TCP_STREAM with the limit at the default, 
and with the limit set to two and see the considerable difference in 
service demand on either side.

This may not work well when the sender has a congestion window growth 
heuristic different from what the ACK avoidance heuristic assumes.  If I 
recall correctly, the heuristic in HP-UX assumes the sender grows cwnd 
by the number of bytes/segments ACKnowledged. If the sender grows the 
cwnd by only one segment per ACK rather than by the bytes ACKnowledged
by the ACK, the growth of the cwnd will be slowed.  In a LAN that may be
papered over a bit, but it will become quite noticeable in a higher RTT
environment.  Probably not as noticeable for a sufficiently short
connection, or a long one, but will be for ones in the middle.  The 
short connection doesn't need much cwnd in the first place, and the 
heuristic works its way up to avoiding ACKs, and the long one will be 
long enough to have the ACK avoidance heuristic gravitate down to 
ack-every-other.
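
To put rough numbers on the last paragraph, here is a toy slow-start model.
It is a sketch under loud assumptions: cwnd starts at 2 segments, the
receiver sends one ACK per segs_per_ack segments (and at least one per round
trip, standing in for the standalone ACK timer), and there is no loss.
rtts_to_reach() and its parameters are names invented for this example, not
anything taken from HP-UX or Linux.

```c
#include <assert.h>

/*
 * Count round trips until cwnd (in segments) reaches target.
 * count_bytes != 0: sender grows cwnd by the amount of data ACKed.
 * count_bytes == 0: sender grows cwnd by one segment per ACK received.
 */
static unsigned rtts_to_reach(unsigned target, unsigned segs_per_ack,
			      int count_bytes)
{
	unsigned cwnd = 2, rtts = 0;

	assert(segs_per_ack > 0);
	while (cwnd < target) {
		unsigned acks = cwnd / segs_per_ack;

		if (acks == 0)
			acks = 1;	/* the ACK timer eventually fires */
		/* A full window is ACKed each round trip in this model. */
		cwnd += count_bytes ? cwnd : acks;
		rtts++;
	}
	return rtts;
}
```

With a target of 64 segments, the sender that counts ACKed data gets there in
5 round trips however sparse the ACKs are, while the per-ACK sender needs
more round trips the more ACKs the receiver withholds.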

rick jones


* Re: [V2 RFC net-next PATCH 2/2] virtio_net: export more statistics through ethtool
From: Ben Hutchings @ 2012-06-07 17:15 UTC
  To: Jason Wang; +Cc: netdev, mst, linux-kernel, virtualization
In-Reply-To: <20120606075217.29081.30713.stgit@amd-6168-8-1.englab.nay.redhat.com>

On Wed, 2012-06-06 at 15:52 +0800, Jason Wang wrote:
> Statistics counters are useful for debugging and performance optimization, so this
> patch lets the virtio_net driver collect the following and export them to userspace
> through "ethtool -S":
> 
> - number of packets sent/received
> - number of bytes sent/received
> - number of callbacks for tx/rx
> - number of kicks for tx/rx
> - number of bytes/packets queued for tx
> 
> As virtnet_stats were per-cpu, both per-cpu and global statistics were
> collected like:
[...]

I would really like to see some sort of convention for presenting
per-queue statistics through ethtool.  At the moment we have a complete
mess of different formats:

bnx2x:    "[${index}]: ${name}"
be2net:   "${qtype}q${index}: ${name}"
ehea:     "PR${index} ${name}"
mlx4_en:  "${qtype}${index}_${name}"
myri10ge: dummy stat names as headings
niu:      dummy stat names as headings
s2io:     "ring_${index}_${name}"
vmxnet3:  dummy stat names as headings
vxge:     "${name}_${index}"; also dummy stat names as headings

And you're introducing yet another format!

(Additionally some of the drivers are playing games with spaces and tabs
to make ethtool indent the stats the way they like.  Ethtool statistics
are inconsistent enough already without drivers pulling that sort of
crap.

I'm inclined to make ethtool start stripping whitespace from stat names,
and *if* people can agree on a common format for per-queue statistic
names then I'll indent them *consistently*.  Also, I would make such
stats optional, so you don't get hundreds of lines of crap by default.)
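
The whitespace stripping mentioned above could be as small as the sketch
below.  This is only an illustration: strip_stat_name() is a hypothetical
helper, not code from ethtool, and it assumes the name has already been
copied into a NUL-terminated buffer.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Remove leading and trailing whitespace from a stat name, in place.
 * Interior whitespace is left alone.  Returns the same buffer.
 */
static char *strip_stat_name(char *name)
{
	char *start = name;
	size_t len;

	assert(name != NULL);
	while (*start && isspace((unsigned char)*start))
		start++;
	len = strlen(start);
	while (len > 0 && isspace((unsigned char)start[len - 1]))
		len--;
	memmove(name, start, len);
	name[len] = '\0';
	return name;
}
```

Applied to every string before printing, this would undo indentation tricks
in driver-supplied names without touching the names themselves.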

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* [PATCH] ipv6: fib: Restore NTF_ROUTER exception in fib6_age()
From: Thomas Graf @ 2012-06-07 16:51 UTC
  To: davem; +Cc: netdev

Commit 5339ab8b1dd82 (ipv6: fib: Convert fib6_age() to
dst_neigh_lookup().) seems to have mistakenly inverted the
exception for cached NTF_ROUTER routes.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
---
 net/ipv6/ip6_fib.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 0c220a4..74c21b9 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1561,7 +1561,7 @@ static int fib6_age(struct rt6_info *rt, void *arg)
 				neigh_flags = neigh->flags;
 				neigh_release(neigh);
 			}
-			if (neigh_flags & NTF_ROUTER) {
+			if (!(neigh_flags & NTF_ROUTER)) {
 				RT6_TRACE("purging route %p via non-router but gateway\n",
 					  rt);
 				return -1;
-- 
1.7.7.6
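
The one-character change is easier to see in isolation.  The sketch below
is a stand-alone model of the test, not kernel code: purge_before() and
purge_after() are names made up here, and NTF_ROUTER is redefined locally
with the value the kernel's neighbour.h uses.  Note that the trace message
in the hunk ("via non-router but gateway") matches the restored condition.

```c
#include <assert.h>

/* Local copy of the flag bit for this sketch. */
#define NTF_ROUTER 0x80

/* Test as it stood before this patch: purge when the neighbour is a router. */
static int purge_before(unsigned char neigh_flags)
{
	return (neigh_flags & NTF_ROUTER) != 0;
}

/* Test with this patch: purge when the neighbour is NOT a router. */
static int purge_after(unsigned char neigh_flags)
{
	int purge = !(neigh_flags & NTF_ROUTER);

	/* The patch is a pure inversion of the previous test. */
	assert(purge != purge_before(neigh_flags));
	return purge;
}
```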


* Re: [v4 net-next PATCH 1/3] Added kernel support in EEE Ethtool commands
From: Ben Hutchings @ 2012-06-07 16:28 UTC
  To: Yuval Mintz; +Cc: davem, netdev, eilong, peppe.cavallaro
In-Reply-To: <1339038788-3447-2-git-send-email-yuvalmin@broadcom.com>

On Thu, 2012-06-07 at 06:13 +0300, Yuval Mintz wrote:
> This patch extends the kernel's ethtool interface by adding support
> for 2 new EEE commands - get_eee and set_eee.
> 
> Thanks go to Giuseppe Cavallaro for his original patch adding this support.
> 
> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
> Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>

> ---
>  include/linux/ethtool.h |   35 +++++++++++++++++++++++++++++++++++
>  net/core/ethtool.c      |   40 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 75 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index e17fa71..a518361 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -137,6 +137,35 @@ struct ethtool_eeprom {
>  };
>  
>  /**
> + * struct ethtool_eee - Energy Efficient Ethernet information
> + * @cmd: ETHTOOL_{G,S}EEE
> + * @supported: Mask of %SUPPORTED_* flags for the speed/duplex combinations
> + *	for which there is EEE support.
> + * @advertised: Mask of %ADVERTISED_* flags for the speed/duplex combinations
> + *	advertised as eee capable.
> + * @lp_advertised: Mask of %ADVERTISED_* flags for the speed/duplex
> + *	combinations advertised by the link partner as eee capable.
> + * @eee_active: Result of the eee auto negotiation.
> + * @eee_enabled: EEE configured mode (enabled/disabled).
> + * @tx_lpi_enabled: Whether the interface should assert its tx lpi, given
> + *	that eee was negotiated.
> + * @tx_lpi_timer: Time in microseconds the interface delays prior to asserting
> + *	its tx lpi (after reaching 'idle' state). Effective only when eee
> + *	was negotiated and tx_lpi_enabled was set.
> + */
> +struct ethtool_eee {
> +	__u32	cmd;
> +	__u32	supported;
> +	__u32	advertised;
> +	__u32	lp_advertised;
> +	__u32	eee_active;
> +	__u32	eee_enabled;
> +	__u32	tx_lpi_enabled;
> +	__u32	tx_lpi_timer;
> +	__u32	reserved[2];
> +};
> +
> +/**
>   * struct ethtool_modinfo - plugin module eeprom information
>   * @cmd: %ETHTOOL_GMODULEINFO
>   * @type: Standard the module information conforms to %ETH_MODULE_SFF_xxxx
> @@ -945,6 +974,8 @@ static inline u32 ethtool_rxfh_indir_default(u32 index, u32 n_rx_rings)
>   * @get_module_info: Get the size and type of the eeprom contained within
>   *	a plug-in module.
>   * @get_module_eeprom: Get the eeprom information from the plug-in module
> + * @get_eee: Get Energy-Efficient (EEE) supported and status.
> + * @set_eee: Set EEE status (enable/disable) as well as LPI timers.
>   *
>   * All operations are optional (i.e. the function pointer may be set
>   * to %NULL) and callers must take this into account.  Callers must
> @@ -1011,6 +1042,8 @@ struct ethtool_ops {
>  				   struct ethtool_modinfo *);
>  	int     (*get_module_eeprom)(struct net_device *,
>  				     struct ethtool_eeprom *, u8 *);
> +	int	(*get_eee)(struct net_device *, struct ethtool_eee *);
> +	int	(*set_eee)(struct net_device *, struct ethtool_eee *);
>  
> 
>  };
> @@ -1089,6 +1122,8 @@ struct ethtool_ops {
>  #define ETHTOOL_GET_TS_INFO	0x00000041 /* Get time stamping and PHC info */
>  #define ETHTOOL_GMODULEINFO	0x00000042 /* Get plug-in module information */
>  #define ETHTOOL_GMODULEEEPROM	0x00000043 /* Get plug-in module eeprom */
> +#define ETHTOOL_GEEE		0x00000044 /* Get EEE settings */
> +#define ETHTOOL_SEEE		0x00000045 /* Set EEE settings */
>  
>  /* compatibility with older code */
>  #define SPARC_ETH_GSET		ETHTOOL_GSET
> diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> index 9c2afb4..5a582da 100644
> --- a/net/core/ethtool.c
> +++ b/net/core/ethtool.c
> @@ -729,6 +729,40 @@ static int ethtool_set_wol(struct net_device *dev, char __user *useraddr)
>  	return dev->ethtool_ops->set_wol(dev, &wol);
>  }
>  
> +static int ethtool_get_eee(struct net_device *dev, char __user *useraddr)
> +{
> +	struct ethtool_eee edata;
> +	int rc;
> +
> +	if (!dev->ethtool_ops->get_eee)
> +		return -EOPNOTSUPP;
> +
> +	memset(&edata, 0, sizeof(struct ethtool_eee));
> +	edata.cmd = ETHTOOL_GEEE;
> +	rc = dev->ethtool_ops->get_eee(dev, &edata);
> +
> +	if (rc)
> +		return rc;
> +
> +	if (copy_to_user(useraddr, &edata, sizeof(edata)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int ethtool_set_eee(struct net_device *dev, char __user *useraddr)
> +{
> +	struct ethtool_eee edata;
> +
> +	if (!dev->ethtool_ops->set_eee)
> +		return -EOPNOTSUPP;
> +
> +	if (copy_from_user(&edata, useraddr, sizeof(edata)))
> +		return -EFAULT;
> +
> +	return dev->ethtool_ops->set_eee(dev, &edata);
> +}
> +
>  static int ethtool_nway_reset(struct net_device *dev)
>  {
>  	if (!dev->ethtool_ops->nway_reset)
> @@ -1471,6 +1505,12 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>  		rc = ethtool_set_value_void(dev, useraddr,
>  				       dev->ethtool_ops->set_msglevel);
>  		break;
> +	case ETHTOOL_GEEE:
> +		rc = ethtool_get_eee(dev, useraddr);
> +		break;
> +	case ETHTOOL_SEEE:
> +		rc = ethtool_set_eee(dev, useraddr);
> +		break;
>  	case ETHTOOL_NWAY_RST:
>  		rc = ethtool_nway_reset(dev);
>  		break;
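
For readers following along outside the kernel tree, the shape of
ethtool_get_eee() above (optional callback, zeroed reply, early return on
error) can be modelled in plain C.  Everything below is a stand-in: struct
eee_data, struct dev_ops, do_get_eee() and demo_get_eee() are names invented
for this sketch, and the struct copy replaces copy_to_user().

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Cut-down stand-ins for the kernel types; only what the sketch needs. */
struct eee_data {
	unsigned int cmd;
	unsigned int eee_enabled;
	unsigned int tx_lpi_timer;
};

struct dev_ops {
	int (*get_eee)(struct eee_data *);	/* optional, may be NULL */
};

#define CMD_GEEE 0x44	/* same number as ETHTOOL_GEEE in the patch */

static int do_get_eee(const struct dev_ops *ops, struct eee_data *out)
{
	struct eee_data edata;
	int rc;

	assert(ops != NULL && out != NULL);
	if (!ops->get_eee)
		return -EOPNOTSUPP;	/* driver does not implement it */

	memset(&edata, 0, sizeof(edata));	/* no stale data in the reply */
	edata.cmd = CMD_GEEE;
	rc = ops->get_eee(&edata);
	if (rc)
		return rc;

	*out = edata;	/* stands in for copy_to_user() */
	return 0;
}

/* Hypothetical driver callback used to exercise the dispatcher. */
static int demo_get_eee(struct eee_data *e)
{
	e->eee_enabled = 1;
	e->tx_lpi_timer = 100;
	return 0;
}
```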

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* [PATCH] ARM: bpf_jit: BPF_S_ANC_ALU_XOR_X support
From: Mircea Gherzan @ 2012-06-07 15:40 UTC
  To: linux; +Cc: mgherzan, netdev, linux-arm-kernel, davem

JIT support for the XOR operation introduced by the commit
ffe06c17afbb.

Signed-off-by: Mircea Gherzan <mgherzan@gmail.com>
---
 arch/arm/net/bpf_jit_32.c |    5 +++++
 arch/arm/net/bpf_jit_32.h |    4 ++++
 2 files changed, 9 insertions(+)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 62135849..c641fb6 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -762,6 +762,11 @@ b_epilogue:
 			update_on_xread(ctx);
 			emit(ARM_MOV_R(r_A, r_X), ctx);
 			break;
+		case BPF_S_ANC_ALU_XOR_X:
+			/* A ^= X */
+			update_on_xread(ctx);
+			emit(ARM_EOR_R(r_A, r_A, r_X), ctx);
+			break;
 		case BPF_S_ANC_PROTOCOL:
 			/* A = ntohs(skb->protocol) */
 			ctx->seen |= SEEN_SKB;
diff --git a/arch/arm/net/bpf_jit_32.h b/arch/arm/net/bpf_jit_32.h
index 99ae5e3..7fa2f7d 100644
--- a/arch/arm/net/bpf_jit_32.h
+++ b/arch/arm/net/bpf_jit_32.h
@@ -68,6 +68,8 @@
 #define ARM_INST_CMP_R		0x01500000
 #define ARM_INST_CMP_I		0x03500000
 
+#define ARM_INST_EOR_R		0x00200000
+
 #define ARM_INST_LDRB_I		0x05d00000
 #define ARM_INST_LDRB_R		0x07d00000
 #define ARM_INST_LDRH_I		0x01d000b0
@@ -132,6 +134,8 @@
 #define ARM_CMP_R(rn, rm)	_AL3_R(ARM_INST_CMP, 0, rn, rm)
 #define ARM_CMP_I(rn, imm)	_AL3_I(ARM_INST_CMP, 0, rn, imm)
 
+#define ARM_EOR_R(rd, rn, rm)	_AL3_R(ARM_INST_EOR, rd, rn, rm)
+
 #define ARM_LDR_I(rt, rn, off)	(ARM_INST_LDR_I | (rt) << 12 | (rn) << 16 \
 				 | (off))
 #define ARM_LDRB_I(rt, rn, off)	(ARM_INST_LDRB_I | (rt) << 12 | (rn) << 16 \
-- 
1.7.10
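
As a sanity check on the new opcode constant, the A32 encoding of a
register-operand EOR can be built by hand.  The sketch below is not the
JIT's code: encode_eor_r() is a name made up here, the condition field is
folded into the encoder for brevity, and r4/r5 in the usage note are
arbitrary register numbers picked for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define COND_AL		0xeu		/* "always" condition code */
#define INST_EOR_R	0x00200000u	/* same value as ARM_INST_EOR_R above */

/* Encode "eor rd, rn, rm" (register operand, no shift, flags untouched). */
static uint32_t encode_eor_r(unsigned rd, unsigned rn, unsigned rm)
{
	assert(rd < 16 && rn < 16 && rm < 16);
	/* cond[31:28] | opcode bits | Rn[19:16] | Rd[15:12] | Rm[3:0] */
	return (COND_AL << 28) | INST_EOR_R | (rn << 16) | (rd << 12) | rm;
}
```

encode_eor_r(4, 4, 5) gives 0xe0244005, which is what an assembler emits
for "eor r4, r4, r5".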


* [PATCH] net: neighbour: fix neigh_dump_info()
From: Eric Dumazet @ 2012-06-07 14:58 UTC
  To: Denys Fedoryshchenko, David Miller; +Cc: netdev, Stephen Hemminger
In-Reply-To: <1339078935.5083.13.camel@edumazet-glaptop>

From: Eric Dumazet <edumazet@google.com>

Denys found out "ip neigh" output was truncated to
about 54 neighbours.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
---
 net/core/neighbour.c |   14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index eb09f8b..d81d026 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -2219,9 +2219,7 @@ static int neigh_dump_table(struct neigh_table *tbl, struct sk_buff *skb,
 	rcu_read_lock_bh();
 	nht = rcu_dereference_bh(tbl->nht);
 
-	for (h = 0; h < (1 << nht->hash_shift); h++) {
-		if (h < s_h)
-			continue;
+	for (h = s_h; h < (1 << nht->hash_shift); h++) {
 		if (h > s_h)
 			s_idx = 0;
 		for (n = rcu_dereference_bh(nht->hash_buckets[h]), idx = 0;
@@ -2260,9 +2258,7 @@ static int pneigh_dump_table(struct neigh_table *tbl, struct sk_buff *skb,
 
 	read_lock_bh(&tbl->lock);
 
-	for (h = 0; h <= PNEIGH_HASHMASK; h++) {
-		if (h < s_h)
-			continue;
+	for (h = s_h; h <= PNEIGH_HASHMASK; h++) {
 		if (h > s_h)
 			s_idx = 0;
 		for (n = tbl->phash_buckets[h], idx = 0; n; n = n->next) {
@@ -2297,7 +2293,7 @@ static int neigh_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 	struct neigh_table *tbl;
 	int t, family, s_t;
 	int proxy = 0;
-	int err = 0;
+	int err;
 
 	read_lock(&neigh_tbl_lock);
 	family = ((struct rtgenmsg *) nlmsg_data(cb->nlh))->rtgen_family;
@@ -2311,7 +2307,7 @@ static int neigh_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 
 	s_t = cb->args[0];
 
-	for (tbl = neigh_tables, t = 0; tbl && (err >= 0);
+	for (tbl = neigh_tables, t = 0; tbl;
 	     tbl = tbl->next, t++) {
 		if (t < s_t || (family && tbl->family != family))
 			continue;
@@ -2322,6 +2318,8 @@ static int neigh_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 			err = pneigh_dump_table(tbl, skb, cb);
 		else
 			err = neigh_dump_table(tbl, skb, cb);
+		if (err < 0)
+			break;
 	}
 	read_unlock(&neigh_tbl_lock);
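
One way to read the neigh_dump_info() hunks: with the error test in the for
condition, the increment (t++) still runs once after a dump routine returns
an error, so the index kept for the next call points past the table that was
only partly dumped.  The toy model below shows just that loop shape.  It
assumes t is the value saved for resuming, and resume_index_old() /
resume_index_new() are names invented for this sketch.

```c
#include <assert.h>

/*
 * Both helpers walk ntables tables; the dump of table full_at "fails"
 * (buffer full).  They return the index that would be saved for resuming.
 */
static int resume_index_old(int ntables, int full_at)
{
	int t, err = 0;

	assert(ntables >= 0);
	/* Error checked in the loop condition: t++ runs before the check. */
	for (t = 0; t < ntables && err >= 0; t++)
		err = (t == full_at) ? -1 : 0;
	return t;
}

static int resume_index_new(int ntables, int full_at)
{
	int t, err;

	assert(ntables >= 0);
	for (t = 0; t < ntables; t++) {
		err = (t == full_at) ? -1 : 0;
		if (err < 0)
			break;	/* leave t pointing at the unfinished table */
	}
	return t;
}
```

With two tables and the buffer filling during table 0, the old loop shape
yields 1 (table 0 is never resumed) and the new one yields 0.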
 


* Re: tcp wifi upload performance and lots of ACKs
From: Ben Greear @ 2012-06-07 14:41 UTC
  To: David Laight; +Cc: Daniel Baluta, netdev
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B6F3C@saturn3.aculab.com>

On 06/07/2012 05:20 AM, David Laight wrote:
>
>
>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org
>> [mailto:netdev-owner@vger.kernel.org] On Behalf Of Ben Greear
>> Sent: 07 June 2012 05:16
>> To: Daniel Baluta
>> Cc: netdev
>> Subject: Re: tcp wifi upload performance and lots of ACKs
>>
>> On 06/04/2012 12:22 PM, Daniel Baluta wrote:
>>> On Mon, Jun 4, 2012 at 9:29 PM, Ben Greear<greearb@candelatech.com> wrote:
>>>> I'm doing some TCP performance testing on wifi -> LAN interface
>>>> connections.  With UDP, we can get around 250Mbps of payload
>>>> throughput.  With TCP, max is about 80Mbps.
>>>>
>>>> I think the problem is that there are way too many ACK packets, and
>>>> bi-directional traffic on wifi interfaces really slows things down.
>>>> (About 7000 pkts per second in upload direction, 2000 pps download.
>>>> And the vast majority of the download pkts are 66 byte ACK pkts from
>>>> what I can tell.)
>>
>>> [1] http://marc.info/?l=linux-netdev&m=131983649130350&w=2
>>
>> After a bit more playing, I did notice a reliable 5% increase in
>> traffic (200Mbps ->  210Mbps) from changing the delack segments
>> to 20 from the default of 1.  That is enough to be useful to me,
>> and there may be more significant gains to be found...
>> I haven't done a full matrix of testing yet.
>
> Does this delaying of acks have a detrimental effect on the
> sending end?
> I've seen very bad interactions between delayed acks and
> (I believe) the 'slow start' code on connections with
> one-directional data, Nagle disabled and very low RTT.
>
> What I saw was the sender sending 4 data packets, then
> sitting waiting for an ack - in spite of accumulating
> several kB of data to send.
>
> Delaying acks further will only make this worse.

I'm sure it's not for everyone in all cases.  In my case, I'm
sending long-term bulk transfers, at high speeds, over a wifi network
which has some latency.  I have tested only one-way traffic so far.

With the patch and delayed acks, I get more sender throughput than
without (200Mbps -> 210Mbps).

Thanks,
Ben

>
> 	David
>
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* [PATCH net-next] be2net: Fix driver load for VFs for Lancer
From: Padmanabh Ratnakar @ 2012-06-07 14:37 UTC (permalink / raw)
  To: netdev; +Cc: Padmanabh Ratnakar

The permanent MAC is wrongly supplied in the create iface command. Call
the command with no MAC address; the MAC address should then be queried
and applied afterwards.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be_cmds.c |   21 +++---
 drivers/net/ethernet/emulex/benet/be_cmds.h |    8 +-
 drivers/net/ethernet/emulex/benet/be_main.c |   98 ++++++++++++++------------
 3 files changed, 66 insertions(+), 61 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.c b/drivers/net/ethernet/emulex/benet/be_cmds.c
index 8d06ea3..f899752 100644
--- a/drivers/net/ethernet/emulex/benet/be_cmds.c
+++ b/drivers/net/ethernet/emulex/benet/be_cmds.c
@@ -1132,7 +1132,7 @@ err:
  * Uses MCCQ
  */
 int be_cmd_if_create(struct be_adapter *adapter, u32 cap_flags, u32 en_flags,
-		u8 *mac, u32 *if_handle, u32 *pmac_id, u32 domain)
+		     u32 *if_handle, u32 domain)
 {
 	struct be_mcc_wrb *wrb;
 	struct be_cmd_req_if_create *req;
@@ -1152,17 +1152,13 @@ int be_cmd_if_create(struct be_adapter *adapter, u32 cap_flags, u32 en_flags,
 	req->hdr.domain = domain;
 	req->capability_flags = cpu_to_le32(cap_flags);
 	req->enable_flags = cpu_to_le32(en_flags);
-	if (mac)
-		memcpy(req->mac_addr, mac, ETH_ALEN);
-	else
-		req->pmac_invalid = true;
+
+	req->pmac_invalid = true;
 
 	status = be_mcc_notify_wait(adapter);
 	if (!status) {
 		struct be_cmd_resp_if_create *resp = embedded_payload(wrb);
 		*if_handle = le32_to_cpu(resp->interface_id);
-		if (mac)
-			*pmac_id = le32_to_cpu(resp->pmac_id);
 	}
 
 err:
@@ -2330,8 +2326,8 @@ err:
 }
 
 /* Uses synchronous MCCQ */
-int be_cmd_get_mac_from_list(struct be_adapter *adapter, u32 domain,
-			bool *pmac_id_active, u32 *pmac_id, u8 *mac)
+int be_cmd_get_mac_from_list(struct be_adapter *adapter, u8 *mac,
+			     bool *pmac_id_active, u32 *pmac_id, u8 domain)
 {
 	struct be_mcc_wrb *wrb;
 	struct be_cmd_req_get_mac_list *req;
@@ -2376,8 +2372,9 @@ int be_cmd_get_mac_from_list(struct be_adapter *adapter, u32 domain,
 						get_mac_list_cmd.va;
 		mac_count = resp->true_mac_count + resp->pseudo_mac_count;
 		/* Mac list returned could contain one or more active mac_ids
-		 * or one or more pseudo permanant mac addresses. If an active
-		 * mac_id is present, return first active mac_id found
+		 * or one or more true or pseudo permanant mac addresses.
+		 * If an active mac_id is present, return first active mac_id
+		 * found.
 		 */
 		for (i = 0; i < mac_count; i++) {
 			struct get_list_macaddr *mac_entry;
@@ -2396,7 +2393,7 @@ int be_cmd_get_mac_from_list(struct be_adapter *adapter, u32 domain,
 				goto out;
 			}
 		}
-		/* If no active mac_id found, return first pseudo mac addr */
+		/* If no active mac_id found, return first mac addr */
 		*pmac_id_active = false;
 		memcpy(mac, resp->macaddr_list[0].mac_addr_id.macaddr,
 								ETH_ALEN);
diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.h b/drivers/net/ethernet/emulex/benet/be_cmds.h
index 9625bf4..2f6bb06 100644
--- a/drivers/net/ethernet/emulex/benet/be_cmds.h
+++ b/drivers/net/ethernet/emulex/benet/be_cmds.h
@@ -1664,8 +1664,7 @@ extern int be_cmd_pmac_add(struct be_adapter *adapter, u8 *mac_addr,
 extern int be_cmd_pmac_del(struct be_adapter *adapter, u32 if_id,
 			int pmac_id, u32 domain);
 extern int be_cmd_if_create(struct be_adapter *adapter, u32 cap_flags,
-			u32 en_flags, u8 *mac, u32 *if_handle, u32 *pmac_id,
-			u32 domain);
+			    u32 en_flags, u32 *if_handle, u32 domain);
 extern int be_cmd_if_destroy(struct be_adapter *adapter, int if_handle,
 			u32 domain);
 extern int be_cmd_eq_create(struct be_adapter *adapter,
@@ -1751,8 +1750,9 @@ extern int be_cmd_get_cntl_attributes(struct be_adapter *adapter);
 extern int be_cmd_req_native_mode(struct be_adapter *adapter);
 extern int be_cmd_get_reg_len(struct be_adapter *adapter, u32 *log_size);
 extern void be_cmd_get_regs(struct be_adapter *adapter, u32 buf_len, void *buf);
-extern int be_cmd_get_mac_from_list(struct be_adapter *adapter, u32 domain,
-				bool *pmac_id_active, u32 *pmac_id, u8 *mac);
+extern int be_cmd_get_mac_from_list(struct be_adapter *adapter, u8 *mac,
+				    bool *pmac_id_active, u32 *pmac_id,
+				    u8 domain);
 extern int be_cmd_set_mac_list(struct be_adapter *adapter, u8 *mac_array,
 						u8 mac_count, u32 domain);
 extern int be_cmd_set_hsw_config(struct be_adapter *adapter, u16 pvid,
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index f29827f..896f283 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -2601,8 +2601,8 @@ static int be_vf_setup(struct be_adapter *adapter)
 	cap_flags = en_flags = BE_IF_FLAGS_UNTAGGED | BE_IF_FLAGS_BROADCAST |
 				BE_IF_FLAGS_MULTICAST;
 	for_all_vfs(adapter, vf_cfg, vf) {
-		status = be_cmd_if_create(adapter, cap_flags, en_flags, NULL,
-					  &vf_cfg->if_handle, NULL, vf + 1);
+		status = be_cmd_if_create(adapter, cap_flags, en_flags,
+					  &vf_cfg->if_handle, vf + 1);
 		if (status)
 			goto err;
 	}
@@ -2642,29 +2642,43 @@ static void be_setup_init(struct be_adapter *adapter)
 	adapter->phy.forced_port_speed = -1;
 }
 
-static int be_add_mac_from_list(struct be_adapter *adapter, u8 *mac)
+static int be_get_mac_addr(struct be_adapter *adapter, u8 *mac, u32 if_handle,
+			   bool *active_mac, u32 *pmac_id)
 {
-	u32 pmac_id;
-	int status;
-	bool pmac_id_active;
+	int status = 0;
 
-	status = be_cmd_get_mac_from_list(adapter, 0, &pmac_id_active,
-							&pmac_id, mac);
-	if (status != 0)
-		goto do_none;
+	if (!is_zero_ether_addr(adapter->netdev->perm_addr)) {
+		memcpy(mac, adapter->netdev->dev_addr, ETH_ALEN);
+		if (!lancer_chip(adapter) && !be_physfn(adapter))
+			*active_mac = true;
+		else
+			*active_mac = false;
 
-	if (pmac_id_active) {
-		status = be_cmd_mac_addr_query(adapter, mac,
-				MAC_ADDRESS_TYPE_NETWORK,
-				false, adapter->if_handle, pmac_id);
+		return status;
+	}
 
-		if (!status)
-			adapter->pmac_id[0] = pmac_id;
+	if (lancer_chip(adapter)) {
+		status = be_cmd_get_mac_from_list(adapter, mac,
+						  active_mac, pmac_id, 0);
+		if (*active_mac) {
+			status = be_cmd_mac_addr_query(adapter, mac,
+						       MAC_ADDRESS_TYPE_NETWORK,
+						       false, if_handle,
+						       *pmac_id);
+		}
+	} else if (be_physfn(adapter)) {
+		/* For BE3, for PF get permanent MAC */
+		status = be_cmd_mac_addr_query(adapter, mac,
+					       MAC_ADDRESS_TYPE_NETWORK, true,
+					       0, 0);
+		*active_mac = false;
 	} else {
-		status = be_cmd_pmac_add(adapter, mac,
-				adapter->if_handle, &adapter->pmac_id[0], 0);
+		/* For BE3, for VF get soft MAC assigned by PF*/
+		status = be_cmd_mac_addr_query(adapter, mac,
+					       MAC_ADDRESS_TYPE_NETWORK, false,
+					       if_handle, 0);
+		*active_mac = true;
 	}
-do_none:
 	return status;
 }
 
@@ -2685,12 +2699,12 @@ static int be_get_config(struct be_adapter *adapter)
 
 static int be_setup(struct be_adapter *adapter)
 {
-	struct net_device *netdev = adapter->netdev;
 	struct device *dev = &adapter->pdev->dev;
 	u32 cap_flags, en_flags;
 	u32 tx_fc, rx_fc;
 	int status;
 	u8 mac[ETH_ALEN];
+	bool active_mac;
 
 	be_setup_init(adapter);
 
@@ -2716,14 +2730,6 @@ static int be_setup(struct be_adapter *adapter)
 	if (status)
 		goto err;
 
-	memset(mac, 0, ETH_ALEN);
-	status = be_cmd_mac_addr_query(adapter, mac, MAC_ADDRESS_TYPE_NETWORK,
-			true /*permanent */, 0, 0);
-	if (status)
-		return status;
-	memcpy(adapter->netdev->dev_addr, mac, ETH_ALEN);
-	memcpy(adapter->netdev->perm_addr, mac, ETH_ALEN);
-
 	en_flags = BE_IF_FLAGS_UNTAGGED | BE_IF_FLAGS_BROADCAST |
 			BE_IF_FLAGS_MULTICAST | BE_IF_FLAGS_PASS_L3L4_ERRORS;
 	cap_flags = en_flags | BE_IF_FLAGS_MCAST_PROMISCUOUS |
@@ -2733,27 +2739,29 @@ static int be_setup(struct be_adapter *adapter)
 		cap_flags |= BE_IF_FLAGS_RSS;
 		en_flags |= BE_IF_FLAGS_RSS;
 	}
+
 	status = be_cmd_if_create(adapter, cap_flags, en_flags,
-			netdev->dev_addr, &adapter->if_handle,
-			&adapter->pmac_id[0], 0);
+				  &adapter->if_handle, 0);
 	if (status != 0)
 		goto err;
 
-	 /* The VF's permanent mac queried from card is incorrect.
-	  * For BEx: Query the mac configued by the PF using if_handle
-	  * For Lancer: Get and use mac_list to obtain mac address.
-	  */
-	if (!be_physfn(adapter)) {
-		if (lancer_chip(adapter))
-			status = be_add_mac_from_list(adapter, mac);
-		else
-			status = be_cmd_mac_addr_query(adapter, mac,
-					MAC_ADDRESS_TYPE_NETWORK, false,
-					adapter->if_handle, 0);
-		if (!status) {
-			memcpy(adapter->netdev->dev_addr, mac, ETH_ALEN);
-			memcpy(adapter->netdev->perm_addr, mac, ETH_ALEN);
-		}
+	memset(mac, 0, ETH_ALEN);
+	active_mac = false;
+	status = be_get_mac_addr(adapter, mac, adapter->if_handle,
+				 &active_mac, &adapter->pmac_id[0]);
+	if (status != 0)
+		goto err;
+
+	if (!active_mac) {
+		status = be_cmd_pmac_add(adapter, mac, adapter->if_handle,
+					 &adapter->pmac_id[0], 0);
+		if (status != 0)
+			goto err;
+	}
+
+	if (is_zero_ether_addr(adapter->netdev->dev_addr)) {
+		memcpy(adapter->netdev->dev_addr, mac, ETH_ALEN);
+		memcpy(adapter->netdev->perm_addr, mac, ETH_ALEN);
 	}
 
 	status = be_tx_qs_create(adapter);
-- 
1.6.0.2

^ permalink raw reply related

* Re: Change in alloc_skb() behavior in 3.2+ kernels?
From: Eric Dumazet @ 2012-06-07 14:25 UTC (permalink / raw)
  To: Grant Edwards; +Cc: netdev
In-Reply-To: <jqqd4k$i2c$1@dough.gmane.org>

On Thu, 2012-06-07 at 14:16 +0000, Grant Edwards wrote:

> I was merely pointing out that the API was indeed documented that way.


Good, you are right and we were wrong.

Hopefully you can still use linux-2

^ permalink raw reply

* Re: ip neigh output are incomplete, 3.4.1
From: Eric Dumazet @ 2012-06-07 14:22 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: netdev, Stephen Hemminger
In-Reply-To: <ea0a16e085389e68b480c4b0baf05645@visp.net.lb>

On Thu, 2012-06-07 at 16:09 +0300, Denys Fedoryshchenko wrote:
> I have a host with large L2 network (around 100 L2TP tunnels bridged to 
> one interface). 3.4.1 kernel, x86, 32bit.
> 
> ip route add 172.16.0.0/16 dev br0
> 
> GlobalNAT ~ # cat /proc/net/arp |wc -l
> 2
> GlobalNAT ~ # cat /proc/net/arp |wc -l
> 3575
> GlobalNAT ~ # cat /proc/net/arp |wc -l
> 4613
> GlobalNAT ~ # cat /proc/net/arp |wc -l
> 5117
> 
> And at same time
> GlobalNAT /config # ip neigh |wc -l
> 52

Thanks for the report, I am testing a fix and will send a patch ASAP.

^ permalink raw reply

* Re: Change in alloc_skb() behavior in 3.2+ kernels?
From: Grant Edwards @ 2012-06-07 14:16 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1339077710.5083.12.camel@edumazet-glaptop>

On Thu, Jun 07, 2012 at 04:01:50PM +0200, Eric Dumazet wrote:
> On Thu, 2012-06-07 at 13:23 +0000, Grant Edwards wrote:
> > On 2012-06-06, David Miller <davem@davemloft.net> wrote:

> > > It was never a formal API that we would only allocate 'size'
> > > amount of tailroom.
> >
> > How can you say that?

> Documentation was stale, so what ?

So there _was_ a formal API that said you would only allocate 'size'
amount of tailroom.  That's what.

> kmalloc(99) doesn't allocate 99 bytes but 128, so what?

Doing so violated the documented API.

You said there was never any API definition that said tailroom() ==
requested size, and implied that it was stupid to write code that
expected tailroom() == requested size.

I was merely pointing out that the API was indeed documented that way.

> Grant, what about you fix your code ?

I did.

And the API documentation has now been fixed as well, but don't try to
tell me that the API documentation didn't promise to work the way my
code expected it to work.

-- 
Grant Edwards               grant.b.edwards        Yow! Youth of today!
                                  at               Join me in a mass rally
                              gmail.com            for traditional mental
                                                   attitudes!

^ permalink raw reply

* Re: Change in alloc_skb() behavior in 3.2+ kernels?
From: Eric Dumazet @ 2012-06-07 14:01 UTC (permalink / raw)
  To: Grant Edwards; +Cc: netdev
In-Reply-To: <jqqa1b$kug$1@dough.gmane.org>

On Thu, 2012-06-07 at 13:23 +0000, Grant Edwards wrote:
> On 2012-06-06, David Miller <davem@davemloft.net> wrote:
> > From: Grant Edwards <grant.b.edwards@gmail.com>
> > Date: Wed, 6 Jun 2012 18:59:19 +0000 (UTC)
> >
> >> At the time it was written (probably 10+ years ago) it was relying on
> >> the documented API for alloc_skb() that stated alloc_skb() either
> >> returned an sk_buff of the requested size or it failed.
> >
> > It was never a formal API that we would only allocate 'size'
> > amount of tailroom.
> 
> How can you say that?


Documentation was stale, so what ?

kmalloc(99) doesn't allocate 99 bytes but 128, so what ?

Grant, what about you fix your code ?

^ permalink raw reply

* Re: [PATCH (net.git) V2] stmmac: fix driver built w/ w/o both pci and platf modules
From: Fengguang Wu @ 2012-06-07 13:38 UTC (permalink / raw)
  To: Giuseppe CAVALLARO; +Cc: netdev, davem
In-Reply-To: <1339073322-23093-1-git-send-email-peppe.cavallaro@st.com>

Tested-by: Fengguang Wu <wfg@linux.intel.com>

Thanks!

^ permalink raw reply

* Re: Change in alloc_skb() behavior in 3.2+ kernels?
From: Grant Edwards @ 2012-06-07 13:23 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20120606.120247.1618312724057709285.davem@davemloft.net>

On 2012-06-06, David Miller <davem@davemloft.net> wrote:
> From: Grant Edwards <grant.b.edwards@gmail.com>
> Date: Wed, 6 Jun 2012 18:59:19 +0000 (UTC)
>
>> At the time it was written (probably 10+ years ago) it was relying on
>> the documented API for alloc_skb() that stated alloc_skb() either
>> returned an sk_buff of the requested size or it failed.
>
> It was never a formal API that we would only allocate 'size'
> amount of tailroom.

How can you say that?

From skbuff.c:

    /**
     * __alloc_skb - allocate a network buffer
     * @size: size to allocate
     * @gfp_mask: allocation mask
     * @fclone: allocate from fclone cache instead of head cache
     *          and allocate a cloned (child) skb
     * @node: numa node to allocate memory on
     *
>>>  * Allocate a new &sk_buff. The returned buffer has no headroom and a
>>>  * tail room of size bytes. The object has a reference count of one.
     * The return is the buffer. On a failure the return is %NULL.
     *
     * Buffers may only be allocated from interrupts using a @gfp_mask of
     * %GFP_ATOMIC.
     */

-- 
Grant Edwards               grant.b.edwards        Yow! Did you move a lot of
                                  at               KOREAN STEAK KNIVES this
                              gmail.com            trip, Dingy?

^ permalink raw reply

* ip neigh output are incomplete, 3.4.1
From: Denys Fedoryshchenko @ 2012-06-07 13:09 UTC (permalink / raw)
  To: netdev, Stephen Hemminger

I have a host with large L2 network (around 100 L2TP tunnels bridged to 
one interface). 3.4.1 kernel, x86, 32bit.

ip route add 172.16.0.0/16 dev br0

GlobalNAT ~ # cat /proc/net/arp |wc -l
2
GlobalNAT ~ # cat /proc/net/arp |wc -l
3575
GlobalNAT ~ # cat /proc/net/arp |wc -l
4613
GlobalNAT ~ # cat /proc/net/arp |wc -l
5117

And at same time
GlobalNAT /config # ip neigh |wc -l
52
GlobalNAT /config # ip neigh
172.16.1.94 dev br0 lladdr ea:2b:dd:8c:a6:96 REACHABLE
172.16.188.36 dev br0  INCOMPLETE
172.16.21.67 dev br0 lladdr aa:75:b5:04:01:1a REACHABLE
172.16.199.127 dev br0 lladdr 2e:64:5c:61:92:be REACHABLE
172.16.219.100 dev br0 lladdr 82:2f:33:ec:31:a9 REACHABLE
172.16.212.171 dev br0 lladdr be:d0:90:8a:97:35 REACHABLE
172.16.134.232 dev br0  FAILED
172.16.47.155 dev br0 lladdr 52:2e:4b:d1:d7:73 REACHABLE
172.16.67.128 dev br0 lladdr 22:c8:b2:a6:d8:17 REACHABLE
172.16.107.74 dev br0 lladdr 5a:ed:4f:35:32:94 REACHABLE
172.16.176.131 dev br0 lladdr b6:cf:5c:6e:84:88 REACHABLE
172.16.196.104 dev br0 lladdr e6:ec:2a:77:8d:ab REACHABLE
172.16.167.249 dev br0 lladdr be:13:c5:b2:be:f6 REACHABLE
172.16.49.108 dev br0 lladdr ea:97:c7:a9:a5:40 REACHABLE
172.16.71.34 dev br0 lladdr 2e:27:29:da:fc:2e REACHABLE
172.16.42.179 dev br0  INCOMPLETE
172.16.33.41 dev br0 lladdr 2e:64:5c:61:92:be REACHABLE
172.16.4.186 dev br0 lladdr 16:f9:a1:00:9d:c9 REACHABLE
172.16.104.51 dev br0 lladdr c6:1c:7e:2f:fe:1e REACHABLE
172.16.242.165 dev br0 lladdr 5a:ed:4f:35:32:94 REACHABLE
172.16.124.24 dev br0  INCOMPLETE
172.16.57.176 dev br0 lladdr 4e:70:0f:5c:d0:f3 REACHABLE
172.16.206.125 dev br0 lladdr 82:41:c6:78:56:36 REACHABLE
172.16.128.186 dev br0  INCOMPLETE
172.16.10.45 dev br0 lladdr 5e:15:f2:8d:8a:ab REACHABLE
172.16.21.136 dev br0 lladdr aa:75:b5:04:01:1a REACHABLE
172.16.61.82 dev br0 lladdr be:13:c5:b2:be:f6 REACHABLE
172.16.248.24 dev br0 lladdr 1a:f7:d5:2f:98:36 REACHABLE
172.16.239.142 dev br0 lladdr c6:1c:7e:2f:fe:1e REACHABLE
172.16.14.207 dev br0 lladdr 5e:15:f2:8d:8a:ab REACHABLE
172.16.134.45 dev br0 lladdr 8e:5a:26:d7:e6:ba REACHABLE
172.16.252.186 dev br0 lladdr 62:79:87:21:b4:46 REACHABLE
172.16.154.18 dev br0 lladdr 2a:39:28:65:80:37 REACHABLE
172.16.194.220 dev br0 lladdr ba:18:38:24:86:07 REACHABLE
172.16.76.79 dev br0 lladdr 62:25:4b:70:9e:f2 REACHABLE
172.16.234.166 dev br0 lladdr 6e:67:51:cd:a6:d4 REACHABLE
172.16.67.197 dev br0 lladdr 22:c8:b2:a6:d8:17 REACHABLE
172.16.49.177 dev br0 lladdr ea:97:c7:a9:a5:40 REACHABLE
172.16.118.234 dev br0 lladdr e2:3a:a3:0d:02:4d REACHABLE
172.16.169.15 dev br0  INCOMPLETE
172.16.100.214 dev br0 lladdr 56:a4:8f:1f:46:58 REACHABLE
172.16.71.103 dev br0 lladdr 2e:27:29:da:fc:2e REACHABLE
172.16.91.76 dev br0 lladdr d6:a0:18:9f:a3:21 REACHABLE
172.16.229.190 dev br0 lladdr d6:38:bd:40:02:4f REACHABLE
172.16.93.29 dev br0 lladdr 82:35:37:e8:11:c2 REACHABLE
172.16.113.2 dev br0 lladdr c6:5e:c6:f5:44:5e REACHABLE
172.16.104.120 dev br0 lladdr c6:1c:7e:2f:fe:1e REACHABLE
172.16.95.238 dev br0  INCOMPLETE
172.16.106.73 dev br0 lladdr 5a:ed:4f:35:32:94 REACHABLE
172.16.146.19 dev br0 lladdr ea:2b:dd:8c:a6:96 REACHABLE
172.16.235.49 dev br0 lladdr be:d0:90:8a:97:35 REACHABLE
172.16.255.22 dev br0  INCOMPLETE

"ip neigh show dev br0" shows the same, 52 hosts only.
Setting a larger rcvbuf does not help, for example:
ip -rcvbuf 1000000 ip neigh

Short sample from /proc/net/arp
172.16.41.227    0x1         0x2         5a:ed:4f:35:32:94     *        br0
172.16.61.200    0x1         0x2         be:13:c5:b2:be:f6     *        br0
172.16.239.4     0x1         0x2         c6:1c:7e:2f:fe:1e     *        br0
172.16.3.234     0x1         0x2         5a:ed:4f:35:32:94     *        br0
172.16.161.65    0x1         0x2         96:cf:d5:8a:85:7f     *        br0
172.16.14.69     0x1         0x2         5e:15:f2:8d:8a:ab     *        br0
172.16.212.102   0x1         0x2         be:d0:90:8a:97:35     *        br0
172.16.252.48    0x1         0x2         62:79:87:21:b4:46     *        br0
172.16.125.25    0x1         0x2         d6:3f:35:6c:69:a5     *        br0
172.16.56.224    0x1         0x2         d6:a0:18:9f:a3:21     *        br0
172.16.76.197    0x1         0x2         62:25:4b:70:9e:f2     *        br0
172.16.67.59     0x1         0x2         22:c8:b2:a6:d8:17     *        br0
172.16.156.89    0x1         0x2         6e:bd:24:97:4f:fb     *        br0
172.16.107.5     0x1         0x2         5a:ed:4f:35:32:94     *        br0
172.16.245.119   0x1         0x2         5a:1d:72:ba:39:1c     *        br0
172.16.176.62    0x1         0x2         b6:cf:5c:6e:84:88     *        br0

---
Denys Fedoryshchenko, Network Engineer, Virtual ISP S.A.L.

^ permalink raw reply

* Re: NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out
From: Thomas Meyer @ 2012-06-07 12:37 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Eric Dumazet, Linux Kernel Mailing List, jcliburn, chris.snook,
	netdev, Josh Boyer
In-Reply-To: <20120606003856.GA7839@burratino>

Am Dienstag, den 05.06.2012, 19:38 -0500 schrieb Jonathan Nieder:
> In February, 2012, Thomas Meyer wrote:
> > Am Freitag, den 24.02.2012, 20:20 +0100 schrieb Eric Dumazet:
> 
> >> Here is a cumulative patch to hopefuly remove the races in this driver,
> >> could you please test it ?
> [...]
> > just building a 3.2.7 kernel with your patch applied. I will watch out
> > for the warning in the next days.
> 
> Well, did it work? :)

Hi Jonathan,

No, it didn't. I still get these warnings.

With kind regards
thomas

> 
> In suspense,
> Jonathan

^ permalink raw reply
