* Re: [PATCH net-next 00/10] tg3: Bugfixes and enhancements
From: David Miller @ 2010-11-24 19:06 UTC (permalink / raw)
To: mcarlson; +Cc: netdev, andy
In-Reply-To: <1290623514-18193-1-git-send-email-mcarlson@broadcom.com>
From: "Matt Carlson" <mcarlson@broadcom.com>
Date: Wed, 24 Nov 2010 10:31:44 -0800
> This patchset applies some bugfixes and adds a few performance features
> for the 5719.
All applied, thanks a lot.
^ permalink raw reply
* Re: [PATCH net-next] bnx2x: Disable local BHes to prevent a dead-lock situation
From: David Miller @ 2010-11-24 19:09 UTC (permalink / raw)
To: vladz; +Cc: eilong, netdev, eric.dumazet
In-Reply-To: <1290606310.25676.7.camel@lb-tlvb-vladz>
From: "Vladislav Zolotarov" <vladz@broadcom.com>
Date: Wed, 24 Nov 2010 15:45:10 +0200
> From: Eric Dumazet <eric.dumazet@gmail.com>
>
> According to Eric's suggestion:
> Disable local BHes to prevent a dead-lock situation between sch_direct_xmit()
> (Soft_IRQ context) and bnx2x_tx_int (called by bnx2x_run_loopback() - syscall
> context), as both are taking a netif_tx_lock().
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
> Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Applied.
^ permalink raw reply
* Re: [PATCH net-next] bnx2x: Do interrupt mode initialization and NAPIs adding before register_netdev()
From: David Miller @ 2010-11-24 19:10 UTC (permalink / raw)
To: vladz; +Cc: eilong, netdev, mchan
In-Reply-To: <1290612078.27220.2.camel@lb-tlvb-vladz>
From: "Vladislav Zolotarov" <vladz@broadcom.com>
Date: Wed, 24 Nov 2010 17:21:18 +0200
> Move the interrupt mode configuration and the adding of NAPIs before the
> register_netdev() call to prevent netdev->open() from running
> before these functions are done.
>
> Advance a driver version number.
>
> Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
> Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
> Reported-by: Michael Chan <mchan@broadcom.com>
Also applied, thanks.
^ permalink raw reply
* Re: [PATCH 1/4] stmmac: tidy-up stmmac_priv structure
From: David Miller @ 2010-11-24 19:14 UTC (permalink / raw)
To: peppe.cavallaro; +Cc: netdev
In-Reply-To: <1290561646-9429-1-git-send-email-peppe.cavallaro@st.com>
From: Peppe CAVALLARO <peppe.cavallaro@st.com>
Date: Wed, 24 Nov 2010 13:37:58 +0100
> This patch tidies up the stmmac_priv structure,
> which had many fields already defined in the
> plat_stmmacenet_data structure.
>
> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Applied.
^ permalink raw reply
* Re: [PATCH 2/4] stmmac: add init/exit callback in plat_stmmacenet_data struct
From: David Miller @ 2010-11-24 19:14 UTC (permalink / raw)
To: peppe.cavallaro; +Cc: netdev
In-Reply-To: <1290561646-9429-2-git-send-email-peppe.cavallaro@st.com>
From: Peppe CAVALLARO <peppe.cavallaro@st.com>
Date: Wed, 24 Nov 2010 13:38:05 +0100
> This patch adds init and exit callbacks to the
> plat_stmmacenet_data structure; they can be used
> for invoking specific platform functions.
> For example, on ST targets, these call the
> PAD manager functions to set PIO lines and
> syscfg registers.
> The patch also removes stmmac_claim_resource,
> which was only used on STM kernels.
>
> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Applied.
^ permalink raw reply
* Re: [PATCH 3/4] stmmac: convert to dev_pm_ops.
From: David Miller @ 2010-11-24 19:14 UTC (permalink / raw)
To: peppe.cavallaro; +Cc: netdev
In-Reply-To: <1290561646-9429-3-git-send-email-peppe.cavallaro@st.com>
From: Peppe CAVALLARO <peppe.cavallaro@st.com>
Date: Wed, 24 Nov 2010 13:38:11 +0100
> This patch updates the PM support using the dev_pm_ops
> and reviews the hibernation support.
>
> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Applied.
^ permalink raw reply
* Re: [PATCH 4/4] stmmac: update the driver version
From: David Miller @ 2010-11-24 19:14 UTC (permalink / raw)
To: peppe.cavallaro; +Cc: netdev
In-Reply-To: <1290561646-9429-4-git-send-email-peppe.cavallaro@st.com>
From: Peppe CAVALLARO <peppe.cavallaro@st.com>
Date: Wed, 24 Nov 2010 13:38:17 +0100
> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Applied.
^ permalink raw reply
* Re: [PATCH net-next-2.6] ipv6: mcast: RCU conversion
From: David Miller @ 2010-11-24 19:17 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev
In-Reply-To: <1290553935.2866.82.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 24 Nov 2010 00:12:15 +0100
> ipv6_sk_mc_lock rwlock becomes a spinlock.
>
> Readers (inet6_mc_check()) now take rcu_read_lock() instead of the read
> lock. Writers don't need to disable BH anymore.
>
> struct ipv6_mc_socklist objects are reclaimed after one RCU grace
> period.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied.
^ permalink raw reply
* Re: [PATCH net-next-2.6] scm: lower SCM_MAX_FD
From: David Miller @ 2010-11-24 19:17 UTC (permalink / raw)
To: eric.dumazet; +Cc: vegard.nossum, linux-kernel, akpm, eugene, netdev
In-Reply-To: <1290557355.2866.117.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 24 Nov 2010 01:09:15 +0100
> [PATCH net-next-2.6] net: scm: lower SCM_MAX_FD
>
> Lower SCM_MAX_FD from 255 to 253 so that allocations for scm_fp_list are
> halved. (commit f8d570a4 added two pointers in this structure)
>
> scm_fp_dup() should not copy the whole structure (and trigger kmemcheck
> warnings), but only the used part. While we are at it, only allocate
> the needed size.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Also applied, thanks Eric.
^ permalink raw reply
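The "halved" claim in the SCM_MAX_FD patch can be checked with a little arithmetic. The sketch below is only a model: the two mock structures assume a layout of two list pointers (the fields the quoted commit is said to have added), a counter, and the pointer array, and size_class() assumes an allocator that rounds requests up to a power of two. None of these names come from the kernel source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structure; the real layout
 * may differ. Only the array length changes between the two. */
struct mock_fp_list_255 {
	void *next, *prev;	/* the two added list pointers */
	int count;
	void *fp[255];
};

struct mock_fp_list_253 {
	void *next, *prev;
	int count;
	void *fp[253];
};

/* Round a request up to the next power of two, as an assumed model
 * of a size-class allocator. */
static size_t size_class(size_t n)
{
	size_t c = 1;

	assert(n > 0);
	while (c < n)
		c <<= 1;
	return c;
}
```

On a typical LP64 build the two mock structures come to 2064 and 2048 bytes, so under this model the first lands in a 4096-byte class and the second fits a 2048-byte class exactly.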
* Re: [PATCH 3/8] ethoc: enable interrupts after napi_complete
From: Laurent Chavey @ 2010-11-24 19:33 UTC (permalink / raw)
To: Jonas Bonn; +Cc: netdev, Adam Edvardsson
In-Reply-To: <AANLkTim=Y6NGk8MSq=hJTrBKaYuk4SJ_7wA=f+UBy3d0@mail.gmail.com>
Actually, my previous comments were not correct.
The check for work_done < budget will
only cause an extra call with work_done == 0
if no more work is done,
so that will work.
^ permalink raw reply
* Re: [PATCH 1/8] ethoc: Add device tree configuration
From: David Miller @ 2010-11-24 19:35 UTC (permalink / raw)
To: jonas; +Cc: netdev
In-Reply-To: <1290606058-26703-2-git-send-email-jonas@southpole.se>
From: Jonas Bonn <jonas@southpole.se>
Date: Wed, 24 Nov 2010 14:40:51 +0100
> This patch adds the ability to describe ethernet devices via a flattened
> device tree. As device tree remains an optional feature, these bits all
> need to be guarded by CONFIG_OF ifdefs.
>
> MAC address is settable via the device tree parameter "local-mac-address";
> however, the selection of the phy id is limited to probing, for now.
>
> Signed-off-by: Jonas Bonn <jonas@southpole.se>
...
> + }
Trailing whitespace.
^ permalink raw reply
* [PATCH] hso: fix disable_net
From: Filip Aben @ 2010-11-24 19:35 UTC (permalink / raw)
To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
jhovold-Re5JQEeQqe8AvxtiuMwx3w, pki-/L4m51SJ8HhmR6Xm/wNWPw,
j.dumon-x9gZzRpC1QbQT0dZR+AlfA
The HSO driver incorrectly creates a serial device instead of a net
device when disable_net is set. It shouldn't create anything for the
network interface.
Signed-off-by: Filip Aben <f.aben-x9gZzRpC1QbQT0dZR+AlfA@public.gmane.org>
---
diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c
index b154a94..b05c235 100644
--- a/drivers/net/usb/hso.c
+++ b/drivers/net/usb/hso.c
@@ -2994,10 +2994,10 @@ static int hso_probe(struct usb_interface *interface,
case HSO_INTF_BULK:
/* It's a regular bulk interface */
- if (((port_spec & HSO_PORT_MASK) == HSO_PORT_NETWORK) &&
- !disable_net)
+ if ((port_spec & HSO_PORT_MASK) == HSO_PORT_NETWORK) {
+ if(!disable_net)
hso_dev = hso_create_net_device(interface, port_spec);
- else
+ } else
hso_dev =
hso_create_bulk_serial_device(interface, port_spec);
if (!hso_dev)
^ permalink raw reply related
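The behaviour change in the hso patch is easy to miss in diff form, so here is a small model of the branch before and after. The enum and function names are hypothetical; only the branching mirrors the quoted hunk, with "none" standing for hso_dev being left unset.

```c
#include <stdbool.h>

/* Hypothetical labels for what the probe path ends up creating. */
enum ex_hso_dev { EX_DEV_NONE, EX_DEV_NET, EX_DEV_SERIAL };

/* Old logic: a network port with disable_net set falls through to
 * the else branch and gets a bulk serial device. */
static enum ex_hso_dev ex_hso_before(bool is_network_port, bool disable_net)
{
	if (is_network_port && !disable_net)
		return EX_DEV_NET;
	else
		return EX_DEV_SERIAL;
}

/* New logic: the serial branch is only reached for non-network
 * ports, so a disabled network port creates nothing. */
static enum ex_hso_dev ex_hso_after(bool is_network_port, bool disable_net)
{
	if (is_network_port) {
		if (!disable_net)
			return EX_DEV_NET;
		return EX_DEV_NONE;
	} else {
		return EX_DEV_SERIAL;
	}
}
```

The two versions differ only for the combination of a network port with disable_net set, which is the case the commit message describes.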
* Re: [PATCH 4/8] ethoc: prevent overflow of rx counter
From: David Miller @ 2010-11-24 19:36 UTC (permalink / raw)
To: jonas; +Cc: netdev
In-Reply-To: <1290606058-26703-5-git-send-email-jonas@southpole.se>
From: Jonas Bonn <jonas@southpole.se>
Date: Wed, 24 Nov 2010 14:40:54 +0100
> Rewind cur_rx to prevent it from overflowing.
>
> Signed-off-by: Jonas Bonn <jonas@southpole.se>
...
> + /* Prevent overflow of priv->cur_rx by rewinding it */
Trailing whitespace, also please integrate the feedback from Ben
and Eric Dumazet about making this computation less expensive.
^ permalink raw reply
* Re: [PATCH 5/8] ethoc: Double check pending RX packet
From: David Miller @ 2010-11-24 19:37 UTC (permalink / raw)
To: jonas; +Cc: netdev
In-Reply-To: <1290606058-26703-6-git-send-email-jonas@southpole.se>
From: Jonas Bonn <jonas@southpole.se>
Date: Wed, 24 Nov 2010 14:40:55 +0100
> An interrupt may occur between checking bd.stat and clearing the
> interrupt source register which would result in the packet going totally
> unnoticed as the interrupt will be missed. Double check bd.stat after
> clearing the interrupt source register to guard against such an
> occurrence.
>
> Signed-off-by: Jonas Bonn <jonas@southpole.se>
...
> + if (bd.stat & RX_BD_EMPTY)
Trailing whitespace.
> + break;
> +
> + }
Unnecessary empty line.
^ permalink raw reply
* Re: [PATCH 6/8] ethoc: rework interrupt handling
From: David Miller @ 2010-11-24 19:38 UTC (permalink / raw)
To: jonas; +Cc: netdev
In-Reply-To: <1290606058-26703-7-git-send-email-jonas@southpole.se>
From: Jonas Bonn <jonas@southpole.se>
Date: Wed, 24 Nov 2010 14:40:56 +0100
> The old interrupt handling was incorrect in that it did not account for the
> fact that the interrupt source bits get set regardless of whether or not
> their corresponding mask is set. This patch fixes that by masking off the
> source bits for masked interrupts.
>
> Furthermore, the handling of transmission events is moved to the NAPI polling
> handler alongside the reception handler, thus preventing a whole bunch of
> interrupts during heavy traffic.
>
> Signed-off-by: Jonas Bonn <jonas@southpole.se>
> + * and clearing the interrupt source, then we risk
...
> + * right away when we reenable it; hence, check
Trailing whitespace.
> - if ((priv->cur_tx - priv->dty_tx) <= (priv->num_tx / 2))
> + if ((priv->cur_tx - priv->dty_tx) <= (priv->num_tx / 2)) {
> netif_wake_queue(dev);
> + }
>
One-line statement does not require braces.
^ permalink raw reply
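The masking fix described in the commit message of patch 6/8 amounts to a single AND. This is a minimal sketch with made-up bit assignments, not the driver's register layout or its actual code.

```c
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define EX_INT_TX	(1u << 0)
#define EX_INT_RX	(1u << 2)

/* The source register latches events even for masked interrupts, so
 * a handler should act only on bits that are both raised and enabled. */
static uint32_t ex_pending_enabled(uint32_t source, uint32_t mask)
{
	return source & mask;
}
```

With RX enabled and TX masked, a source value with both bits set yields only the RX bit, so the handler would not act on the masked TX event.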
* Re: [PATCH 1/2 v7] xps: Improvements in TX queue selection
From: David Miller @ 2010-11-24 19:45 UTC (permalink / raw)
To: therbert; +Cc: netdev, eric.dumazet
In-Reply-To: <alpine.DEB.2.00.1011211501180.14901@pokey.mtv.corp.google.com>
From: Tom Herbert <therbert@google.com>
Date: Sun, 21 Nov 2010 15:17:29 -0800 (PST)
> In dev_pick_tx, don't do work in calculating queue index or setting
> the index in the sock unless the device has more than one queue. This
> allows the sock to be set only with a queue index of a multi-queue
> device which is desirable if devices are stacked like in a tunnel.
>
> We also allow the mapping of a socket to queue to be changed. To
> maintain in-order packet transmission a flag (ooo_okay) has been
> added to the sk_buff structure. If a transport layer sets this flag
> on a packet, the transmit queue can be changed for the socket.
> Presumably, the transport would set this if there was no possibility
> of creating OOO packets (for instance, there are no packets in flight
> for the socket). This patch includes the modification in TCP output
> for setting this flag.
>
> Signed-off-by: Tom Herbert <therbert@google.com>
Applied.
^ permalink raw reply
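The selection rule described in patch 1/2 can be modelled as a pure function. This is a sketch under assumptions: the function name and parameters are hypothetical, and computed_queue stands in for whatever the stack would calculate for the packet.

```c
#include <stdbool.h>

/* Hypothetical helper: returns the TX queue index to use.
 * cached_queue < 0 means the socket has no queue recorded yet. */
static int ex_pick_tx(unsigned int num_tx_queues, int cached_queue,
		      bool ooo_okay, int computed_queue)
{
	/* Single-queue device: no calculation, nothing stored in the sock. */
	if (num_tx_queues <= 1)
		return 0;

	/* Keep the recorded queue unless the transport says reordering is
	 * impossible right now (for instance, nothing is in flight). */
	if (cached_queue >= 0 && !ooo_okay)
		return cached_queue;

	return computed_queue;
}
```

The design point is that the socket-to-queue mapping changes only when ooo_okay is set, so packets of one flow stay on one queue while any are still in flight.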
* Re: [PATCH 2/2 v7] xps: Transmit Packet Steering
From: David Miller @ 2010-11-24 19:45 UTC (permalink / raw)
To: therbert; +Cc: netdev, eric.dumazet
In-Reply-To: <alpine.DEB.2.00.1011211501430.14906@pokey.mtv.corp.google.com>
From: Tom Herbert <therbert@google.com>
Date: Sun, 21 Nov 2010 15:17:27 -0800 (PST)
> This patch implements transmit packet steering (XPS) for multiqueue
> devices. XPS selects a transmit queue during packet transmission based
> on configuration. This is done by mapping the CPU transmitting the
> packet to a queue. This is the transmit side analogue to RPS-- where
> RPS is selecting a CPU based on receive queue, XPS selects a queue
> based on the CPU (previously there was an XPS patch from Eric
> Dumazet, but that might more appropriately be called transmit completion
> steering).
>
> Each transmit queue can be associated with a number of CPUs which will
> use the queue to send packets. This is configured as a CPU mask on a
> per queue basis in:
>
> /sys/class/net/eth<n>/queues/tx-<n>/xps_cpus
>
> The mappings are stored per device in an inverted data structure that
> maps CPUs to queues. In the netdevice structure this is an array of
> num_possible_cpu structures where each structure holds an array of
> queue_indexes for queues which that CPU can use.
>
> The benefits of XPS are improved locality in the per queue data
> structures. Also, transmit completions are more likely to be done
> nearer to the sending thread, so this should promote locality back
> to the socket on free (e.g. UDP). The benefits of XPS are dependent on
> cache hierarchy, application load, and other factors. XPS would
> nominally be configured so that a queue would only be shared by CPUs
> which are sharing a cache; the degenerate configuration would be that
> each CPU has its own queue.
>
> Below are some benchmark results which show the potential benefit of
> this patch. The netperf test has 500 instances of netperf TCP_RR test
> with 1 byte req. and resp.
>
> bnx2x on 16 core AMD
> XPS (16 queues, 1 TX queue per CPU) 1234K at 100% CPU
> No XPS (16 queues) 996K at 100% CPU
>
> Signed-off-by: Tom Herbert <therbert@google.com>
Applied, please consider Eric's feedback about map NUMA node placement.
^ permalink raw reply
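The "inverted data structure" from the XPS description can be sketched in userspace as follows. Everything here is an assumption made for illustration: the fixed sizes, the names, and the modulo used to choose among several queues are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define EX_NUM_CPUS	4
#define EX_NUM_QUEUES	4

/* Per-CPU entry of the inverted map: the TX queues this CPU may use. */
struct ex_cpu_map {
	unsigned int len;
	uint16_t queues[EX_NUM_QUEUES];
};

/* Build the CPU -> queues map from per-queue CPU bitmasks, which is
 * the form in which masks are configured through xps_cpus. */
static void ex_build_map(const uint32_t queue_cpus[EX_NUM_QUEUES],
			 struct ex_cpu_map map[EX_NUM_CPUS])
{
	unsigned int cpu, q;

	for (cpu = 0; cpu < EX_NUM_CPUS; cpu++) {
		map[cpu].len = 0;
		for (q = 0; q < EX_NUM_QUEUES; q++)
			if (queue_cpus[q] & (1u << cpu))
				map[cpu].queues[map[cpu].len++] = (uint16_t)q;
	}
}

/* Pick a queue for a packet sent from cpu. Returns -1 when the CPU has
 * no mapping, so the caller can fall back to its usual selection. */
static int ex_pick_queue(const struct ex_cpu_map map[EX_NUM_CPUS],
			 unsigned int cpu, uint32_t flow_hash)
{
	const struct ex_cpu_map *m;

	assert(cpu < EX_NUM_CPUS);
	m = &map[cpu];
	if (m->len == 0)
		return -1;
	return m->queues[flow_hash % m->len];
}
```

With masks 0x3 for tx-0 and 0xc for tx-1, CPUs 0 and 1 map to queue 0 and CPUs 2 and 3 map to queue 1, which corresponds to the "queue shared by CPUs that share a cache" configuration mentioned in the message.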
* Re: possible kernel oops from user MSS
From: David Miller @ 2010-11-24 19:47 UTC (permalink / raw)
To: mzhang; +Cc: netdev
In-Reply-To: <4CDDC6EE.2010005@mvista.com>
From: Min Zhang <mzhang@mvista.com>
Date: Fri, 12 Nov 2010 14:59:58 -0800
> Regarding commit 7a1abd08d52fdeddb3e9a5a33f2f15cc6a5674d2 ("tcp:
> Increase TCP_MAXSEG socket option minimum"): what is the reason for
> the TCP_MAXSEG minimum being 64? Shouldn't the exact value be 40, which is
> TCPOLEN_MD5SIG_ALIGNED(20) + TCPOLEN_TSTAMP_ALIGNED(12) + 8?
>
> Or is it better to use TCP_MIN_MSS from tcp.h:
>
> /* Minimal accepted MSS. It is (60+60+8) - (20+20). */
> #define TCP_MIN_MSS 88U
Committed to net-2.6:
--------------------
From c39508d6f118308355468314ff414644115a07f3 Mon Sep 17 00:00:00 2001
From: David S. Miller <davem@davemloft.net>
Date: Wed, 24 Nov 2010 11:47:22 -0800
Subject: [PATCH] tcp: Make TCP_MAXSEG minimum more correct.
Use TCP_MIN_MSS instead of constant 64.
Reported-by: Min Zhang <mzhang@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
net/ipv4/tcp.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 0814199..f15c36a 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2246,7 +2246,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
/* Values greater than interface MTU won't take effect. However
* at the point when this call is done we typically don't yet
* know which interface is going to be used */
- if (val < 64 || val > MAX_TCP_WINDOW) {
+ if (val < TCP_MIN_MSS || val > MAX_TCP_WINDOW) {
err = -EINVAL;
break;
}
--
1.7.3.2
^ permalink raw reply related
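The bounds check after this change can be tried out in userspace. The sketch below is not the kernel function: the names are hypothetical, the lower bound follows the quoted tcp.h comment, (60+60+8) - (20+20) = 88, and the upper bound of 32767 is an assumed value for MAX_TCP_WINDOW that should be checked against include/net/tcp.h.

```c
#include <errno.h>

#define EX_TCP_MIN_MSS		88U	/* (60+60+8) - (20+20) */
#define EX_MAX_TCP_WINDOW	32767U	/* assumed value */

/* Returns 0 when val would be accepted as a TCP_MAXSEG value under
 * the patched check, -EINVAL otherwise. */
static int ex_check_maxseg(int val)
{
	if (val < 0)
		return -EINVAL;
	if ((unsigned int)val < EX_TCP_MIN_MSS ||
	    (unsigned int)val > EX_MAX_TCP_WINDOW)
		return -EINVAL;
	return 0;
}
```

Under this model, a request for 64, the old minimum, is now rejected, while 88 is the smallest accepted value.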
* Unplug ethernet cable, the route persists. Why?
From: Mike Caoco @ 2010-11-24 19:48 UTC (permalink / raw)
To: Netdev, LKML; +Cc: caoco2002
Hello,
This may have been discussed, but all search engines couldn't give me a good answer...
I notice that when an interface is up/running, a local route is in the routing table:
$ ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:13:20:0e:2f:ed
inet addr:192.168.1.125 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::213:20ff:fe0e:2fed/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:35984995 errors:0 dropped:0 overruns:0 frame:0
TX packets:7409151 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3252413825 (3.2 GB) TX bytes:1340077250 (1.3 GB)
$ ip route
192.168.20.0/24 dev eth0 proto kernel scope link src 192.168.20.120
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.125
default via 192.168.20.254 dev eth1 metric 100
After I unplug the cable from eth1, the RUNNING flag disappears, but the route is still there:
$ ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:13:20:0e:2f:ed
inet addr:192.168.1.125 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::213:20ff:fe0e:2fed/64 Scope:Link
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:35985023 errors:0 dropped:0 overruns:0 frame:0
TX packets:7409151 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3252415633 (3.2 GB) TX bytes:1340077250 (1.3 GB)
$ ip route
192.168.20.0/24 dev eth0 proto kernel scope link src 192.168.20.120
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.125
default via 192.168.20.254 dev eth1 metric 100
And that *prevents* the default route from being used to reach the 192.168.1/24 subnet after eth1 is down.
I looked at the code, it seems the IFF_RUNNING flag change is ignored in dev_change_flags():
void __dev_notify_flags(struct net_device *dev, unsigned int old_flags)
{
.....
if (dev->flags & IFF_UP &&
(changes & ~(IFF_UP | IFF_PROMISC | IFF_ALLMULTI | IFF_VOLATILE)))
call_netdevice_notifiers(NETDEV_CHANGE, dev);
}
I searched the Internet and saw some people suggest using an application listener (e.g., netplug) to remove the route.
My question is: why can't the kernel remove the route automatically when the link goes down? Why should this complexity be pushed to the user to find a program to do that?
Thanks,
Joe
^ permalink raw reply
* Re: Generalizing mmap'ed sockets
From: Michael S. Tsirkin @ 2010-11-24 19:57 UTC (permalink / raw)
To: Tom Herbert; +Cc: David Miller, rick.jones2, netdev
In-Reply-To: <AANLkTik7HU0wyub3krZb_aABsHp-_LvNLtAr5CaAo4_A@mail.gmail.com>
On Fri, Nov 19, 2010 at 02:49:46PM -0800, Tom Herbert wrote:
> > I think the ACK (or for UDP, the kfree_skb() after TX completes) should
> > move the consumer pointer. Otherwise you have to copy, and the ACKs
> > do not clock the sender process properly.
> >
> Right, with the caveat that even ACK'ed data might still go out on
> the wire, as was discussed in the vmsplice() related patches. I
> don't think this should make the problem any worse.
Or any better. Sigh. Any idea how to actually track pages
in question so we can either really know when the stack is no longer
referencing them, or force a copy if they hang around after ack?
--
MST
^ permalink raw reply
* Re: Unplug ethernet cable, the route persists. Why?
From: Stephen Hemminger @ 2010-11-24 20:18 UTC (permalink / raw)
To: Mike Caoco; +Cc: Netdev, LKML
In-Reply-To: <242082.99180.qm@web63407.mail.re1.yahoo.com>
On Wed, 24 Nov 2010 11:48:03 -0800 (PST)
Mike Caoco <caoco2002@yahoo.com> wrote:
> Hello,
>
> This may have been discussed, but all search engines couldn't give me a good answer...
>
> I notice that when an interface is up/running, a local route is in the routing table:
>
> $ ifconfig eth1
> eth1 Link encap:Ethernet HWaddr 00:13:20:0e:2f:ed
> inet addr:192.168.1.125 Bcast:192.168.1.255 Mask:255.255.255.0
> inet6 addr: fe80::213:20ff:fe0e:2fed/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:35984995 errors:0 dropped:0 overruns:0 frame:0
> TX packets:7409151 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:3252413825 (3.2 GB) TX bytes:1340077250 (1.3 GB)
>
> $ ip route
> 192.168.20.0/24 dev eth0 proto kernel scope link src 192.168.20.120
> 192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.125
> default via 192.168.20.254 dev eth1 metric 100
>
> After I unplug the cable from eth1, the RUNNING flag disappears, but the route is still there:
>
> $ ifconfig eth1
> eth1 Link encap:Ethernet HWaddr 00:13:20:0e:2f:ed
> inet addr:192.168.1.125 Bcast:192.168.1.255 Mask:255.255.255.0
> inet6 addr: fe80::213:20ff:fe0e:2fed/64 Scope:Link
> UP BROADCAST MULTICAST MTU:1500 Metric:1
> RX packets:35985023 errors:0 dropped:0 overruns:0 frame:0
> TX packets:7409151 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:3252415633 (3.2 GB) TX bytes:1340077250 (1.3 GB)
>
> $ ip route
> 192.168.20.0/24 dev eth0 proto kernel scope link src 192.168.20.120
> 192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.125
> default via 192.168.20.254 dev eth1 metric 100
>
> And that *prevents* the default route from being used to reach the 192.168.1/24 subnet after eth1 is down.
>
> I looked at the code, it seems the IFF_RUNNING flag change is ignored in dev_change_flags():
>
> void __dev_notify_flags(struct net_device *dev, unsigned int old_flags)
> {
> .....
> if (dev->flags & IFF_UP &&
> (changes & ~(IFF_UP | IFF_PROMISC | IFF_ALLMULTI | IFF_VOLATILE)))
> call_netdevice_notifiers(NETDEV_CHANGE, dev);
> }
>
> I searched the Internet and saw some people suggest using an application listener (e.g., netplug) to remove the route.
>
> My question is: why can't the kernel remove the route automatically when the link goes down? Why should this complexity be pushed to the user to find a program to do that?
>
Because there is no reason for the kernel to not expect the link to come back.
It is up to user space to do routing policy. For desktop/laptop users this is
done typically with NetworkManager or Connman; for routers this is done with
Quagga; and for servers use other tools.
If the kernel automatically removed the route, it would cause routing daemons
to recompute the route table (and propagate the change) every time a cable
got pulled or NIC needed to be reset.
^ permalink raw reply
* Re: bonding: propagation of offload settings
From: David Miller @ 2010-11-24 20:27 UTC (permalink / raw)
To: horms; +Cc: netdev, fubar
In-Reply-To: <20101030025435.GF12842@verge.net.au>
From: Simon Horman <horms@verge.net.au>
Date: Sat, 30 Oct 2010 11:54:35 +0900
> It seems to me that from a user point of view it may make more sense to:
>
> a) propagate settings from the master to the slaves and;
> b) possibly disallow setting the slaves directly
Yeah, good question.
Propagation from master to slave(s) would have difficult semantics.
If any of the slave changes fail (f.e. unsupported feature or memory
allocation failure) we'd have to unwind all of the slaves which did
accept the change without error.
What if the unwind operation fails, due to lack of resources? A lot
of state would thus need to be tracked to support this reasonably.
However we pretty much have to respect whatever changes get made
directly to the slaves, since the master must at all times claim
support for only the lowest common denominator, feature wise, amongst
that master's slaves.
^ permalink raw reply
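The "lowest common denominator" rule for the master amounts to intersecting the slaves' feature sets. The sketch below uses made-up feature bits and a hypothetical function name. Returning 0 for a master with no slaves is an assumption of the sketch, not something stated in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define EX_F_SG		(1u << 0)
#define EX_F_CSUM	(1u << 1)
#define EX_F_TSO	(1u << 2)

/* The master may only claim a feature that every slave supports. */
static uint32_t ex_master_features(const uint32_t *slave_features, size_t n)
{
	uint32_t common = ~(uint32_t)0;
	size_t i;

	if (n == 0)
		return 0;
	for (i = 0; i < n; i++)
		common &= slave_features[i];
	return common;
}
```

If one slave lacks TSO, the intersection drops TSO for the master even though other slaves support it. This is why a change made directly on a slave has to be reflected in the master.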
* Re: tg3 driver not advertising 1000mbit
From: Krzysztof Olędzki @ 2010-11-24 20:09 UTC (permalink / raw)
To: Matt Carlson
Cc: Jean-Louis Dupond, Michael Chan, netdev@vger.kernel.org,
David Christensen
In-Reply-To: <20090702164212.GA8430@xw6200.broadcom.net>
On 2009-07-02 18:42, Matt Carlson wrote:
> On Tue, Jun 30, 2009 at 02:20:45AM -0700, Jean-Louis Dupond wrote:
>> # ethtool -i eth0
>> driver: tg3
>> version: 3.97
>> firmware-version: 5722-v3.08, ASFIPMI v6.02
>> bus-info: 0000:01:00.0
>>
>> Kernel version 2.6.29.4
>
> Rats. I mirrored your setup here, but I still can't reproduce the
> problem. I still suspect this is a bad driver <=> firmware interaction.
>
> Can you apply the following patch and show me the resulting syslog
> entries? The patch is just making sure the firmware request to shutdown
> really goes through.
>
>
> diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
> index 46a3f86..900e28b 100644
> --- a/drivers/net/tg3.c
> +++ b/drivers/net/tg3.c
> @@ -1124,6 +1124,9 @@ static void tg3_wait_for_event_ack(struct tg3 *tp)
> break;
> udelay(8);
> }
> +
> + if (i == delay_cnt)
> + printk( KERN_WARNING "Firmware didn't ack driver event!\n" );
> }
>
> /* tp->lock is held. */
> @@ -6330,12 +6333,16 @@ static void tg3_stop_fw(struct tg3 *tp)
> /* Wait for RX cpu to ACK the previous event. */
> tg3_wait_for_event_ack(tp);
>
> + printk( KERN_NOTICE "%s: Stopping firmware.\n", tp->dev->name );
> +
> tg3_write_mem(tp, NIC_SRAM_FW_CMD_MBOX, FWCMD_NICDRV_PAUSE_FW);
>
> tg3_generate_fw_event(tp);
>
> /* Wait for RX cpu to ACK this event. */
> tg3_wait_for_event_ack(tp);
> +
> + printk( KERN_NOTICE "%s: Operation completed.\n", tp->dev->name );
> }
> }
>
> @@ -7537,6 +7544,8 @@ static void tg3_timer(unsigned long __opaque)
> !(tp->tg3_flags3 & TG3_FLG3_ENABLE_APE)) {
> tg3_wait_for_event_ack(tp);
>
> + printk( KERN_NOTICE "%s: Sending keepalive event.\n", tp->dev->name );
> +
> tg3_write_mem(tp, NIC_SRAM_FW_CMD_MBOX,
> FWCMD_NICDRV_ALIVE3);
> tg3_write_mem(tp, NIC_SRAM_FW_CMD_LEN_MBOX, 4);
Hello,
Have you been able to solve this issue? I have a similar problem with
Dell PowerEdge R300 servers connected to HP2610 100Mbps switches. The
servers contain two BCM5722 NICs and after a reboot, with a probability
of about 70%, I end up with 10Mbps HD, mainly on the first NIC.
I discovered that it is enough to run:
/sbin/mii-tool -R eth0
/sbin/mii-tool -R eth1
to trigger a renegotiation that brings up the expected 100Mbps FD. For now, I
added this to my startup scripts as a workaround.
This problem exists in 2.6.30-stable, 2.6.31-stable and 2.6.34-stable
which I'm currently running.
Best regards,
Krzysztof Olędzki
^ permalink raw reply
* Re: Unplug ethernet cable, the route persists. Why?
From: Mike Caoco @ 2010-11-24 20:29 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Netdev, LKML
In-Reply-To: <20101124121826.39dd6ed1@nehalam>
--- On Wed, 11/24/10, Stephen Hemminger <shemminger@vyatta.com> wrote:
> Because there is no reason for the kernel to not expect the link to come back.
> It is up to user space to do routing policy. For desktop/laptop users this is
> done typically with NetworkManager or Connman; for routers this is done with
> Quagga; and for servers use other tools.
>
> If the kernel automatically removed the route, it would cause routing daemons
> to recompute the route table (and propagate the change) every time a cable
> got pulled or NIC needed to be reset.
So if you rely on NetworkManager or Connman or Quagga to remove the route, the routing daemons will recompute the route table anyway. So why can't this be done in the kernel?
Even when no NetworkManager/Quagga is present, I think it is a legitimate reason to recompute the route when a cable is unplugged, which should not be a frequent event except under error conditions.
Thanks,
^ permalink raw reply
* Re: Unplug ethernet cable, the route persists. Why?
From: David Miller @ 2010-11-24 20:44 UTC (permalink / raw)
To: caoco2002; +Cc: shemminger, netdev, linux-kernel
In-Reply-To: <144174.46619.qm@web63401.mail.re1.yahoo.com>
From: Mike Caoco <caoco2002@yahoo.com>
Date: Wed, 24 Nov 2010 12:29:43 -0800 (PST)
> Even when no NetworkManager/Quagga is present, I think it is a
> legitimate reason to recompute the route when a cable is unplugged,
> which should not be a frequent event except under error
> conditions.
Cards periodically reset themselves and faulty switches flap occasionally;
this is life and it shouldn't cause route table recomputations across
your entire region.
Also Stephen listed places where such policy should be employed in
userspace, he absolutely did not say they should act that way by
default.
^ permalink raw reply
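One userspace policy that would tolerate the resets and flaps mentioned in this thread is a hold-down: treat the link as lost only after carrier has been absent for some time. The sketch below is an illustration of that idea, not something the thread prescribes. The names are hypothetical, and the two flag values are assumed to match IFF_UP (0x1) and IFF_RUNNING (0x40) on Linux.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_IFF_UP	0x1u	/* assumed to match IFF_UP */
#define EX_IFF_RUNNING	0x40u	/* assumed to match IFF_RUNNING */

/* Initialise with carrier = true when the link is known to be good. */
struct ex_link_watch {
	bool carrier;		/* carrier state at the last update */
	uint64_t down_since_ms;	/* when carrier was last seen to drop */
};

/* Feed in the interface flags seen at time now_ms. Returns true once
 * carrier has been absent for at least hold_ms, which is the point at
 * which a tool might withdraw the route. */
static bool ex_should_withdraw(struct ex_link_watch *w, unsigned int flags,
			       uint64_t now_ms, uint64_t hold_ms)
{
	bool running = (flags & EX_IFF_UP) && (flags & EX_IFF_RUNNING);

	if (running) {
		w->carrier = true;
		return false;
	}
	if (w->carrier) {
		/* First observation without carrier: start the timer. */
		w->carrier = false;
		w->down_since_ms = now_ms;
	}
	return now_ms - w->down_since_ms >= hold_ms;
}
```

A short flap that recovers within hold_ms never reports a withdrawal, because the timer restarts the next time carrier drops.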