Netdev List
* Re: IPv6 path MTU discovery broken
From: Steinar H. Gunderson @ 2013-09-28 20:51 UTC (permalink / raw)
  To: netdev, edumazet
In-Reply-To: <20130928203318.GC23654@order.stressinduktion.org>

On Sat, Sep 28, 2013 at 10:33:18PM +0200, Hannes Frederic Sowa wrote:
>> Could this be related somehow to the packets coming from 2001:67c:29f4::31,
>> while the default route is to a link-local address? (An RPF issue?) This used
>> to work (although it was often flaky for me) in 3.10 and before. I can't
>> easily bisect, though, as I don't boot this machine too often.
> This looks like a bug and should definitely get fixed. There should be
> no RPF issue. May I have a look at your /proc/net/ipv6_route?

Hi,

I removed all the “weird” routes, and confirmed it fixed the problem.
However, upon adding them back again, the problem was still gone
(despite flushing the route cache).

This means that the issue has gone back to being intermittent, which is of
course the worst kind of bug to track down. :-) I'll dump
/proc/net/ipv6_route and send it to you once I see the bug manifest itself
again, OK?

/* Steinar */
-- 
Homepage: http://www.sesse.net/

* [PATCH net-next ] net ipv4: Convert ipv4.ip_local_port_range to be per netns v3
From: Eric W. Biederman @ 2013-09-28 21:10 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, nicolas.dichtel
In-Reply-To: <8738ovtea9.fsf_-_@tw-ebiederman.twitter.com>


- Move sysctl_local_ports from a global variable into struct netns_ipv4.
- Modify inet_get_local_port_range to take a struct net, and update all
  of the callers.
- Move the initialization of sysctl_local_ports into
   sysctl_net_ipv4.c:ipv4_sysctl_init_net from inet_connection_sock.c

v2:
- Ensure indentation used tabs
- Fixed ip.h so it applies cleanly to today's net-next

v3:
- Compile fixes of strange callers of inet_get_local_port_range.
  This patch now successfully passes an allmodconfig build.
  Removed manual inlining of inet_get_local_port_range in ipv4_local_port_range

Originally-by: Samya <samya@twitter.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/infiniband/core/cma.c   |    2 +-
 drivers/net/vxlan.c             |    2 +-
 include/net/ip.h                |    6 +----
 include/net/netns/ipv4.h        |    6 +++++
 net/ipv4/inet_connection_sock.c |   20 +++++----------
 net/ipv4/inet_hashtables.c      |    2 +-
 net/ipv4/ping.c                 |    4 +--
 net/ipv4/sysctl_net_ipv4.c      |   52 +++++++++++++++++++++++++--------------
 net/ipv4/udp.c                  |    2 +-
 net/openvswitch/vport-vxlan.c   |    2 +-
 net/sctp/socket.c               |    2 +-
 security/selinux/hooks.c        |    2 +-
 12 files changed, 56 insertions(+), 46 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index dab4b41..a082fd9 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2294,7 +2294,7 @@ static int cma_alloc_any_port(struct idr *ps, struct rdma_id_private *id_priv)
 	int low, high, remaining;
 	unsigned int rover;
 
-	inet_get_local_port_range(&low, &high);
+	inet_get_local_port_range(&init_net, &low, &high);
 	remaining = (high - low) + 1;
 	rover = net_random() % remaining + low;
 retry:
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index d1292fe..c376be7 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2089,7 +2089,7 @@ static void vxlan_setup(struct net_device *dev)
 	vxlan->age_timer.function = vxlan_cleanup;
 	vxlan->age_timer.data = (unsigned long) vxlan;
 
-	inet_get_local_port_range(&low, &high);
+	inet_get_local_port_range(dev_net(dev), &low, &high);
 	vxlan->port_min = low;
 	vxlan->port_max = high;
 	vxlan->dst_port = htons(vxlan_port);
diff --git a/include/net/ip.h b/include/net/ip.h
index c1f192b..c82e6ec 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -203,11 +203,7 @@ static inline void snmp_mib_free(void __percpu *ptr[SNMP_ARRAY_SZ])
 	}
 }
 
-extern struct local_ports {
-	seqlock_t	lock;
-	int		range[2];
-} sysctl_local_ports;
-void inet_get_local_port_range(int *low, int *high);
+void inet_get_local_port_range(struct net *net, int *low, int *high);
 
 extern unsigned long *sysctl_local_reserved_ports;
 static inline int inet_is_reserved_local_port(int port)
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index bf2ec22..5dbd232 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -15,6 +15,10 @@ struct fib_rules_ops;
 struct hlist_head;
 struct fib_table;
 struct sock;
+struct local_ports {
+	seqlock_t	lock;
+	int		range[2];
+};
 
 struct netns_ipv4 {
 #ifdef CONFIG_SYSCTL
@@ -62,6 +66,8 @@ struct netns_ipv4 {
 	int sysctl_icmp_ratemask;
 	int sysctl_icmp_errors_use_inbound_ifaddr;
 
+	struct local_ports sysctl_local_ports;
+
 	int sysctl_tcp_ecn;
 
 	kgid_t sysctl_ping_group_range[2];
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 6acb541..7ac7aa1 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -29,27 +29,19 @@ const char inet_csk_timer_bug_msg[] = "inet_csk BUG: unknown timer value\n";
 EXPORT_SYMBOL(inet_csk_timer_bug_msg);
 #endif
 
-/*
- * This struct holds the first and last local port number.
- */
-struct local_ports sysctl_local_ports __read_mostly = {
-	.lock = __SEQLOCK_UNLOCKED(sysctl_local_ports.lock),
-	.range = { 32768, 61000 },
-};
-
 unsigned long *sysctl_local_reserved_ports;
 EXPORT_SYMBOL(sysctl_local_reserved_ports);
 
-void inet_get_local_port_range(int *low, int *high)
+void inet_get_local_port_range(struct net *net, int *low, int *high)
 {
 	unsigned int seq;
 
 	do {
-		seq = read_seqbegin(&sysctl_local_ports.lock);
+		seq = read_seqbegin(&net->ipv4.sysctl_local_ports.lock);
 
-		*low = sysctl_local_ports.range[0];
-		*high = sysctl_local_ports.range[1];
-	} while (read_seqretry(&sysctl_local_ports.lock, seq));
+		*low = net->ipv4.sysctl_local_ports.range[0];
+		*high = net->ipv4.sysctl_local_ports.range[1];
+	} while (read_seqretry(&net->ipv4.sysctl_local_ports.lock, seq));
 }
 EXPORT_SYMBOL(inet_get_local_port_range);
 
@@ -116,7 +108,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
 		int remaining, rover, low, high;
 
 again:
-		inet_get_local_port_range(&low, &high);
+		inet_get_local_port_range(net, &low, &high);
 		remaining = (high - low) + 1;
 		smallest_rover = rover = net_random() % remaining + low;
 
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 7bd8983..2779037 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -494,7 +494,7 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 		u32 offset = hint + port_offset;
 		struct inet_timewait_sock *tw = NULL;
 
-		inet_get_local_port_range(&low, &high);
+		inet_get_local_port_range(net, &low, &high);
 		remaining = (high - low) + 1;
 
 		local_bh_disable();
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index d7d9882..fc8c95d 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -237,11 +237,11 @@ static void inet_get_ping_group_range_net(struct net *net, kgid_t *low,
 	unsigned int seq;
 
 	do {
-		seq = read_seqbegin(&sysctl_local_ports.lock);
+		seq = read_seqbegin(&net->ipv4.sysctl_local_ports.lock);
 
 		*low = data[0];
 		*high = data[1];
-	} while (read_seqretry(&sysctl_local_ports.lock, seq));
+	} while (read_seqretry(&net->ipv4.sysctl_local_ports.lock, seq));
 }
 
 
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 540279f..c08f096 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -43,12 +43,12 @@ static int ip_ping_group_range_min[] = { 0, 0 };
 static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
 
 /* Update system visible IP port range */
-static void set_local_port_range(int range[2])
+static void set_local_port_range(struct net *net, int range[2])
 {
-	write_seqlock(&sysctl_local_ports.lock);
-	sysctl_local_ports.range[0] = range[0];
-	sysctl_local_ports.range[1] = range[1];
-	write_sequnlock(&sysctl_local_ports.lock);
+	write_seqlock(&net->ipv4.sysctl_local_ports.lock);
+	net->ipv4.sysctl_local_ports.range[0] = range[0];
+	net->ipv4.sysctl_local_ports.range[1] = range[1];
+	write_sequnlock(&net->ipv4.sysctl_local_ports.lock);
 }
 
 /* Validate changes from /proc interface. */
@@ -56,6 +56,8 @@ static int ipv4_local_port_range(struct ctl_table *table, int write,
 				 void __user *buffer,
 				 size_t *lenp, loff_t *ppos)
 {
+	struct net *net =
+		container_of(table->data, struct net, ipv4.sysctl_local_ports.range);
 	int ret;
 	int range[2];
 	struct ctl_table tmp = {
@@ -66,14 +68,15 @@ static int ipv4_local_port_range(struct ctl_table *table, int write,
 		.extra2 = &ip_local_port_range_max,
 	};
 
-	inet_get_local_port_range(range, range + 1);
+	inet_get_local_port_range(net, &range[0], &range[1]);
+
 	ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
 
 	if (write && ret == 0) {
 		if (range[1] < range[0])
 			ret = -EINVAL;
 		else
-			set_local_port_range(range);
+			set_local_port_range(net, range);
 	}
 
 	return ret;
@@ -83,23 +86,27 @@ static int ipv4_local_port_range(struct ctl_table *table, int write,
 static void inet_get_ping_group_range_table(struct ctl_table *table, kgid_t *low, kgid_t *high)
 {
 	kgid_t *data = table->data;
+	struct net *net =
+		container_of(table->data, struct net, ipv4.sysctl_ping_group_range);
 	unsigned int seq;
 	do {
-		seq = read_seqbegin(&sysctl_local_ports.lock);
+		seq = read_seqbegin(&net->ipv4.sysctl_local_ports.lock);
 
 		*low = data[0];
 		*high = data[1];
-	} while (read_seqretry(&sysctl_local_ports.lock, seq));
+	} while (read_seqretry(&net->ipv4.sysctl_local_ports.lock, seq));
 }
 
 /* Update system visible IP port range */
 static void set_ping_group_range(struct ctl_table *table, kgid_t low, kgid_t high)
 {
 	kgid_t *data = table->data;
-	write_seqlock(&sysctl_local_ports.lock);
+	struct net *net =
+		container_of(table->data, struct net, ipv4.sysctl_ping_group_range);
+	write_seqlock(&net->ipv4.sysctl_local_ports.lock);
 	data[0] = low;
 	data[1] = high;
-	write_sequnlock(&sysctl_local_ports.lock);
+	write_sequnlock(&net->ipv4.sysctl_local_ports.lock);
 }
 
 /* Validate changes from /proc interface. */
@@ -475,13 +482,6 @@ static struct ctl_table ipv4_table[] = {
 		.proc_handler	= proc_dointvec
 	},
 	{
-		.procname	= "ip_local_port_range",
-		.data		= &sysctl_local_ports.range,
-		.maxlen		= sizeof(sysctl_local_ports.range),
-		.mode		= 0644,
-		.proc_handler	= ipv4_local_port_range,
-	},
-	{
 		.procname	= "ip_local_reserved_ports",
 		.data		= NULL, /* initialized in sysctl_ipv4_init */
 		.maxlen		= 65536,
@@ -854,6 +854,13 @@ static struct ctl_table ipv4_net_table[] = {
 		.proc_handler	= proc_dointvec
 	},
 	{
+		.procname	= "ip_local_port_range",
+		.maxlen		= sizeof(init_net.ipv4.sysctl_local_ports.range),
+		.data		= &init_net.ipv4.sysctl_local_ports.range,
+		.mode		= 0644,
+		.proc_handler	= ipv4_local_port_range,
+	},
+	{
 		.procname	= "tcp_mem",
 		.maxlen		= sizeof(init_net.ipv4.sysctl_tcp_mem),
 		.mode		= 0644,
@@ -888,6 +895,8 @@ static __net_init int ipv4_sysctl_init_net(struct net *net)
 			&net->ipv4.sysctl_ping_group_range;
 		table[7].data =
 			&net->ipv4.sysctl_tcp_ecn;
+		table[8].data =
+			&net->ipv4.sysctl_local_ports.range;
 
 		/* Don't export sysctls to unprivileged users */
 		if (net->user_ns != &init_user_ns)
@@ -901,6 +910,13 @@ static __net_init int ipv4_sysctl_init_net(struct net *net)
 	net->ipv4.sysctl_ping_group_range[0] = make_kgid(&init_user_ns, 1);
 	net->ipv4.sysctl_ping_group_range[1] = make_kgid(&init_user_ns, 0);
 
+	/*
+	 * Set defaults for local port range
+	 */
+	seqlock_init(&net->ipv4.sysctl_local_ports.lock);
+	net->ipv4.sysctl_local_ports.range[0] =  32768;
+	net->ipv4.sysctl_local_ports.range[1] =  61000;
+
 	tcp_init_mem(net);
 
 	net->ipv4.ipv4_hdr = register_net_sysctl(net, "net/ipv4", table);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 74d2c95..460a4c1 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -219,7 +219,7 @@ int udp_lib_get_port(struct sock *sk, unsigned short snum,
 		unsigned short first, last;
 		DECLARE_BITMAP(bitmap, PORTS_PER_CHAIN);
 
-		inet_get_local_port_range(&low, &high);
+		inet_get_local_port_range(net, &low, &high);
 		remaining = (high - low) + 1;
 
 		rand = net_random();
diff --git a/net/openvswitch/vport-vxlan.c b/net/openvswitch/vport-vxlan.c
index a481c03..56e22b7 100644
--- a/net/openvswitch/vport-vxlan.c
+++ b/net/openvswitch/vport-vxlan.c
@@ -173,7 +173,7 @@ static int vxlan_tnl_send(struct vport *vport, struct sk_buff *skb)
 
 	skb->local_df = 1;
 
-	inet_get_local_port_range(&port_min, &port_max);
+	inet_get_local_port_range(net, &port_min, &port_max);
 	src_port = vxlan_src_port(port_min, port_max, skb);
 
 	err = vxlan_xmit_skb(vxlan_port->vs, rt, skb,
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 911b71b..72046b9 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -5890,7 +5890,7 @@ static long sctp_get_port_local(struct sock *sk, union sctp_addr *addr)
 		int low, high, remaining, index;
 		unsigned int rover;
 
-		inet_get_local_port_range(&low, &high);
+		inet_get_local_port_range(sock_net(sk), &low, &high);
 		remaining = (high - low) + 1;
 		rover = net_random() % remaining + low;
 
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index a5091ec..568c769 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3929,7 +3929,7 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 		if (snum) {
 			int low, high;
 
-			inet_get_local_port_range(&low, &high);
+			inet_get_local_port_range(sock_net(sk), &low, &high);
 
 			if (snum < max(PROT_SOCK, low) || snum > high) {
 				err = sel_netport_sid(sk->sk_protocol,
-- 
1.7.10.4
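
The read side of inet_get_local_port_range in the patch above is a sequence-counter retry loop over a range that now lives in each namespace. As a rough illustration only, the same protocol can be modelled in userspace C. Everything below (struct net_sketch, struct local_ports_sketch, set_range, get_range) is a hypothetical stand-in, not kernel API, and the sketch assumes a single writer; the kernel's seqlock_t additionally serializes writers and supplies the required memory barriers.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for struct local_ports. */
struct local_ports_sketch {
	atomic_uint seq;	/* even: stable, odd: write in progress */
	int range[2];
};

/* Hypothetical stand-in for struct net: each namespace owns its own range. */
struct net_sketch {
	struct local_ports_sketch local_ports;
};

static void net_sketch_init(struct net_sketch *net)
{
	atomic_init(&net->local_ports.seq, 0);
	net->local_ports.range[0] = 32768;	/* defaults used by the patch */
	net->local_ports.range[1] = 61000;
}

/* Writer (assumed to be the only one): counter goes odd, update, counter goes even. */
static void set_range(struct net_sketch *net, int low, int high)
{
	atomic_fetch_add(&net->local_ports.seq, 1);
	net->local_ports.range[0] = low;
	net->local_ports.range[1] = high;
	atomic_fetch_add(&net->local_ports.seq, 1);
}

/* Reader: retry until one full read completes with no write in between. */
static void get_range(struct net_sketch *net, int *low, int *high)
{
	unsigned int seq;

	assert(low && high);
	do {
		seq = atomic_load(&net->local_ports.seq);
		*low = net->local_ports.range[0];
		*high = net->local_ports.range[1];
	} while ((seq & 1) || seq != atomic_load(&net->local_ports.seq));
}
```

Changing the range of one net_sketch leaves every other instance at its defaults, which is the per-namespace behaviour the patch description aims for.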

* [PATCH 1/3] tg3: add support a phy at an address different than 01
From: Hauke Mehrtens @ 2013-09-28 21:15 UTC (permalink / raw)
  To: davem; +Cc: nsujir, mchan, netdev, Hauke Mehrtens

When phylib is in use, tg3 only searches address 01 on the MDIO
bus and does not work with any other address. On the BCM4705 SoCs the
switch is connected as a PHY behind the MAC driven by tg3, and it is at
PHY address 30 in most cases. This is a preparation patch to allow
support for such switches.

phy_addr is set to TG3_PHY_MII_ADDR for all devices that use
phylib, so this should not change any behavior.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---
 drivers/net/ethernet/broadcom/tg3.c |   38 +++++++++++++++++------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 221a181..853a05e 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -1375,7 +1375,7 @@ static int tg3_mdio_read(struct mii_bus *bp, int mii_id, int reg)
 
 	spin_lock_bh(&tp->lock);
 
-	if (tg3_readphy(tp, reg, &val))
+	if (__tg3_readphy(tp, mii_id, reg, &val))
 		val = -EIO;
 
 	spin_unlock_bh(&tp->lock);
@@ -1390,7 +1390,7 @@ static int tg3_mdio_write(struct mii_bus *bp, int mii_id, int reg, u16 val)
 
 	spin_lock_bh(&tp->lock);
 
-	if (tg3_writephy(tp, reg, val))
+	if (__tg3_writephy(tp, mii_id, reg, val))
 		ret = -EIO;
 
 	spin_unlock_bh(&tp->lock);
@@ -1408,7 +1408,7 @@ static void tg3_mdio_config_5785(struct tg3 *tp)
 	u32 val;
 	struct phy_device *phydev;
 
-	phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+	phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 	switch (phydev->drv->phy_id & phydev->drv->phy_id_mask) {
 	case PHY_ID_BCM50610:
 	case PHY_ID_BCM50610M:
@@ -1533,7 +1533,7 @@ static int tg3_mdio_init(struct tg3 *tp)
 	tp->mdio_bus->read     = &tg3_mdio_read;
 	tp->mdio_bus->write    = &tg3_mdio_write;
 	tp->mdio_bus->reset    = &tg3_mdio_reset;
-	tp->mdio_bus->phy_mask = ~(1 << TG3_PHY_MII_ADDR);
+	tp->mdio_bus->phy_mask = ~(1 << tp->phy_addr);
 	tp->mdio_bus->irq      = &tp->mdio_irq[0];
 
 	for (i = 0; i < PHY_MAX_ADDR; i++)
@@ -1554,7 +1554,7 @@ static int tg3_mdio_init(struct tg3 *tp)
 		return i;
 	}
 
-	phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+	phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 
 	if (!phydev || !phydev->drv) {
 		dev_warn(&tp->pdev->dev, "No PHY devices\n");
@@ -1964,7 +1964,7 @@ static void tg3_setup_flow_control(struct tg3 *tp, u32 lcladv, u32 rmtadv)
 	u32 old_tx_mode = tp->tx_mode;
 
 	if (tg3_flag(tp, USE_PHYLIB))
-		autoneg = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR]->autoneg;
+		autoneg = tp->mdio_bus->phy_map[tp->phy_addr]->autoneg;
 	else
 		autoneg = tp->link_config.autoneg;
 
@@ -2000,7 +2000,7 @@ static void tg3_adjust_link(struct net_device *dev)
 	u8 oldflowctrl, linkmesg = 0;
 	u32 mac_mode, lcl_adv, rmt_adv;
 	struct tg3 *tp = netdev_priv(dev);
-	struct phy_device *phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+	struct phy_device *phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 
 	spin_lock_bh(&tp->lock);
 
@@ -2089,7 +2089,7 @@ static int tg3_phy_init(struct tg3 *tp)
 	/* Bring the PHY back to a known state. */
 	tg3_bmcr_reset(tp);
 
-	phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+	phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 
 	/* Attach the MAC to the PHY. */
 	phydev = phy_connect(tp->dev, dev_name(&phydev->dev),
@@ -2116,7 +2116,7 @@ static int tg3_phy_init(struct tg3 *tp)
 				      SUPPORTED_Asym_Pause);
 		break;
 	default:
-		phy_disconnect(tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR]);
+		phy_disconnect(tp->mdio_bus->phy_map[tp->phy_addr]);
 		return -EINVAL;
 	}
 
@@ -2134,7 +2134,7 @@ static void tg3_phy_start(struct tg3 *tp)
 	if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
 		return;
 
-	phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+	phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 
 	if (tp->phy_flags & TG3_PHYFLG_IS_LOW_POWER) {
 		tp->phy_flags &= ~TG3_PHYFLG_IS_LOW_POWER;
@@ -2154,13 +2154,13 @@ static void tg3_phy_stop(struct tg3 *tp)
 	if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
 		return;
 
-	phy_stop(tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR]);
+	phy_stop(tp->mdio_bus->phy_map[tp->phy_addr]);
 }
 
 static void tg3_phy_fini(struct tg3 *tp)
 {
 	if (tp->phy_flags & TG3_PHYFLG_IS_CONNECTED) {
-		phy_disconnect(tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR]);
+		phy_disconnect(tp->mdio_bus->phy_map[tp->phy_addr]);
 		tp->phy_flags &= ~TG3_PHYFLG_IS_CONNECTED;
 	}
 }
@@ -4034,7 +4034,7 @@ static int tg3_power_down_prepare(struct tg3 *tp)
 			struct phy_device *phydev;
 			u32 phyid, advertising;
 
-			phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+			phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 
 			tp->phy_flags |= TG3_PHYFLG_IS_LOW_POWER;
 
@@ -11922,7 +11922,7 @@ static int tg3_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 		struct phy_device *phydev;
 		if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
 			return -EAGAIN;
-		phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+		phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 		return phy_ethtool_gset(phydev, cmd);
 	}
 
@@ -11989,7 +11989,7 @@ static int tg3_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 		struct phy_device *phydev;
 		if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
 			return -EAGAIN;
-		phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+		phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 		return phy_ethtool_sset(phydev, cmd);
 	}
 
@@ -12144,7 +12144,7 @@ static int tg3_nway_reset(struct net_device *dev)
 	if (tg3_flag(tp, USE_PHYLIB)) {
 		if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
 			return -EAGAIN;
-		r = phy_start_aneg(tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR]);
+		r = phy_start_aneg(tp->mdio_bus->phy_map[tp->phy_addr]);
 	} else {
 		u32 bmcr;
 
@@ -12260,7 +12260,7 @@ static int tg3_set_pauseparam(struct net_device *dev, struct ethtool_pauseparam
 		u32 newadv;
 		struct phy_device *phydev;
 
-		phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+		phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 
 		if (!(phydev->supported & SUPPORTED_Pause) ||
 		    (!(phydev->supported & SUPPORTED_Asym_Pause) &&
@@ -13696,7 +13696,7 @@ static int tg3_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 		struct phy_device *phydev;
 		if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
 			return -EAGAIN;
-		phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+		phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 		return phy_mii_ioctl(phydev, ifr, cmd);
 	}
 
@@ -17635,7 +17635,7 @@ static int tg3_init_one(struct pci_dev *pdev,
 
 	if (tp->phy_flags & TG3_PHYFLG_IS_CONNECTED) {
 		struct phy_device *phydev;
-		phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+		phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 		netdev_info(dev,
 			    "attached PHY driver [%s] (mii_bus:phy_addr=%s)\n",
 			    phydev->drv->name, dev_name(&phydev->dev));
-- 
1.7.10.4
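
The line `tp->mdio_bus->phy_mask = ~(1 << tp->phy_addr)` in the patch above builds a mask in which a set bit tells phylib to skip that address during the bus scan, so only the chosen address is probed. A small userspace sketch of that arithmetic follows; the helper names and PHY_MAX_ADDR_SKETCH are hypothetical, and the sketch assumes the usual 32 addresses on an MDIO bus.

```c
#include <assert.h>
#include <stdint.h>

#define PHY_MAX_ADDR_SKETCH 32	/* assumption: 32 addresses on an MDIO bus */

/* Build a mask in which a set bit means "do not probe this address". */
static uint32_t phy_mask_for(unsigned int phy_addr)
{
	assert(phy_addr < PHY_MAX_ADDR_SKETCH);
	return ~(UINT32_C(1) << phy_addr);
}

/* Returns 1 if a bus scan using this mask would probe addr, 0 otherwise. */
static int addr_is_probed(uint32_t mask, unsigned int addr)
{
	return !(mask & (UINT32_C(1) << addr));
}
```

With address 1 the mask is 0xFFFFFFFD and with address 30 it is 0xBFFFFFFF, so a switch at address 30 is only found once phy_addr is no longer fixed at 1.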

* [PATCH 2/3] ssb: provide phy address for Gigabit Ethernet driver
From: Hauke Mehrtens @ 2013-09-28 21:15 UTC (permalink / raw)
  To: davem; +Cc: nsujir, mchan, netdev, Hauke Mehrtens
In-Reply-To: <1380402928-11480-1-git-send-email-hauke@hauke-m.de>

Add a function that gives the Gigabit Ethernet driver connected to ssb
the PHY address it should use.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---
 include/linux/ssb/ssb_driver_gige.h |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/ssb/ssb_driver_gige.h b/include/linux/ssb/ssb_driver_gige.h
index 86a12b0..0688472 100644
--- a/include/linux/ssb/ssb_driver_gige.h
+++ b/include/linux/ssb/ssb_driver_gige.h
@@ -108,6 +108,16 @@ static inline int ssb_gige_get_macaddr(struct pci_dev *pdev, u8 *macaddr)
 	return 0;
 }
 
+/* Get the device phy address */
+static inline int ssb_gige_get_phyaddr(struct pci_dev *pdev)
+{
+	struct ssb_gige *dev = pdev_to_ssb_gige(pdev);
+	if (!dev)
+		return -ENODEV;
+
+	return dev->dev->bus->sprom.et0phyaddr;
+}
+
 extern int ssb_gige_pcibios_plat_dev_init(struct ssb_device *sdev,
 					  struct pci_dev *pdev);
 extern int ssb_gige_map_irq(struct ssb_device *sdev,
@@ -174,6 +184,10 @@ static inline int ssb_gige_get_macaddr(struct pci_dev *pdev, u8 *macaddr)
 {
 	return -ENODEV;
 }
+static inline int ssb_gige_get_phyaddr(struct pci_dev *pdev)
+{
+	return -ENODEV;
+}
 
 #endif /* CONFIG_SSB_DRIVER_GIGE */
 #endif /* LINUX_SSB_DRIVER_GIGE_H_ */
-- 
1.7.10.4

* [PATCH 3/3] tg3: use phylib when robo switch is in use
From: Hauke Mehrtens @ 2013-09-28 21:15 UTC (permalink / raw)
  To: davem; +Cc: nsujir, mchan, netdev, Hauke Mehrtens
In-Reply-To: <1380402928-11480-1-git-send-email-hauke@hauke-m.de>

When a switch is connected as a PHY to the MAC driven by tg3, use
phylib and provide the phy address to tg3 from the sprom.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---
 drivers/net/ethernet/broadcom/tg3.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 853a05e..a17a3c9 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -1513,6 +1513,13 @@ static int tg3_mdio_init(struct tg3 *tp)
 				    TG3_CPMU_PHY_STRAP_IS_SERDES;
 		if (is_serdes)
 			tp->phy_addr += 7;
+	} else if (tg3_flag(tp, IS_SSB_CORE) && tg3_flag(tp, ROBOSWITCH)) {
+		int addr;
+
+		addr = ssb_gige_get_phyaddr(tp->pdev);
+		if (addr < 0)
+			return addr;
+		tp->phy_addr = addr;
 	} else
 		tp->phy_addr = TG3_PHY_MII_ADDR;
 
@@ -17366,8 +17373,10 @@ static int tg3_init_one(struct pci_dev *pdev,
 			tg3_flag_set(tp, FLUSH_POSTED_WRITES);
 		if (ssb_gige_one_dma_at_once(pdev))
 			tg3_flag_set(tp, ONE_DMA_AT_ONCE);
-		if (ssb_gige_have_roboswitch(pdev))
+		if (ssb_gige_have_roboswitch(pdev)) {
+			tg3_flag_set(tp, USE_PHYLIB);
 			tg3_flag_set(tp, ROBOSWITCH);
+		}
 		if (ssb_gige_is_rgmii(pdev))
 			tg3_flag_set(tp, RGMII_MODE);
 	}
-- 
1.7.10.4
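
The branch added to tg3_mdio_init above picks the PHY address from the SPROM when the device is an SSB core with a roboswitch, propagates a negative errno if the lookup fails, and otherwise falls back to the default address. A minimal userspace sketch of that decision, with a hypothetical struct nic_sketch standing in for the driver state and sprom_phyaddr standing in for the result of ssb_gige_get_phyaddr():

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEFAULT_PHY_ADDR_SKETCH 1	/* the series describes the default as address 01 */

/* Hypothetical, trimmed-down device state. */
struct nic_sketch {
	bool is_ssb_core;
	bool has_roboswitch;
	int sprom_phyaddr;	/* negative errno if the SPROM lookup failed */
	int phy_addr;
};

/* Returns 0 on success or a negative errno, like the branch in the patch. */
static int select_phy_addr(struct nic_sketch *nic)
{
	assert(nic != NULL);

	if (nic->is_ssb_core && nic->has_roboswitch) {
		int addr = nic->sprom_phyaddr;

		if (addr < 0)
			return addr;	/* propagate the lookup error */
		nic->phy_addr = addr;
	} else {
		nic->phy_addr = DEFAULT_PHY_ADDR_SKETCH;
	}
	return 0;
}
```

Returning the error instead of silently using the default keeps a misconfigured SPROM visible to the caller, which appears to be the intent of the `if (addr < 0) return addr;` check.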

* Re: IPv6 path MTU discovery broken
From: Hannes Frederic Sowa @ 2013-09-28 21:19 UTC (permalink / raw)
  To: Steinar H. Gunderson; +Cc: netdev, edumazet
In-Reply-To: <20130928205131.GB20124@sesse.net>

On Sat, Sep 28, 2013 at 10:51:31PM +0200, Steinar H. Gunderson wrote:
> On Sat, Sep 28, 2013 at 10:33:18PM +0200, Hannes Frederic Sowa wrote:
> >> Could this be related somehow to the packets coming from 2001:67c:29f4::31,
> >> while the default route is to a link-local address? (An RPF issue?) This used
> >> to work (although it was often flaky for me) in 3.10 and before. I can't
> >> easily bisect, though, as I don't boot this machine too often.
> > This looks like a bug and should definitely get fixed. There should be
> > no RPF issue. May I have a look at your /proc/net/ipv6_route?
> 
> Hi,
> 
> I removed all the “weird” routes, and confirmed it fixed the problem.
> However, upon adding them back again, the problem was still gone
> (despite flushing the route cache).
> 
> This means that the issue has gone back to being intermittent, which is of
> course the worst kind of bug to track down. :-) I'll dump
> /proc/net/ipv6_route and send it to you once I see the bug manifest itself
> again, OK?

Yes, that would be very helpful.

Also, you can try to churn your BGP connection a bit so that the fib
serial numbers get incremented a lot (drop and install new routes). When
tcp_ipv6 processes the ICMP errors, it will drop the routing entry cached
in the socket and install a freshly looked-up one. This is my only
suspect currently. If that helps to reproduce the problem, the suspects
would be the changes in the next-hop selection. Sorry, no other ideas
currently.

Thanks,

  Hannes

* [PATCH] b44: add support for Byte Queue Limits
From: Hauke Mehrtens @ 2013-09-28 21:22 UTC (permalink / raw)
  To: davem; +Cc: zambrano, netdev, Hauke Mehrtens

This makes it possible to use some more advanced queuing
techniques with this driver.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---
 drivers/net/ethernet/broadcom/b44.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c
index 9b017d9..c96930f 100644
--- a/drivers/net/ethernet/broadcom/b44.c
+++ b/drivers/net/ethernet/broadcom/b44.c
@@ -596,6 +596,7 @@ static void b44_timer(unsigned long __opaque)
 static void b44_tx(struct b44 *bp)
 {
 	u32 cur, cons;
+	unsigned bytes_compl = 0, pkts_compl = 0;
 
 	cur  = br32(bp, B44_DMATX_STAT) & DMATX_STAT_CDMASK;
 	cur /= sizeof(struct dma_desc);
@@ -612,9 +613,14 @@ static void b44_tx(struct b44 *bp)
 				 skb->len,
 				 DMA_TO_DEVICE);
 		rp->skb = NULL;
+
+		bytes_compl += skb->len;
+		pkts_compl++;
+
 		dev_kfree_skb_irq(skb);
 	}
 
+	netdev_completed_queue(bp->dev, pkts_compl, bytes_compl);
 	bp->tx_cons = cons;
 	if (netif_queue_stopped(bp->dev) &&
 	    TX_BUFFS_AVAIL(bp) > B44_TX_WAKEUP_THRESH)
@@ -1018,6 +1024,8 @@ static netdev_tx_t b44_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (bp->flags & B44_FLAG_REORDER_BUG)
 		br32(bp, B44_DMATX_PTR);
 
+	netdev_sent_queue(dev, skb->len);
+
 	if (TX_BUFFS_AVAIL(bp) < 1)
 		netif_stop_queue(dev);
 
@@ -1416,6 +1424,8 @@ static void b44_init_hw(struct b44 *bp, int reset_kind)
 
 	val = br32(bp, B44_ENET_CTRL);
 	bw32(bp, B44_ENET_CTRL, (val | ENET_CTRL_ENABLE));
+
+	netdev_reset_queue(bp->dev);
 }
 
 static int b44_open(struct net_device *dev)
-- 
1.7.10.4
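
The three calls added above have to stay paired for Byte Queue Limits to work: netdev_sent_queue when a packet is handed to the hardware, netdev_completed_queue with the totals reaped in the completion path, and netdev_reset_queue when the hardware is reinitialized. The bookkeeping can be modelled roughly as follows; struct bql_sketch and its helpers are hypothetical and cover only the byte accounting, not the dynamic limit calculation the kernel performs.

```c
#include <assert.h>

/* Hypothetical model of one TX queue's byte accounting. */
struct bql_sketch {
	unsigned int bytes_sent;	/* total handed to the hardware */
	unsigned int bytes_completed;	/* total reported as transmitted */
};

/* Bytes queued to the hardware but not yet completed. */
static unsigned int sketch_inflight(const struct bql_sketch *q)
{
	return q->bytes_sent - q->bytes_completed;
}

/* Counterpart of netdev_sent_queue(): called from the xmit path. */
static void sketch_sent(struct bql_sketch *q, unsigned int bytes)
{
	q->bytes_sent += bytes;
}

/* Counterpart of netdev_completed_queue(): called once per completion run. */
static void sketch_completed(struct bql_sketch *q, unsigned int pkts,
			     unsigned int bytes)
{
	(void)pkts;	/* the packet count is not needed for byte accounting */
	assert(bytes <= sketch_inflight(q));	/* never complete more than was sent */
	q->bytes_completed += bytes;
}

/* Counterpart of netdev_reset_queue(): called when the hardware is reset. */
static void sketch_reset(struct bql_sketch *q)
{
	q->bytes_sent = 0;
	q->bytes_completed = 0;
}
```

If a driver reported a packet as sent but never as completed (or the reverse), the in-flight count would drift, which is why the patch accumulates bytes_compl and pkts_compl in the same loop that frees the skbs.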

* [PATCH] bgmac: add support for Byte Queue Limits
From: Hauke Mehrtens @ 2013-09-28 21:23 UTC (permalink / raw)
  To: davem; +Cc: zajec5, netdev, Hauke Mehrtens

This makes it possible to use some more advanced queuing
techniques with this driver.

When multi-queue support is added, some changes to the Byte Queue
Limits handling will be needed.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---
 drivers/net/ethernet/broadcom/bgmac.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c
index 249468f..80d4f52 100644
--- a/drivers/net/ethernet/broadcom/bgmac.c
+++ b/drivers/net/ethernet/broadcom/bgmac.c
@@ -178,6 +178,7 @@ static void bgmac_dma_tx_free(struct bgmac *bgmac, struct bgmac_dma_ring *ring)
 	struct device *dma_dev = bgmac->core->dma_dev;
 	int empty_slot;
 	bool freed = false;
+	unsigned bytes_compl = 0, pkts_compl = 0;
 
 	/* The last slot that hardware didn't consume yet */
 	empty_slot = bgmac_read(bgmac, ring->mmio_base + BGMAC_DMA_TX_STATUS);
@@ -195,6 +196,9 @@ static void bgmac_dma_tx_free(struct bgmac *bgmac, struct bgmac_dma_ring *ring)
 					 slot->skb->len, DMA_TO_DEVICE);
 			slot->dma_addr = 0;
 
+			bytes_compl += slot->skb->len;
+			pkts_compl++;
+
 			/* Free memory! :) */
 			dev_kfree_skb(slot->skb);
 			slot->skb = NULL;
@@ -208,6 +212,8 @@ static void bgmac_dma_tx_free(struct bgmac *bgmac, struct bgmac_dma_ring *ring)
 		freed = true;
 	}
 
+	netdev_completed_queue(bgmac->net_dev, pkts_compl, bytes_compl);
+
 	if (freed && netif_queue_stopped(bgmac->net_dev))
 		netif_wake_queue(bgmac->net_dev);
 }
@@ -988,6 +994,8 @@ static void bgmac_chip_reset(struct bgmac *bgmac)
 	bgmac_miiconfig(bgmac);
 	bgmac_phy_init(bgmac);
 
+	netdev_reset_queue(bgmac->net_dev);
+
 	bgmac->int_status = 0;
 }
 
@@ -1198,6 +1206,8 @@ static netdev_tx_t bgmac_start_xmit(struct sk_buff *skb,
 	struct bgmac *bgmac = netdev_priv(net_dev);
 	struct bgmac_dma_ring *ring;
 
+	netdev_sent_queue(net_dev, skb->len);
+
 	/* No QOS support yet */
 	ring = &bgmac->tx_ring[0];
 	return bgmac_dma_tx_add(bgmac, ring, skb);
-- 
1.7.10.4

* Re: [PATCH RESEND] iproute2: GRE over IPv6 tunnel support.
From: Hannes Frederic Sowa @ 2013-09-28 21:27 UTC (permalink / raw)
  To: Dmitry Kozlov; +Cc: Templin, Fred L, Stephen Hemminger, netdev
In-Reply-To: <20130928113251.75738a49@comp1>

On Sat, Sep 28, 2013 at 11:32:51AM +0400, Dmitry Kozlov wrote:
> GRE over IPv6 tunnel support.
> 
> Signed-off-by: Dmitry Kozlov <xeb@mail.ru>

Dmitry, thanks a lot for resending!

One small nit:

lib/ll_types.c would need a new entry in ll_type_n2a for ARPHRD_IP6GRE
so that ip link ls can pretty-print the link type.
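
For readers unfamiliar with that function: ll_type_n2a maps a numeric link type to a printable name. The following is only a sketch of such a lookup, not the iproute2 source. The table layout, the helper name ll_type_name, the "[number]" fallback format, and the printed name "gre6" are assumptions; the numeric values are the ARPHRD_* constants from the kernel's if_arp.h.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table entry: numeric link type and its printable name. */
struct ll_type_entry {
	int type;
	const char *name;
};

static const struct ll_type_entry ll_types_sketch[] = {
	{ 1,   "ether" },	/* ARPHRD_ETHER */
	{ 772, "loopback" },	/* ARPHRD_LOOPBACK */
	{ 778, "gre" },		/* ARPHRD_IPGRE */
	{ 823, "gre6" },	/* ARPHRD_IP6GRE; the printed name is an assumption */
};

/* Return the name for a link type, or "[number]" written into buf if unknown. */
static const char *ll_type_name(int type, char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL && len > 0);
	for (i = 0; i < sizeof(ll_types_sketch) / sizeof(ll_types_sketch[0]); i++) {
		if (ll_types_sketch[i].type == type)
			return ll_types_sketch[i].name;
	}
	snprintf(buf, len, "[%d]", type);
	return buf;
}
```

Without an entry for the new type, the lookup falls through to the numeric fallback, which is the cosmetic gap the nit points out.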

* Re: [PATCH] net: Delay default_device_exit_batch until no devices are unregistering v2
From: David Miller @ 2013-09-28 22:14 UTC (permalink / raw)
  To: ebiederm; +Cc: fruggeri, edumazet, jiri, alexander.h.duyck, amwang, netdev
In-Reply-To: <87a9j26eca.fsf_-_@xmission.com>

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Mon, 23 Sep 2013 21:19:49 -0700

> 
> There is currently no serialization between network namespaces exiting and
> network devices exiting, as the final part of netdev_run_todo does not
> happen under the rtnl_lock.  This is compounded by the fact that the
> only list of devices unregistering in netdev_run_todo is local to the
> netdev_run_todo.
> 
> This lack of serialization in extreme cases results in network devices
> unregistering in netdev_run_todo after the loopback device of their
> network namespace has been freed (making dst_ifdown unsafe), and after
> their network namespace has exited (making the NETDEV_UNREGISTER,
> and NETDEV_UNREGISTER_FINAL callbacks unsafe).
> 
> Add the missing serialization by a per network namespace count of how
> many network devices are unregistering and having a wait queue that is
> woken up whenever the count is decreased.  The count and wait queue
> allow default_device_exit_batch to wait until all of the unregistration
> activity for a network namespace has finished before proceeding to
> unregister the loopback device and then allowing the network namespace
> to exit.
> 
> Only a single global wait queue is used because there is a single global
> lock, and there is a single waiter, per network namespace wait queues
> would be a waste of resources.
> 
> The per network namespace count of unregistering devices gives a
> progress guarantee because the number of network devices unregistering
> in an exiting network namespace must ultimately drop to zero (assuming
> network device unregistration completes).
> 
> The basic logic remains the same as in v1.  This patch is now half
> comment and half rtnl_lock_unregistering, an expanded version of
> wait_event that performs no extra work in the common case where no network
> devices are unregistering when we get to default_device_exit_batch.
> 
> Reported-by: Francesco Ruggeri <fruggeri@aristanetworks.com>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Applied, thanks for following up on this Eric.

^ permalink raw reply

* Re: [PATCH] net: net_secret should not depend on TCP
From: David Miller @ 2013-09-28 22:20 UTC (permalink / raw)
  To: hannes; +Cc: eric.dumazet, therbert, netdev, jesse.brandeburg
In-Reply-To: <20130924235147.GA9335@order.stressinduktion.org>

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Wed, 25 Sep 2013 01:51:47 +0200

> On Tue, Sep 24, 2013 at 06:19:57AM -0700, Eric Dumazet wrote:
>> From: Eric Dumazet <edumazet@google.com>
>> 
>> A host might need net_secret[] and never open a single socket. 
>> 
>> Problem added in commit aebda156a570782
>> ("net: defer net_secret[] initialization")
>> 
>> Based on prior patch from Hannes Frederic Sowa.
>> 
>> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
>> Signed-off-by: Eric Dumazet <edumazet@google.com>
> 
> FWIW, I reviewed that the loop indeed cannot block and gave the patch a
> quick test drive. Also, the ipv6_hash_secret is only used by the ehash
> functions. So this is a better patch than mine:
> 
> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH] b44: add support for Byte Queue Limits
From: Eric Dumazet @ 2013-09-28 22:14 UTC (permalink / raw)
  To: Hauke Mehrtens; +Cc: davem, zambrano, netdev
In-Reply-To: <1380403338-12056-1-git-send-email-hauke@hauke-m.de>

On Sat, 2013-09-28 at 23:22 +0200, Hauke Mehrtens wrote:
> This makes it possible to use some more advanced queuing
> techniques with this driver.
> 
> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
> ---
>  drivers/net/ethernet/broadcom/b44.c |   10 ++++++++++
>  1 file changed, 10 insertions(+)

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* Re: [PATCH net-next v3 0/2] ipv4: per-datagram IP_TOS and IP_TTL via sendmsg()
From: David Miller @ 2013-09-28 22:22 UTC (permalink / raw)
  To: ffusco; +Cc: netdev
In-Reply-To: <cover.1379944641.git.ffusco@redhat.com>

From: Francesco Fusco <ffusco@redhat.com>
Date: Tue, 24 Sep 2013 15:43:07 +0200

> There is no way to set the IP_TOS field on a per-packet basis in IPv4, while
> IPv6 has such a mechanism. Therefore one has to fall back to the setsockopt()
> in case of IPv4. 
> 
> Using the existing per-socket option is not convenient particularly in the
> situations where multiple threads have to use the same socket data requiring
> per-thread TOS values. In fact this would involve calling setsockopt() before
> sendmsg() every time.

Series applied, thank you.
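
A minimal sketch of what the per-datagram interface described above could look
like from userspace, assuming a kernel with this series applied that accepts
an IP_TOS control message carrying an int. The helper name send_with_tos is
hypothetical; the cmsg macros and sendmsg() are the standard socket API.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Send one datagram with a TOS value attached as ancillary data, instead
 * of calling setsockopt(IP_TOS) on the shared socket before each send.
 * Returns the sendmsg() result: bytes sent, or -1 with errno set.
 */
static ssize_t send_with_tos(int fd, const struct sockaddr_in *dst,
			     const void *data, size_t len, int tos)
{
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* forces suitable alignment */
	} control;
	struct iovec iov = { .iov_base = (void *)data, .iov_len = len };
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&msg, 0, sizeof(msg));
	memset(&control, 0, sizeof(control));
	msg.msg_name = (void *)dst;
	msg.msg_namelen = sizeof(*dst);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = control.buf;
	msg.msg_controllen = sizeof(control.buf);

	/* One control message: level IPPROTO_IP, type IP_TOS, int payload. */
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = IPPROTO_IP;
	cmsg->cmsg_type = IP_TOS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &tos, sizeof(tos));

	return sendmsg(fd, &msg, 0);
}
```

Because the TOS value travels with each sendmsg() call, threads sharing one
socket do not have to serialize a setsockopt()/sendmsg() pair.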

^ permalink raw reply

* Re: [PATCH net-next 0/6] bnx2x: enhancements patch series
From: David Miller @ 2013-09-28 22:24 UTC (permalink / raw)
  To: yuvalmin; +Cc: netdev, ariele, eilong
In-Reply-To: <1380347172-16670-1-git-send-email-yuvalmin@broadcom.com>

From: "Yuval Mintz" <yuvalmin@broadcom.com>
Date: Sat, 28 Sep 2013 08:46:06 +0300

> This patch series contains several modifications to the driver in all areas.
> It also includes a patch which might be considered a bug fix, but since
> it fixes a benign issue we're posting it in this series.
> 
> Please consider applying these patches to `net-next'.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH] bgmac: add support for Byte Queue Limits
From: Eric Dumazet @ 2013-09-28 22:18 UTC (permalink / raw)
  To: Hauke Mehrtens; +Cc: davem, zajec5, netdev
In-Reply-To: <1380403393-12113-1-git-send-email-hauke@hauke-m.de>

On Sat, 2013-09-28 at 23:23 +0200, Hauke Mehrtens wrote:
> This makes it possible to use some more advanced queuing
> techniques with this driver.

> @@ -1198,6 +1206,8 @@ static netdev_tx_t bgmac_start_xmit(struct sk_buff *skb,
>  	struct bgmac *bgmac = netdev_priv(net_dev);
>  	struct bgmac_dma_ring *ring;
>  
> +	netdev_sent_queue(net_dev, skb->len);
> +
>  	/* No QOS support yet */
>  	ring = &bgmac->tx_ring[0];
>  	return bgmac_dma_tx_add(bgmac, ring, skb);


Well, no.

You must call netdev_sent_queue() only if the packet is really queued.

^ permalink raw reply

* Re: [PATCH net-next 0/2] bonding: fix 3ad slave (de)init
From: David Miller @ 2013-09-28 22:28 UTC (permalink / raw)
  To: vfalico; +Cc: netdev, fubar, andy
In-Reply-To: <1380287458-3488-1-git-send-email-vfalico@redhat.com>

From: Veaceslav Falico <vfalico@redhat.com>
Date: Fri, 27 Sep 2013 15:10:56 +0200

> After 1f718f0f4f97145f4072d2d72dcf85069ca7226d ("bonding: populate
> neighbour's private on enslave") the (de)linking of slaves in
> bond_enslave/bond_release_one happens in the correct places - after we've
> completely initialized the slave (for bond_enslave) and before we've even
> begun to de-init the slave (for bond_release_one, respectively).
> 
> This was done to prevent any RCU readers from seeing the half-initialized slave
> (because the RCU readers aren't blocked by bond->lock or rtnl_lock
> usually).
> 
> However, 802.3ad logic, in several places, relied on the fact that the
> slave is still linked to the bond.
> 
> Fix it by correctly handling these cases - we shouldn't rely on the slave
> being linked before it is fully initialized or, respectively, on the slave
> still being linked while it's being removed.
> 
> CC: Jay Vosburgh <fubar@us.ibm.com>
> CC: Andy Gospodarek <andy@greyhouse.net>
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

Series applied.

^ permalink raw reply

* Re: [PATCH net-next 0/9] bonding: remove bond_next_slave()
From: David Miller @ 2013-09-28 22:29 UTC (permalink / raw)
  To: vfalico; +Cc: netdev, nikolay, bhutchings, fubar, andy
In-Reply-To: <1380291125-5671-1-git-send-email-vfalico@redhat.com>

From: Veaceslav Falico <vfalico@redhat.com>
Date: Fri, 27 Sep 2013 16:11:56 +0200

> (on top of "[PATCH net-next 0/2] bonding: fix 3ad slave (de)init" - the
> patchset is essential)
> 
> Hi,
> 
> As Ben Hutchings and Nikolay Aleksandrov correctly noted -
> bond_next_slave() already is not O(1), but rather O(n), so it's only useful
> for one-off operations and shouldn't be used widely - for example, for list
> traversal, which will take O(n^2) time, which will be disastrous for any
> hot path with a large number of slaves.
> 
> Currently, bond_next_slave() is used several times in 802.3ad and for
> seq_read - bond_info_seq_next().
> 
> The 802.3ad part uses it only in constructs like:
> 
> for (X = __get_first_X(); X; X = __get_next_X()) {
> 
> where __get_next_X() uses bond_next_slave().
> 
> This for loop can (and should) actually be replaced by the standard
> 
> bond_for_each_slave(bond, slave, iter) {
> 	X = __get_X_by_slave(slave);
> 
> it's faster, easier to read, debug and maintain. Also, removes a lot of
> code lines.
> 
> After replacing it, the only user of bond_next_slave() is
> bond_info_seq_next() - which can, actually, implement it by itself, and not
> call another function.
> 
> So, that way, we can completely remove the bond_next_slave(), cleanup and
> optimize a bit.
> 
> p.s. the 802.3ad code is horrible, both style-wise and from the logical
> point of view - so I've decided to *not* change anything except what
> this patchset is intended to provide. The cleanup and some refactoring should
> be done in another patchset, which I've begun to work on already.
> 
> Thank you!
> 
> CC: Jay Vosburgh <fubar@us.ibm.com>
> CC: Andy Gospodarek <andy@greyhouse.net>
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

Also applied, thanks.
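
To make the cost argument in the cover letter concrete, here is a toy
userspace sketch. It does not use the bonding driver's data structures; the
toy_* names and the array-based list are assumptions for illustration only.
It counts the list positions visited when a traversal is built on an O(n)
"next" helper versus a direct iteration.

```c
#include <stddef.h>

/* Toy stand-in for a slave; not the bonding driver's struct slave. */
struct toy_slave {
	int id;
};

static unsigned long toy_steps;	/* counts list positions visited */

/*
 * Hypothetical O(n) "next" helper: it has to search for cur from the
 * start of the array before it can return the following element.
 */
static struct toy_slave *toy_next_slave(struct toy_slave *all, size_t n,
					struct toy_slave *cur)
{
	size_t i;

	for (i = 0; i < n; i++) {
		toy_steps++;
		if (&all[i] == cur)
			return i + 1 < n ? &all[i + 1] : NULL;
	}
	return NULL;
}

/* Traversal built on the O(n) helper: n(n+1)/2 steps in total. */
static unsigned long toy_walk_with_next(struct toy_slave *all, size_t n)
{
	struct toy_slave *s;

	toy_steps = 0;
	for (s = n ? &all[0] : NULL; s; s = toy_next_slave(all, n, s))
		;
	return toy_steps;
}

/* Direct iteration, the shape a for-each macro gives: n steps. */
static unsigned long toy_walk_direct(struct toy_slave *all, size_t n)
{
	size_t i;

	(void)all;
	toy_steps = 0;
	for (i = 0; i < n; i++)
		toy_steps++;
	return toy_steps;
}
```

With 100 elements the first walk visits 5050 positions and the second 100,
which is the quadratic-versus-linear gap the cover letter refers to.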

^ permalink raw reply

* Re: [PATCH v2] declance: Remove `incompatible pointer type' warnings
From: David Miller @ 2013-09-28 22:33 UTC (permalink / raw)
  To: macro; +Cc: sergei.shtylyov, netdev
In-Reply-To: <alpine.LFD.2.03.1309222113300.16797@linux-mips.org>

From: "Maciej W. Rozycki" <macro@linux-mips.org>
Date: Sun, 22 Sep 2013 21:19:01 +0100 (BST)

> Revert damage caused by 43d620c82985b19008d87a437b4cf83f356264f7 
> [drivers/net: Remove casts of void *]:
> 
> .../declance.c: In function 'cp_to_buf':
> .../declance.c:347: warning: assignment from incompatible pointer type
> .../declance.c:348: warning: assignment from incompatible pointer type
> .../declance.c: In function 'cp_from_buf':
> .../declance.c:406: warning: assignment from incompatible pointer type
> .../declance.c:407: warning: assignment from incompatible pointer type
> 
> Also add a `const' qualifier where applicable and adjust formatting.
> 
> Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
> ---
>> Applied to net-next
> 
>  Thanks, by Sergei's request please use this version instead that has the 
> reference to the original commit updated (no change to the patch itself).

I can't undo the original commit once it has been installed in my tree.

^ permalink raw reply

* Re: [PATCH v2 net-next] net: introduce SO_MAX_PACING_RATE
From: David Miller @ 2013-09-28 22:36 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, sesse, mtk.manpages
In-Reply-To: <1380036052.3165.71.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 24 Sep 2013 08:20:52 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> As mentioned in commit afe4fd062416b ("pkt_sched: fq: Fair Queue packet
> scheduler"), this patch adds a new socket option.
> 
> SO_MAX_PACING_RATE offers the application the ability to cap the
> rate computed by transport layer. Value is in bytes per second.
> 
> u32 val = 1000000;
> setsockopt(sockfd, SOL_SOCKET, SO_MAX_PACING_RATE, &val, sizeof(val));
> 
> To be effectively paced, a flow must use FQ packet scheduler.
> 
> Note that a packet scheduler takes into account the headers for its
> computations. The effective payload rate depends on MSS and retransmits
> if any.
> 
> I chose to make this pacing rate a SOL_SOCKET option instead of a
> TCP one because this can be used by other protocols.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.

^ permalink raw reply

* [PATCH v2] bgmac: add support for Byte Queue Limits
From: Hauke Mehrtens @ 2013-09-28 22:35 UTC (permalink / raw)
  To: davem; +Cc: zajec5, eric.dumazet, netdev, Hauke Mehrtens

This makes it possible to use some more advanced queuing
techniques with this driver.

When multi-queue support is added, some changes to the Byte Queue
Limits handling will be needed.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---
 drivers/net/ethernet/broadcom/bgmac.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c
index 249468f..e5519f1 100644
--- a/drivers/net/ethernet/broadcom/bgmac.c
+++ b/drivers/net/ethernet/broadcom/bgmac.c
@@ -164,6 +164,8 @@ static netdev_tx_t bgmac_dma_tx_add(struct bgmac *bgmac,
 	if (--free_slots == 1)
 		netif_stop_queue(net_dev);
 
+	netdev_sent_queue(net_dev, skb->len);
+
 	return NETDEV_TX_OK;
 
 err_stop_drop:
@@ -178,6 +180,7 @@ static void bgmac_dma_tx_free(struct bgmac *bgmac, struct bgmac_dma_ring *ring)
 	struct device *dma_dev = bgmac->core->dma_dev;
 	int empty_slot;
 	bool freed = false;
+	unsigned bytes_compl = 0, pkts_compl = 0;
 
 	/* The last slot that hardware didn't consume yet */
 	empty_slot = bgmac_read(bgmac, ring->mmio_base + BGMAC_DMA_TX_STATUS);
@@ -195,6 +198,9 @@ static void bgmac_dma_tx_free(struct bgmac *bgmac, struct bgmac_dma_ring *ring)
 					 slot->skb->len, DMA_TO_DEVICE);
 			slot->dma_addr = 0;
 
+			bytes_compl += slot->skb->len;
+			pkts_compl++;
+
 			/* Free memory! :) */
 			dev_kfree_skb(slot->skb);
 			slot->skb = NULL;
@@ -208,6 +214,8 @@ static void bgmac_dma_tx_free(struct bgmac *bgmac, struct bgmac_dma_ring *ring)
 		freed = true;
 	}
 
+	netdev_completed_queue(bgmac->net_dev, pkts_compl, bytes_compl);
+
 	if (freed && netif_queue_stopped(bgmac->net_dev))
 		netif_wake_queue(bgmac->net_dev);
 }
@@ -988,6 +996,8 @@ static void bgmac_chip_reset(struct bgmac *bgmac)
 	bgmac_miiconfig(bgmac);
 	bgmac_phy_init(bgmac);
 
+	netdev_reset_queue(bgmac->net_dev);
+
 	bgmac->int_status = 0;
 }
 
-- 
1.7.10.4

^ permalink raw reply related

* Re: [PATCH v2] bgmac: add support for Byte Queue Limits
From: Eric Dumazet @ 2013-09-28 23:03 UTC (permalink / raw)
  To: Hauke Mehrtens; +Cc: davem, zajec5, netdev
In-Reply-To: <1380407726-20691-1-git-send-email-hauke@hauke-m.de>

On Sun, 2013-09-29 at 00:35 +0200, Hauke Mehrtens wrote:
> This makes it possible to use some more advanced queuing
> techniques with this driver.
> 
> When multi-queue support is added, some changes to the Byte Queue
> Limits handling will be needed.
> 
> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
> ---
>  drivers/net/ethernet/broadcom/bgmac.c |   10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c
> index 249468f..e5519f1 100644
> --- a/drivers/net/ethernet/broadcom/bgmac.c
> +++ b/drivers/net/ethernet/broadcom/bgmac.c
> @@ -164,6 +164,8 @@ static netdev_tx_t bgmac_dma_tx_add(struct bgmac *bgmac,
>  	if (--free_slots == 1)
>  		netif_stop_queue(net_dev);
>  
> +	netdev_sent_queue(net_dev, skb->len);
> +
>  	return NETDEV_TX_OK;
>  

Unfortunately, skb->len is unsafe: the hardware might have already sent the
packet, and TX completion might have freed it.

You must cache skb->len in a variable before allowing hardware to send
the frame.

^ permalink raw reply

* Re: [PATCH v2] bgmac: add support for Byte Queue Limits
From: Eric Dumazet @ 2013-09-28 23:22 UTC (permalink / raw)
  To: Hauke Mehrtens; +Cc: davem, zajec5, netdev
In-Reply-To: <1380409400.3596.9.camel@edumazet-glaptop.roam.corp.google.com>

On Sat, 2013-09-28 at 16:03 -0700, Eric Dumazet wrote:

> Unfortunately, skb->len is unsafe: the hardware might have already sent the
> packet, and TX completion might have freed it.
> 
> You must cache skb->len in a variable before allowing hardware to send
> the frame.
> 

More exactly you must call netdev_sent_queue() _before_ giving skb to
hardware, or netdev_completed_queue() might panic.
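
The ordering requirement can be sketched with a toy userspace model. This is
not kernel code: the toy_* names are hypothetical, and the model only mimics
the idea that completion accounting has a sanity check against bytes that
were never reported as sent.

```c
#include <stdbool.h>

/* Toy userspace model of BQL byte accounting; not kernel code. */
struct toy_bql {
	unsigned int queued;	/* bytes reported via toy_sent_queue() */
	unsigned int completed;	/* bytes reported via toy_completed_queue() */
};

static void toy_sent_queue(struct toy_bql *q, unsigned int bytes)
{
	q->queued += bytes;
}

/*
 * Returns false where the real accounting would trip its sanity check:
 * more bytes completed than were ever reported as sent.
 */
static bool toy_completed_queue(struct toy_bql *q, unsigned int bytes)
{
	if (bytes > q->queued - q->completed)
		return false;
	q->completed += bytes;
	return true;
}

/* Correct order: cache the length, account, then hand off to hardware. */
static bool toy_xmit_account_first(struct toy_bql *q, unsigned int skb_len)
{
	unsigned int len = skb_len;	/* cached before the hand-off */

	toy_sent_queue(q, len);
	/* hand-off to hardware happens here; completion may run at once */
	return toy_completed_queue(q, len);
}

/* Racy order: the completion runs before the bytes were accounted. */
static bool toy_xmit_account_late(struct toy_bql *q, unsigned int skb_len)
{
	/* completion wins the race against the late accounting */
	bool ok = toy_completed_queue(q, skb_len);

	/* too late; a real driver would also be reading a freed skb here */
	toy_sent_queue(q, skb_len);
	return ok;
}
```

In the model, toy_xmit_account_first() always succeeds, while
toy_xmit_account_late() on an empty queue reports the failure that
corresponds to the panic mentioned above.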

^ permalink raw reply

* Re: [PATCH v2] declance: Remove `incompatible pointer type' warnings
From: Maciej W. Rozycki @ 2013-09-29  0:09 UTC (permalink / raw)
  To: David Miller; +Cc: sergei.shtylyov, netdev
In-Reply-To: <20130928.153316.1766304011894422920.davem@davemloft.net>

On Sat, 28 Sep 2013, David Miller wrote:

> >  Thanks, by Sergei's request please use this version instead that has the 
> > reference to the original commit updated (no change to the patch itself).
> 
> I can't undo the original commit once it has been installed in my tree.

 No worries, and good to know, thanks.

  Maciej

^ permalink raw reply

* Re: [PATCH v5] IPv6 NAT: Do not drop DNATed 6to4/6rd packets
From: Joe Perches @ 2013-09-29  0:24 UTC (permalink / raw)
  To: David Miller; +Cc: hannes, catab, netdev, yoshfuji
In-Reply-To: <20130928.155750.1130089685321379918.davem@davemloft.net>

On Sat, 2013-09-28 at 15:57 -0400, David Miller wrote:
> Applied, but Catalin please strictly refer to changes in the following
> precise format:
> 
>         commit $SHA1_ID ("Commit message header line text")
> 
> Because SHA1_IDs are ambiguous, especially when the change in question
> is backported into various -stable branches.

There are now enough commits that using only
8-character SHA1 IDs generates some collisions, so
please use 12 characters or so of the SHA1.

^ permalink raw reply

