Netdev List
* Re: [PATCH net-next-2.6] bridge: add __rcu annotations
From: Eric Dumazet @ 2010-11-13 17:58 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, netdev
In-Reply-To: <20101113093545.6fe9c077@nehalam>

On Saturday, 13 November 2010 at 09:35 -0800, Stephen Hemminger wrote:
> On Sat, 13 Nov 2010 09:15:28 +0100
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 578debb..ffbd177 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -996,7 +996,10 @@ struct net_device {
> >  #endif
> >  
> >  	rx_handler_func_t	*rx_handler;
> > -	void			*rx_handler_data;
> > +	union {
> > +		void				*rx_handler_data;
> > +		struct net_bridge_port __rcu	*br_port_rcu;
> > +	};
> >  
> >  	struct netdev_queue __rcu *ingress_queue;
> 
> I don't like making the generic hook typed again.
> We don't do this for other callbacks, timers, workqueues, ...
> Why is it necessary for RCU annotation?
> 

Because rcu_dereference() needs the pointed-to type for its __CHECKER__/sparse checks:

#define __rcu_dereference_check(p, c, space) \
        ({ \
                typeof(*p) *_________p1 = (typeof(*p)*__force )ACCESS_ONCE(p); \
                rcu_lockdep_assert(c); \
                rcu_dereference_sparse(p, space); \
                smp_read_barrier_depends(); \
                ((typeof(*p) __force __kernel *)(_________p1)); \
        })

So using a "void *ptr" is not an option.

It's also cleaner to use

rcu_dereference(dev->br_port_rcu)

instead of 

(struct net_bridge_port *)rcu_dereference(dev->rx_handler_data)
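
To illustrate the point about types, here is a minimal userspace sketch
in GNU C. It is not kernel code: deref_once(), struct port and
struct device are hypothetical stand-ins, and the macro only mimics the
typeof(*p) part of __rcu_dereference_check(), without barriers, lockdep
or sparse checks. It shows that a typed member needs no cast at the call
site, while the void * member does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for net_bridge_port and net_device. */
struct port {
	int id;
};

struct device {
	union {
		void        *handler_data;	/* generic, untyped view */
		struct port *port;		/* typed view of the same storage */
	};
};

/*
 * Simplified accessor: one volatile load whose result keeps the pointee
 * type of its argument, similar to the typeof(*p) in the kernel macro.
 */
#define deref_once(p) \
	((__typeof__(*(p)) *)(*(__typeof__(p) volatile *)&(p)))

/* Typed member: the result is already a struct port *, no cast needed. */
static int port_id_typed(struct device *dev)
{
	assert(dev->port != NULL);
	return deref_once(dev->port)->id;
}

/* Untyped member: every caller has to cast by hand. */
static int port_id_cast(struct device *dev)
{
	assert(dev->handler_data != NULL);
	return ((struct port *)dev->handler_data)->id;
}
```

Both helpers read the same storage; only the typed one lets the compiler
(and, in the kernel, sparse) see what the pointer refers to.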





* Re: [PATCH net-next-2.6] bridge: add __rcu annotations
From: Stephen Hemminger @ 2010-11-13 17:35 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1289636128.2743.15.camel@edumazet-laptop>

On Sat, 13 Nov 2010 09:15:28 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 578debb..ffbd177 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -996,7 +996,10 @@ struct net_device {
>  #endif
>  
>  	rx_handler_func_t	*rx_handler;
> -	void			*rx_handler_data;
> +	union {
> +		void				*rx_handler_data;
> +		struct net_bridge_port __rcu	*br_port_rcu;
> +	};
>  
>  	struct netdev_queue __rcu *ingress_queue;

I don't like making the generic hook typed again.
We don't do this for other callbacks, timers, workqueues, ...
Why is it necessary for RCU annotation?

-- 


* Re: ethtool maintenance
From: Ben Hutchings @ 2010-11-13 14:30 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: NetDev, David Miller, Peter Martuccelli
In-Reply-To: <4CDE502D.6000506@garzik.org>

On Sat, 2010-11-13 at 03:45 -0500, Jeff Garzik wrote:
> So, a recent emergency surgery has really set me back, work-wise. 
> ethtool [the userspace utility] 2.6.36 is still not out, and personally 
> it remains a third or fourth priority.
> 
> While it's likely that I could get back to ethtool's patch queue next 
> week, it continues to be low man on the totem pole.  Seems only fair to 
> see if anyone else is interested in maintaining it.
> 
> I emailed Ben Hutchings privately about this, but haven't heard back, so 
> I thought I'd go ahead and email the list.
> 
> Anyone interested?

I am interested, but will need to clear it with my boss before making
such a commitment.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* [PATCH update] firewire: net: throttle TX queue before running out of tlabels
From: Stefan Richter @ 2010-11-13 12:16 UTC (permalink / raw)
  To: linux1394-devel; +Cc: linux-kernel, netdev
In-Reply-To: <tkrat.39c164e4c52e2fc8@s5r6.in-berlin.de>

This prevents firewire-net from submitting write requests in rapid
succession until they fail because all 64 transaction labels are used up
by unfinished split transactions.  The netif_stop_queue/netif_wake_queue
API is used for this purpose.

Without this stop/wake mechanism, datagrams were simply lost whenever
the tlabel pool was exhausted.  In addition, tlabel exhaustion by
firewire-net prevented other, unrelated outbound transactions from being
initiated.

The high watermark is set to considerably less than 64 (I chose 8)
because peers which run current Linux firewire-ohci are still easily
saturated by this (i.e. some datagrams are dropped with ack-busy-*
events), depending on the hardware at transmitter and receiver side.

I did not see any change in throughput that was discernible from
the usual measuring noise.  To do:  Revisit the choice of queue depth
once firewire-ohci's AR DMA has been improved.

I wonder what a good net_device.tx_queue_len value is.  I just set it
to the same value as the chosen watermark for now.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
Update:  Stricter version with an early NETDEV_TX_BUSY return if the
.ndo_start_xmit method is called while the driver is stopping (or has
stopped) the transmit queue.  Thus there can never really be more than
FWNET_MAX_QUEUED_DATAGRAMS pending outbound 1394 transactions.
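
The stop/wake accounting in this patch can be read as a small state
machine with hysteresis.  Below is a minimal userspace sketch of the
counter arithmetic only; struct txq, txq_enqueue(), txq_complete() and
txq_pending() are hypothetical names, and the `stopped` flag merely stands
in for netif_stop_queue()/netif_wake_queue().  No locking is modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_QUEUED	8		/* high watermark: stop the queue here */
#define MIN_QUEUED	2		/* low watermark: wake the queue here */
#define QUEUE_STOPPED	MAX_QUEUED	/* offset added while stopped */

struct txq {
	int  queued;	/* pending datagrams, plus QUEUE_STOPPED while stopped */
	bool stopped;	/* stand-in for the netif queue state */
};

/* Returns false (think NETDEV_TX_BUSY) while the queue is stopped. */
static bool txq_enqueue(struct txq *q)
{
	if (q->queued > MAX_QUEUED)
		return false;

	if (++q->queued == MAX_QUEUED) {
		/* Encode "stopped" in the counter itself. */
		q->queued += QUEUE_STOPPED;
		q->stopped = true;
	}
	return true;
}

/* Called once per finished datagram. */
static void txq_complete(struct txq *q)
{
	assert(q->queued > 0);

	if (--q->queued == MIN_QUEUED + QUEUE_STOPPED) {
		/* Drained to the low watermark: remove the offset and wake. */
		q->queued -= QUEUE_STOPPED;
		q->stopped = false;
	}
}

/* Number of datagrams actually pending, without the offset. */
static int txq_pending(const struct txq *q)
{
	return q->stopped ? q->queued - QUEUE_STOPPED : q->queued;
}
```

Folding the stopped state into the counter means the wake-up test is a
single comparison, and a queue that never reached the high watermark can
never match the wake-up condition by accident.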

 drivers/firewire/net.c |   53 ++++++++++++++++++++++++++++++-----------
 1 file changed, 39 insertions(+), 14 deletions(-)

Index: b/drivers/firewire/net.c
===================================================================
--- a/drivers/firewire/net.c
+++ b/drivers/firewire/net.c
@@ -28,8 +28,15 @@
 #include <asm/unaligned.h>
 #include <net/arp.h>
 
-#define FWNET_MAX_FRAGMENTS	25	/* arbitrary limit */
-#define FWNET_ISO_PAGE_COUNT	(PAGE_SIZE < 16 * 1024 ? 4 : 2)
+/* rx limits */
+#define FWNET_MAX_FRAGMENTS		25 /* arbitrary limit */
+#define FWNET_ISO_PAGE_COUNT		(PAGE_SIZE < 16*1024 ? 4 : 2)
+
+/* tx limits */
+#define FWNET_MAX_QUEUED_DATAGRAMS	8 /* should keep AT DMA busy enough */
+#define FWNET_MIN_QUEUED_DATAGRAMS	2
+#define FWNET_TX_QUEUE_STOPPED		FWNET_MAX_QUEUED_DATAGRAMS
+#define FWNET_TX_QUEUE_LEN		FWNET_MAX_QUEUED_DATAGRAMS /* ? */
 
 #define IEEE1394_BROADCAST_CHANNEL	31
 #define IEEE1394_ALL_NODES		(0xffc0 | 0x003f)
@@ -892,6 +899,16 @@ static void fwnet_free_ptask(struct fwne
 	kmem_cache_free(fwnet_packet_task_cache, ptask);
 }
 
+/* Caller must hold dev->lock. */
+static void dec_queued_datagrams(struct fwnet_device *dev)
+{
+	if (--dev->queued_datagrams ==
+			FWNET_MIN_QUEUED_DATAGRAMS + FWNET_TX_QUEUE_STOPPED) {
+		dev->queued_datagrams -= FWNET_TX_QUEUE_STOPPED;
+		netif_wake_queue(dev->netdev);
+	}
+}
+
 static int fwnet_send_packet(struct fwnet_packet_task *ptask);
 
 static void fwnet_transmit_packet_done(struct fwnet_packet_task *ptask)
@@ -908,7 +925,7 @@ static void fwnet_transmit_packet_done(s
 	/* Check whether we or the networking TX soft-IRQ is last user. */
 	free = (ptask->outstanding_pkts == 0 && ptask->enqueued);
 	if (free)
-		dev->queued_datagrams--;
+		dec_queued_datagrams(dev);
 
 	if (ptask->outstanding_pkts == 0) {
 		dev->netdev->stats.tx_packets++;
@@ -979,7 +996,7 @@ static void fwnet_transmit_packet_failed
 	/* Check whether we or the networking TX soft-IRQ is last user. */
 	free = ptask->enqueued;
 	if (free)
-		dev->queued_datagrams--;
+		dec_queued_datagrams(dev);
 
 	dev->netdev->stats.tx_dropped++;
 	dev->netdev->stats.tx_errors++;
@@ -1064,7 +1081,7 @@ static int fwnet_send_packet(struct fwne
 		if (!free)
 			ptask->enqueued = true;
 		else
-			dev->queued_datagrams--;
+			dec_queued_datagrams(dev);
 
 		spin_unlock_irqrestore(&dev->lock, flags);
 
@@ -1083,7 +1100,7 @@ static int fwnet_send_packet(struct fwne
 	if (!free)
 		ptask->enqueued = true;
 	else
-		dev->queued_datagrams--;
+		dec_queued_datagrams(dev);
 
 	spin_unlock_irqrestore(&dev->lock, flags);
 
@@ -1249,6 +1266,14 @@ static netdev_tx_t fwnet_tx(struct sk_bu
 	struct fwnet_peer *peer;
 	unsigned long flags;
 
+	spin_lock_irqsave(&dev->lock, flags);
+
+	if (dev->queued_datagrams > FWNET_MAX_QUEUED_DATAGRAMS) {
+		spin_unlock_irqrestore(&dev->lock, flags);
+
+		return NETDEV_TX_BUSY;
+	}
+
 	ptask = kmem_cache_alloc(fwnet_packet_task_cache, GFP_ATOMIC);
 	if (ptask == NULL)
 		goto fail;
@@ -1267,9 +1292,6 @@ static netdev_tx_t fwnet_tx(struct sk_bu
 	proto = hdr_buf.h_proto;
 	dg_size = skb->len;
 
-	/* serialize access to peer, including peer->datagram_label */
-	spin_lock_irqsave(&dev->lock, flags);
-
 	/*
 	 * Set the transmission type for the packet.  ARP packets and IP
 	 * broadcast packets are sent via GASP.
@@ -1291,7 +1313,7 @@ static netdev_tx_t fwnet_tx(struct sk_bu
 
 		peer = fwnet_peer_find_by_guid(dev, be64_to_cpu(guid));
 		if (!peer || peer->fifo == FWNET_NO_FIFO_ADDR)
-			goto fail_unlock;
+			goto fail;
 
 		generation         = peer->generation;
 		dest_node          = peer->node_id;
@@ -1345,7 +1367,10 @@ static netdev_tx_t fwnet_tx(struct sk_bu
 		max_payload += RFC2374_FRAG_HDR_SIZE;
 	}
 
-	dev->queued_datagrams++;
+	if (++dev->queued_datagrams == FWNET_MAX_QUEUED_DATAGRAMS) {
+		dev->queued_datagrams += FWNET_TX_QUEUE_STOPPED;
+		netif_stop_queue(dev->netdev);
+	}
 
 	spin_unlock_irqrestore(&dev->lock, flags);
 
@@ -1356,9 +1381,9 @@ static netdev_tx_t fwnet_tx(struct sk_bu
 
 	return NETDEV_TX_OK;
 
- fail_unlock:
-	spin_unlock_irqrestore(&dev->lock, flags);
  fail:
+	spin_unlock_irqrestore(&dev->lock, flags);
+
 	if (ptask)
 		kmem_cache_free(fwnet_packet_task_cache, ptask);
 
@@ -1415,7 +1440,7 @@ static void fwnet_init_dev(struct net_de
 	net->addr_len		= FWNET_ALEN;
 	net->hard_header_len	= FWNET_HLEN;
 	net->type		= ARPHRD_IEEE1394;
-	net->tx_queue_len	= 10;
+	net->tx_queue_len	= FWNET_TX_QUEUE_LEN;
 	SET_ETHTOOL_OPS(net, &fwnet_ethtool_ops);
 }
 

-- 
Stefan Richter
-=====-==-=- =-== -==-=
http://arcgraph.de/sr/



* ethtool maintenance
From: Jeff Garzik @ 2010-11-13  8:45 UTC (permalink / raw)
  To: NetDev; +Cc: David Miller, Peter Martuccelli, Ben Hutchings


So, a recent emergency surgery has really set me back, work-wise. 
ethtool [the userspace utility] 2.6.36 is still not out, and personally 
it remains a third or fourth priority.

While it's likely that I could get back to ethtool's patch queue next 
week, it continues to be low man on the totem pole.  Seems only fair to 
see if anyone else is interested in maintaining it.

I emailed Ben Hutchings privately about this, but haven't heard back, so 
I thought I'd go ahead and email the list.

Anyone interested?

	Jeff






* [RFC] pull linux-2.6 into net-next-2.6 ?
From: Eric Dumazet @ 2010-11-13  8:21 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

Hi David

Andrew pushed the patch introducing atomic_inc_not_zero_hint() to Linus
(commit 3f9d35b9514da675).

Would it be possible to get it into net-next-2.6 so that I can start using
it in the network stack?

Thanks!




* [PATCH net-next-2.6] bridge: add __rcu annotations
From: Eric Dumazet @ 2010-11-13  8:15 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Stephen Hemminger

Add modern __rcu annotations to the bridge code, to reduce sparse errors
and make the code self-documenting.
(CONFIG_SPARSE_RCU_POINTER=y)

Use an anonymous union in net_device to give net_dev->br_port_rcu a
proper type, which allows a cleaner br_port_get_rcu() definition.

br_port_get() is renamed to br_port_get_rtnl() to make clear that RTNL
is held.

Note: Add a br_should_route_hook_t typedef; this is the only way we can
get a clean RCU implementation for a function pointer.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
---
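
The renamed accessors state which protection the caller relies on.  As a
rough userspace illustration of that idea only: the sketch below uses
hypothetical names throughout (struct port, struct device, rtnl_held,
rcu_depth, port_get_rtnl(), port_get_rcu()), and plain assert() stands in
for the lockdep and sparse checks that rtnl_dereference() and
rcu_dereference() provide in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for net_bridge_port and net_device. */
struct port {
	int id;
};

struct device {
	struct port *port;
};

static bool rtnl_held;	/* stand-in for "RTNL is locked" */
static int  rcu_depth;	/* stand-in for read-side critical section nesting */

/* Update-side accessor: only valid while the RTNL stand-in is held. */
static struct port *port_get_rtnl(struct device *dev)
{
	assert(rtnl_held);
	return dev->port;
}

/* Read-side accessor: only valid inside a read-side section. */
static struct port *port_get_rcu(struct device *dev)
{
	assert(rcu_depth > 0);
	return dev->port;
}
```

With one accessor per context, a call made from the wrong context fails
loudly instead of silently reading a pointer that may be going away.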
 include/linux/if_bridge.h             |    3 
 include/linux/netdevice.h             |    5 +
 net/bridge/br.c                       |    5 -
 net/bridge/br_if.c                    |    2 
 net/bridge/br_input.c                 |    4 -
 net/bridge/br_multicast.c             |   78 +++++++++++++++---------
 net/bridge/br_netlink.c               |    4 -
 net/bridge/br_notify.c                |    4 -
 net/bridge/br_private.h               |   11 +--
 net/bridge/netfilter/ebtable_broute.c |    2 
 10 files changed, 72 insertions(+), 46 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 0d241a5..dc813e9 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -102,7 +102,8 @@ struct __fdb_entry {
 #include <linux/netdevice.h>
 
 extern void brioctl_set(int (*ioctl_hook)(struct net *, unsigned int, void __user *));
-extern int (*br_should_route_hook)(struct sk_buff *skb);
+typedef int (*br_should_route_hook_t)(struct sk_buff *skb);
+extern br_should_route_hook_t __rcu *br_should_route_hook;
 
 #endif
 
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 578debb..ffbd177 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -996,7 +996,10 @@ struct net_device {
 #endif
 
 	rx_handler_func_t	*rx_handler;
-	void			*rx_handler_data;
+	union {
+		void				*rx_handler_data;
+		struct net_bridge_port __rcu	*br_port_rcu;
+	};
 
 	struct netdev_queue __rcu *ingress_queue;
 
diff --git a/net/bridge/br.c b/net/bridge/br.c
index c8436fa..9fad125 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -22,7 +22,8 @@
 
 #include "br_private.h"
 
-int (*br_should_route_hook)(struct sk_buff *skb);
+br_should_route_hook_t __rcu *br_should_route_hook __read_mostly;
+EXPORT_SYMBOL(br_should_route_hook);
 
 static const struct stp_proto br_stp_proto = {
 	.rcv	= br_stp_rcv,
@@ -102,8 +103,6 @@ static void __exit br_deinit(void)
 	br_fdb_fini();
 }
 
-EXPORT_SYMBOL(br_should_route_hook);
-
 module_init(br_init)
 module_exit(br_deinit)
 MODULE_LICENSE("GPL");
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 89ad25a..3a611d2 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -478,7 +478,7 @@ int br_del_if(struct net_bridge *br, struct net_device *dev)
 	if (!br_port_exists(dev))
 		return -EINVAL;
 
-	p = br_port_get(dev);
+	p = br_port_get_rtnl(dev);
 	if (p->br != br)
 		return -EINVAL;
 
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 25207a1..948c921 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -139,7 +139,7 @@ struct sk_buff *br_handle_frame(struct sk_buff *skb)
 {
 	struct net_bridge_port *p;
 	const unsigned char *dest = eth_hdr(skb)->h_dest;
-	int (*rhook)(struct sk_buff *skb);
+	br_should_route_hook_t *rhook;
 
 	if (unlikely(skb->pkt_type == PACKET_LOOPBACK))
 		return skb;
@@ -174,7 +174,7 @@ forward:
 	case BR_STATE_FORWARDING:
 		rhook = rcu_dereference(br_should_route_hook);
 		if (rhook != NULL) {
-			if (rhook(skb))
+			if ((*rhook)(skb))
 				return skb;
 			dest = eth_hdr(skb)->h_dest;
 		}
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index eb5b256..326e599 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -33,6 +33,9 @@
 
 #include "br_private.h"
 
+#define mlock_dereference(X, br) \
+	rcu_dereference_protected(X, lockdep_is_held(&br->multicast_lock))
+
 #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
 static inline int ipv6_is_local_multicast(const struct in6_addr *addr)
 {
@@ -135,7 +138,7 @@ static struct net_bridge_mdb_entry *br_mdb_ip6_get(
 struct net_bridge_mdb_entry *br_mdb_get(struct net_bridge *br,
 					struct sk_buff *skb)
 {
-	struct net_bridge_mdb_htable *mdb = br->mdb;
+	struct net_bridge_mdb_htable *mdb = rcu_dereference(br->mdb);
 	struct br_ip ip;
 
 	if (br->multicast_disabled)
@@ -235,7 +238,8 @@ static void br_multicast_group_expired(unsigned long data)
 	if (mp->ports)
 		goto out;
 
-	mdb = br->mdb;
+	mdb = mlock_dereference(br->mdb, br);
+
 	hlist_del_rcu(&mp->hlist[mdb->ver]);
 	mdb->size--;
 
@@ -249,16 +253,20 @@ out:
 static void br_multicast_del_pg(struct net_bridge *br,
 				struct net_bridge_port_group *pg)
 {
-	struct net_bridge_mdb_htable *mdb = br->mdb;
+	struct net_bridge_mdb_htable *mdb;
 	struct net_bridge_mdb_entry *mp;
 	struct net_bridge_port_group *p;
-	struct net_bridge_port_group **pp;
+	struct net_bridge_port_group __rcu **pp;
+
+	mdb = mlock_dereference(br->mdb, br);
 
 	mp = br_mdb_ip_get(mdb, &pg->addr);
 	if (WARN_ON(!mp))
 		return;
 
-	for (pp = &mp->ports; (p = *pp); pp = &p->next) {
+	for (pp = &mp->ports;
+	     (p = mlock_dereference(*pp, br)) != NULL;
+	     pp = &p->next) {
 		if (p != pg)
 			continue;
 
@@ -294,10 +302,10 @@ out:
 	spin_unlock(&br->multicast_lock);
 }
 
-static int br_mdb_rehash(struct net_bridge_mdb_htable **mdbp, int max,
+static int br_mdb_rehash(struct net_bridge_mdb_htable __rcu **mdbp, int max,
 			 int elasticity)
 {
-	struct net_bridge_mdb_htable *old = *mdbp;
+	struct net_bridge_mdb_htable *old = rcu_dereference_protected(*mdbp, 1);
 	struct net_bridge_mdb_htable *mdb;
 	int err;
 
@@ -569,7 +577,7 @@ static struct net_bridge_mdb_entry *br_multicast_get_group(
 	struct net_bridge *br, struct net_bridge_port *port,
 	struct br_ip *group, int hash)
 {
-	struct net_bridge_mdb_htable *mdb = br->mdb;
+	struct net_bridge_mdb_htable *mdb;
 	struct net_bridge_mdb_entry *mp;
 	struct hlist_node *p;
 	unsigned count = 0;
@@ -577,6 +585,7 @@ static struct net_bridge_mdb_entry *br_multicast_get_group(
 	int elasticity;
 	int err;
 
+	mdb = rcu_dereference_protected(br->mdb, 1);
 	hlist_for_each_entry(mp, p, &mdb->mhash[hash], hlist[mdb->ver]) {
 		count++;
 		if (unlikely(br_ip_equal(group, &mp->addr)))
@@ -642,10 +651,11 @@ static struct net_bridge_mdb_entry *br_multicast_new_group(
 	struct net_bridge *br, struct net_bridge_port *port,
 	struct br_ip *group)
 {
-	struct net_bridge_mdb_htable *mdb = br->mdb;
+	struct net_bridge_mdb_htable *mdb;
 	struct net_bridge_mdb_entry *mp;
 	int hash;
 
+	mdb = rcu_dereference_protected(br->mdb, 1);
 	if (!mdb) {
 		if (br_mdb_rehash(&br->mdb, BR_HASH_SIZE, 0))
 			return NULL;
@@ -660,7 +670,7 @@ static struct net_bridge_mdb_entry *br_multicast_new_group(
 
 	case -EAGAIN:
 rehash:
-		mdb = br->mdb;
+		mdb = rcu_dereference_protected(br->mdb, 1);
 		hash = br_ip_hash(mdb, group);
 		break;
 
@@ -692,7 +702,7 @@ static int br_multicast_add_group(struct net_bridge *br,
 {
 	struct net_bridge_mdb_entry *mp;
 	struct net_bridge_port_group *p;
-	struct net_bridge_port_group **pp;
+	struct net_bridge_port_group __rcu **pp;
 	unsigned long now = jiffies;
 	int err;
 
@@ -712,7 +722,9 @@ static int br_multicast_add_group(struct net_bridge *br,
 		goto out;
 	}
 
-	for (pp = &mp->ports; (p = *pp); pp = &p->next) {
+	for (pp = &mp->ports;
+	     (p = mlock_dereference(*pp, br)) != NULL;
+	     pp = &p->next) {
 		if (p->port == port)
 			goto found;
 		if ((unsigned long)p->port < (unsigned long)port)
@@ -1106,7 +1118,7 @@ static int br_ip4_multicast_query(struct net_bridge *br,
 	struct net_bridge_mdb_entry *mp;
 	struct igmpv3_query *ih3;
 	struct net_bridge_port_group *p;
-	struct net_bridge_port_group **pp;
+	struct net_bridge_port_group __rcu **pp;
 	unsigned long max_delay;
 	unsigned long now = jiffies;
 	__be32 group;
@@ -1145,7 +1157,7 @@ static int br_ip4_multicast_query(struct net_bridge *br,
 	if (!group)
 		goto out;
 
-	mp = br_mdb_ip4_get(br->mdb, group);
+	mp = br_mdb_ip4_get(mlock_dereference(br->mdb, br), group);
 	if (!mp)
 		goto out;
 
@@ -1157,7 +1169,9 @@ static int br_ip4_multicast_query(struct net_bridge *br,
 	     try_to_del_timer_sync(&mp->timer) >= 0))
 		mod_timer(&mp->timer, now + max_delay);
 
-	for (pp = &mp->ports; (p = *pp); pp = &p->next) {
+	for (pp = &mp->ports;
+	     (p = mlock_dereference(*pp, br)) != NULL;
+	     pp = &p->next) {
 		if (timer_pending(&p->timer) ?
 		    time_after(p->timer.expires, now + max_delay) :
 		    try_to_del_timer_sync(&p->timer) >= 0)
@@ -1178,7 +1192,8 @@ static int br_ip6_multicast_query(struct net_bridge *br,
 	struct mld_msg *mld = (struct mld_msg *) icmp6_hdr(skb);
 	struct net_bridge_mdb_entry *mp;
 	struct mld2_query *mld2q;
-	struct net_bridge_port_group *p, **pp;
+	struct net_bridge_port_group *p;
+	struct net_bridge_port_group __rcu **pp;
 	unsigned long max_delay;
 	unsigned long now = jiffies;
 	struct in6_addr *group = NULL;
@@ -1214,7 +1229,7 @@ static int br_ip6_multicast_query(struct net_bridge *br,
 	if (!group)
 		goto out;
 
-	mp = br_mdb_ip6_get(br->mdb, group);
+	mp = br_mdb_ip6_get(mlock_dereference(br->mdb, br), group);
 	if (!mp)
 		goto out;
 
@@ -1225,7 +1240,9 @@ static int br_ip6_multicast_query(struct net_bridge *br,
 	     try_to_del_timer_sync(&mp->timer) >= 0))
 		mod_timer(&mp->timer, now + max_delay);
 
-	for (pp = &mp->ports; (p = *pp); pp = &p->next) {
+	for (pp = &mp->ports;
+	     (p = mlock_dereference(*pp, br)) != NULL;
+	     pp = &p->next) {
 		if (timer_pending(&p->timer) ?
 		    time_after(p->timer.expires, now + max_delay) :
 		    try_to_del_timer_sync(&p->timer) >= 0)
@@ -1254,7 +1271,7 @@ static void br_multicast_leave_group(struct net_bridge *br,
 	    timer_pending(&br->multicast_querier_timer))
 		goto out;
 
-	mdb = br->mdb;
+	mdb = mlock_dereference(br->mdb, br);
 	mp = br_mdb_ip_get(mdb, group);
 	if (!mp)
 		goto out;
@@ -1277,7 +1294,9 @@ static void br_multicast_leave_group(struct net_bridge *br,
 		goto out;
 	}
 
-	for (p = mp->ports; p; p = p->next) {
+	for (p = mlock_dereference(mp->ports, br);
+	     p != NULL;
+	     p = mlock_dereference(p->next, br)) {
 		if (p->port != port)
 			continue;
 
@@ -1625,7 +1644,7 @@ void br_multicast_stop(struct net_bridge *br)
 	del_timer_sync(&br->multicast_query_timer);
 
 	spin_lock_bh(&br->multicast_lock);
-	mdb = br->mdb;
+	mdb = mlock_dereference(br->mdb, br);
 	if (!mdb)
 		goto out;
 
@@ -1729,6 +1748,7 @@ int br_multicast_toggle(struct net_bridge *br, unsigned long val)
 {
 	struct net_bridge_port *port;
 	int err = 0;
+	struct net_bridge_mdb_htable *mdb;
 
 	spin_lock(&br->multicast_lock);
 	if (br->multicast_disabled == !val)
@@ -1741,15 +1761,16 @@ int br_multicast_toggle(struct net_bridge *br, unsigned long val)
 	if (!netif_running(br->dev))
 		goto unlock;
 
-	if (br->mdb) {
-		if (br->mdb->old) {
+	mdb = mlock_dereference(br->mdb, br);
+	if (mdb) {
+		if (mdb->old) {
 			err = -EEXIST;
 rollback:
 			br->multicast_disabled = !!val;
 			goto unlock;
 		}
 
-		err = br_mdb_rehash(&br->mdb, br->mdb->max,
+		err = br_mdb_rehash(&br->mdb, mdb->max,
 				    br->hash_elasticity);
 		if (err)
 			goto rollback;
@@ -1774,6 +1795,7 @@ int br_multicast_set_hash_max(struct net_bridge *br, unsigned long val)
 {
 	int err = -ENOENT;
 	u32 old;
+	struct net_bridge_mdb_htable *mdb;
 
 	spin_lock(&br->multicast_lock);
 	if (!netif_running(br->dev))
@@ -1782,7 +1804,9 @@ int br_multicast_set_hash_max(struct net_bridge *br, unsigned long val)
 	err = -EINVAL;
 	if (!is_power_of_2(val))
 		goto unlock;
-	if (br->mdb && val < br->mdb->size)
+
+	mdb = mlock_dereference(br->mdb, br);
+	if (mdb && val < mdb->size)
 		goto unlock;
 
 	err = 0;
@@ -1790,8 +1814,8 @@ int br_multicast_set_hash_max(struct net_bridge *br, unsigned long val)
 	old = br->hash_max;
 	br->hash_max = val;
 
-	if (br->mdb) {
-		if (br->mdb->old) {
+	if (mdb) {
+		if (mdb->old) {
 			err = -EEXIST;
 rollback:
 			br->hash_max = old;
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 4a6a378..b301dfc 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -123,7 +123,7 @@ static int br_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
 		if (!br_port_exists(dev) || idx < cb->args[0])
 			goto skip;
 
-		if (br_fill_ifinfo(skb, br_port_get(dev),
+		if (br_fill_ifinfo(skb, br_port_get_rtnl(dev),
 				   NETLINK_CB(cb->skb).pid,
 				   cb->nlh->nlmsg_seq, RTM_NEWLINK,
 				   NLM_F_MULTI) < 0)
@@ -171,7 +171,7 @@ static int br_rtm_setlink(struct sk_buff *skb,  struct nlmsghdr *nlh, void *arg)
 
 	if (!br_port_exists(dev))
 		return -EINVAL;
-	p = br_port_get(dev);
+	p = br_port_get_rtnl(dev);
 
 	/* if kernel STP is running, don't allow changes */
 	if (p->br->stp_enabled == BR_KERNEL_STP)
diff --git a/net/bridge/br_notify.c b/net/bridge/br_notify.c
index 404d4e1..e72e49e 100644
--- a/net/bridge/br_notify.c
+++ b/net/bridge/br_notify.c
@@ -32,7 +32,7 @@ struct notifier_block br_device_notifier = {
 static int br_device_event(struct notifier_block *unused, unsigned long event, void *ptr)
 {
 	struct net_device *dev = ptr;
-	struct net_bridge_port *p = br_port_get(dev);
+	struct net_bridge_port *p;
 	struct net_bridge *br;
 	int err;
 
@@ -40,7 +40,7 @@ static int br_device_event(struct notifier_block *unused, unsigned long event, v
 	if (!br_port_exists(dev))
 		return NOTIFY_DONE;
 
-	p = br_port_get(dev);
+	p = br_port_get_rtnl(dev);
 	br = p->br;
 
 	switch (event) {
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 75c90ed..32235d4 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -72,7 +72,7 @@ struct net_bridge_fdb_entry
 
 struct net_bridge_port_group {
 	struct net_bridge_port		*port;
-	struct net_bridge_port_group	*next;
+	struct net_bridge_port_group __rcu	*next;
 	struct hlist_node		mglist;
 	struct rcu_head			rcu;
 	struct timer_list		timer;
@@ -86,7 +86,7 @@ struct net_bridge_mdb_entry
 	struct hlist_node		hlist[2];
 	struct hlist_node		mglist;
 	struct net_bridge		*br;
-	struct net_bridge_port_group	*ports;
+	struct net_bridge_port_group __rcu	*ports;
 	struct rcu_head			rcu;
 	struct timer_list		timer;
 	struct timer_list		query_timer;
@@ -151,9 +151,8 @@ struct net_bridge_port
 #endif
 };
 
-#define br_port_get_rcu(dev) \
-	((struct net_bridge_port *) rcu_dereference(dev->rx_handler_data))
-#define br_port_get(dev) ((struct net_bridge_port *) dev->rx_handler_data)
+#define br_port_get_rcu(dev) rcu_dereference(dev->br_port_rcu)
+#define br_port_get_rtnl(dev) rtnl_dereference(dev->br_port_rcu)
 #define br_port_exists(dev) (dev->priv_flags & IFF_BRIDGE_PORT)
 
 struct br_cpu_netstats {
@@ -227,7 +226,7 @@ struct net_bridge
 	unsigned long			multicast_startup_query_interval;
 
 	spinlock_t			multicast_lock;
-	struct net_bridge_mdb_htable	*mdb;
+	struct net_bridge_mdb_htable __rcu	*mdb;
 	struct hlist_head		router_list;
 	struct hlist_head		mglist;
 
diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c
index ae3f106..4ce9d8a 100644
--- a/net/bridge/netfilter/ebtable_broute.c
+++ b/net/bridge/netfilter/ebtable_broute.c
@@ -87,7 +87,7 @@ static int __init ebtable_broute_init(void)
 	if (ret < 0)
 		return ret;
 	/* see br_input.c */
-	rcu_assign_pointer(br_should_route_hook, ebt_broute);
+	rcu_assign_pointer(br_should_route_hook, (br_should_route_hook_t *)ebt_broute);
 	return 0;
 }
 




* Re: [PATCH net-next-2.6 V2] igmp: RCU conversion of in_dev->mc_list
From: Américo Wang @ 2010-11-13  6:44 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Américo Wang, Cypher Wu, linux-kernel, netdev, David Miller
In-Reply-To: <1289576810.3185.261.camel@edumazet-laptop>

>
>Here is an updated version.
>
>[PATCH net-next-2.6 V2] igmp: RCU conversion of in_dev->mc_list
>
>in_dev->mc_list is protected by one rwlock (in_dev->mc_list_lock).
>
>This can easily be converted to a RCU protection.
>
>Writers hold RTNL, so mc_list_lock is removed, not replaced by a
>spinlock.

Ah, this saves a lot of work.

>
>Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>Cc: Cypher Wu <cypher.w@gmail.com>
>Cc: Américo Wang <xiyou.wangcong@gmail.com>

I just took a quick look; it looks good to me.

Thanks!


* Re: Kernel rwlock design, Multicore and IGMP
From: Américo Wang @ 2010-11-13  6:35 UTC (permalink / raw)
  To: Cypher Wu
  Cc: Américo Wang, Yong Zhang, Eric Dumazet, linux-kernel, netdev
In-Reply-To: <AANLkTi=0Z1zVqCvYwKiQ8cUZoeD_rLgZ+C07GuFgA7E7@mail.gmail.com>

On Fri, Nov 12, 2010 at 07:06:47PM +0800, Cypher Wu wrote:
>>
>> Note, on Tile, it uses a little different algorithm.
>>
>
>It seems that rwlock on x86 and Tile behave differently. x86 uses
>RW_LOCK_BIAS: read_lock() tests whether the lock is 0, and if so it
>has to spin; otherwise it decrements the lock. write_lock() first
>checks whether the lock is RW_LOCK_BIAS, and if so sets the lock to 0
>and continues; otherwise it spins.
>I'm not very familiar with the x86 architecture, but the code seems to
>work that way.

No, they should be the same; sorry, I made a mistake in my earlier reply.

Although the Tile implementation uses shifts while x86 uses inc/dec,
the idea is the same: either writers use the higher bits and readers use
the lower bits, or vice versa.

-- 
Live like a child, think like the god.
 


* Re: Kernel rwlock design, Multicore and IGMP
From: Américo Wang @ 2010-11-13  6:28 UTC (permalink / raw)
  To: Yong Zhang
  Cc: Américo Wang, Eric Dumazet, Cypher Wu, linux-kernel, netdev
In-Reply-To: <20101112130017.GA9752@zhy>

On Fri, Nov 12, 2010 at 09:00:17PM +0800, Yong Zhang wrote:
>On Fri, Nov 12, 2010 at 05:18:18PM +0800, Américo Wang wrote:
>> On Fri, Nov 12, 2010 at 05:09:45PM +0800, Yong Zhang wrote:
>> >On Fri, Nov 12, 2010 at 4:19 PM, Américo Wang <xiyou.wangcong@gmail.com> wrote:
>> >> On Fri, Nov 12, 2010 at 08:27:54AM +0100, Eric Dumazet wrote:
>> >>>On Friday, 12 November 2010 at 15:13 +0800, Américo Wang wrote:
>> >>>> On Fri, Nov 12, 2010 at 11:32:59AM +0800, Cypher Wu wrote:
>> >>>> >On Thu, Nov 11, 2010 at 11:23 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> >>>> >> On Thursday, 11 November 2010 at 21:49 +0800, Cypher Wu wrote:
>> >>>> >>
>> >>>> >> Hi
>> >>>> >>
>> >>>> >> CC netdev, since you ask questions about network stuff _and_ rwlock
>> >>>> >>
>> >>>> >>
>> >>>> >>> I'm using TILEPro, and its kernel rwlock is a little different from
>> >>>> >>> other platforms. It gives priority to the write lock: once a write
>> >>>> >>> lock is attempted, it blocks subsequent read locks even if the read
>> >>>> >>> lock is held by others. Its code can be read in Linux Kernel 2.6.36 in
>> >>>> >>> arch/tile/lib/spinlock_32.c.
>> >>>> >>
>> >>>> >> This seems a bug to me.
>> >>>> >>
>> >>>> >> read_lock() can be nested. We used such a scheme in the past in iptables
>> >>>> >> (it can re-enter itself),
>> >>>> >> and we used instead a spinlock(), but with many discussions with lkml
>> >>>> >> and Linus himself if I remember well.
>> >>>> >>
>> >>>> >Whether read_lock() can be nested doesn't seem to be the problem, since
>> >>>> >rwlock doesn't have an 'owner'. The question is whether we should give
>> >>>> >write_lock() priority over read_lock(), since a large number of
>> >>>> >read_lock()s will otherwise starve write_lock().
>> >>>> >We should work out a well-defined behavior so that every
>> >>>> >platform-dependent raw_rwlock is designed under that principle.
>> >>>>
>> >>>
>> >>>AFAIK, Lockdep allows read_lock() to be nested.
>> >>>
>> >>>> It is a known weakness of rwlock, it is designed like that. :)
>> >>>>
>> >>>
>> >>>Agreed.
>> >>>
>> >>
>> >> Just for record, both Tile and X86 implement rwlock with a write-bias,
>> >> this somewhat reduces the write-starvation problem.
>> >
>> >Are you sure(on x86)?
>> >
>> >It seems that we never realize writer-bias rwlock.
>> >
>> 
>> Try
>> 
>> % grep RW_LOCK_BIAS -nr arch/x86
>> 
>> *And* read the code to see how it works. :)
>
>If read_lock()/write_lock() fails, the subtracted value (1 for
>read_lock() and RW_LOCK_BIAS for write_lock()) is added back.
>So readers and writers contend on the same lock fairly.
>
>And the RW_LOCK_BIAS-based rwlock is a variant of the signed-test
>rwlock, so it works the same way as a highest-bit-set mode
>rwlock.
>
>Seems you were fooled by its name (RW_LOCK_BIAS). :)

Ah, no, I made a mistake: I thought the initial value
of the rwlock was something like 0, but clearly it is RW_LOCK_BIAS.
Yeah, then there is certainly no bias toward writers, and x86
must be using almost the same algorithm as Tile.
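
For reference, the counting scheme described above can be sketched in
user space roughly as follows. This is only a minimal sketch with C11
atomics, not the kernel's code: the sketch_* names are made up for
illustration, the RW_LOCK_BIAS value 0x01000000 is an assumption, and
only the trylock paths are shown (a blocking lock would spin and retry
after undoing the subtraction).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed value for this sketch; the point is only that it is large. */
#define RW_LOCK_BIAS 0x01000000

/* Hypothetical type: the counter starts at RW_LOCK_BIAS, not at 0. */
typedef struct {
	atomic_int count;
} sketch_rwlock_t;

static void sketch_rwlock_init(sketch_rwlock_t *l)
{
	atomic_init(&l->count, RW_LOCK_BIAS);
}

/* Reader: subtract 1; success if the result did not go negative. */
static bool sketch_read_trylock(sketch_rwlock_t *l)
{
	if (atomic_fetch_sub(&l->count, 1) - 1 >= 0)
		return true;
	atomic_fetch_add(&l->count, 1);	/* failed: add the 1 back */
	return false;
}

/* Writer: subtract the whole bias; success only if the result is 0,
 * i.e. there was no reader and no other writer. */
static bool sketch_write_trylock(sketch_rwlock_t *l)
{
	if (atomic_fetch_sub(&l->count, RW_LOCK_BIAS) - RW_LOCK_BIAS == 0)
		return true;
	atomic_fetch_add(&l->count, RW_LOCK_BIAS);	/* failed: add it back */
	return false;
}

static void sketch_read_unlock(sketch_rwlock_t *l)
{
	atomic_fetch_add(&l->count, 1);
}

static void sketch_write_unlock(sketch_rwlock_t *l)
{
	atomic_fetch_add(&l->count, RW_LOCK_BIAS);
}
```

Since a failed attempt always restores the counter, a waiting writer
leaves no trace that would hold back new readers, which is why the
name RW_LOCK_BIAS does not imply any preference for writers.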

-- 
Live like a child, think like the god.
 

^ permalink raw reply

* [PATCH] net: use the macros defined for the members of flowi
From: Changli Gao @ 2010-11-13  4:43 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Changli Gao

Use the macros defined for the members of flowi to clean the code up.
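
As an illustration of why the conversion is mechanical, here is a
simplified, self-contained sketch. struct sketch_flowi is a cut-down
stand-in and not the real struct flowi, and the make_old()/make_new()
helpers are hypothetical; the macro-to-member mapping is assumed to be
the one the hunks in this patch imply (fl4_dst for nl_u.ip4_u.daddr
and so on). Because each macro expands to a nested member path, it can
be used directly as a C99 designator.

```c
#include <stdint.h>

/* Cut-down stand-in for struct flowi, for illustration only. */
struct sketch_flowi {
	int oif;
	union {
		struct {
			uint32_t daddr;
			uint32_t saddr;
			uint8_t  tos;
			uint8_t  scope;
		} ip4_u;
	} nl_u;
	uint8_t proto;
	union {
		struct {
			uint16_t sport;
			uint16_t dport;
		} ports;
	} uli_u;
};

/* Accessor macros: each one names a nested member path. */
#define fl4_dst		nl_u.ip4_u.daddr
#define fl4_src		nl_u.ip4_u.saddr
#define fl4_tos		nl_u.ip4_u.tos
#define fl_ip_sport	uli_u.ports.sport
#define fl_ip_dport	uli_u.ports.dport

/* Old style: spell out every level of nesting. */
static struct sketch_flowi make_old(uint32_t dst, uint32_t src, uint8_t tos,
				    uint16_t sport, uint16_t dport)
{
	struct sketch_flowi fl = { .oif = 1,
				   .nl_u = { .ip4_u = { .daddr = dst,
							.saddr = src,
							.tos = tos } },
				   .proto = 6,
				   .uli_u = { .ports = { .sport = sport,
							 .dport = dport } } };
	return fl;
}

/* New style: ".fl4_dst = dst" expands to ".nl_u.ip4_u.daddr = dst". */
static struct sketch_flowi make_new(uint32_t dst, uint32_t src, uint8_t tos,
				    uint16_t sport, uint16_t dport)
{
	struct sketch_flowi fl = { .oif = 1,
				   .fl4_dst = dst,
				   .fl4_src = src,
				   .fl4_tos = tos,
				   .proto = 6,
				   .fl_ip_sport = sport,
				   .fl_ip_dport = dport };
	return fl;
}
```

Members that are not named stay zero-initialized in both forms, so
explicit zero initializers such as ".saddr = 0" can simply be dropped.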

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
---
 include/net/route.h             |   12 ++++-------
 net/atm/clip.c                  |    3 +-
 net/bridge/br_netfilter.c       |    9 +-------
 net/dccp/ipv4.c                 |   13 ++++--------
 net/decnet/dn_route.c           |   22 ++++++++------------
 net/decnet/dn_rules.c           |    2 -
 net/ipv4/af_inet.c              |   18 ++++------------
 net/ipv4/arp.c                  |   12 +++++------
 net/ipv4/fib_frontend.c         |   28 +++++++-------------------
 net/ipv4/fib_semantics.c        |    8 +------
 net/ipv4/icmp.c                 |   28 ++++++++------------------
 net/ipv4/igmp.c                 |    8 ++-----
 net/ipv4/inet_connection_sock.c |   15 +++++--------
 net/ipv4/ip_gre.c               |   30 ++++++++-------------------
 net/ipv4/ip_output.c            |   25 +++++++++--------------
 net/ipv4/ipip.c                 |   20 +++++-------------
 net/ipv4/ipmr.c                 |   18 ++++------------
 net/ipv4/netfilter.c            |    8 +++----
 net/ipv4/raw.c                  |    7 ++----
 net/ipv4/route.c                |   43 +++++++++++++++-------------------------
 net/ipv4/syncookies.c           |   15 +++++--------
 net/ipv4/udp.c                  |   12 ++++-------
 net/ipv4/xfrm4_policy.c         |    8 +------
 net/ipv6/ip6mr.c                |    4 ---
 net/ipv6/netfilter.c            |    6 +----
 net/ipv6/route.c                |   24 +++++-----------------
 net/ipv6/sit.c                  |   14 +++++--------
 net/l2tp/l2tp_ip.c              |   12 ++++-------
 net/netfilter/ipvs/ip_vs_ctl.c  |    6 +----
 net/netfilter/ipvs/ip_vs_xmit.c |   34 +++++++------------------------
 net/netfilter/xt_TEE.c          |   12 +++++------
 net/rxrpc/ar-peer.c             |   10 ++++-----
 32 files changed, 171 insertions(+), 315 deletions(-)
diff --git a/include/net/route.h b/include/net/route.h
index 5cd46d1..b8c1f77 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -169,14 +169,12 @@ static inline int ip_route_connect(struct rtable **rp, __be32 dst,
 {
 	struct flowi fl = { .oif = oif,
 			    .mark = sk->sk_mark,
-			    .nl_u = { .ip4_u = { .daddr = dst,
-						 .saddr = src,
-						 .tos   = tos } },
+			    .fl4_dst = dst,
+			    .fl4_src = src,
+			    .fl4_tos = tos,
 			    .proto = protocol,
-			    .uli_u = { .ports =
-				       { .sport = sport,
-					 .dport = dport } } };
-
+			    .fl_ip_sport = sport,
+			    .fl_ip_dport = dport };
 	int err;
 	struct net *net = sock_net(sk);
 
diff --git a/net/atm/clip.c b/net/atm/clip.c
index ff956d1..d257da5 100644
--- a/net/atm/clip.c
+++ b/net/atm/clip.c
@@ -502,7 +502,8 @@ static int clip_setentry(struct atm_vcc *vcc, __be32 ip)
 	struct atmarp_entry *entry;
 	int error;
 	struct clip_vcc *clip_vcc;
-	struct flowi fl = { .nl_u = { .ip4_u = { .daddr = ip, .tos = 1}} };
+	struct flowi fl = { .fl4_dst = ip,
+			    .fl4_tos = 1 };
 	struct rtable *rt;
 
 	if (vcc->push != clip_push) {
diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 865fd76..36cd0b7 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -412,13 +412,8 @@ static int br_nf_pre_routing_finish(struct sk_buff *skb)
 	if (dnat_took_place(skb)) {
 		if ((err = ip_route_input(skb, iph->daddr, iph->saddr, iph->tos, dev))) {
 			struct flowi fl = {
-				.nl_u = {
-					.ip4_u = {
-						 .daddr = iph->daddr,
-						 .saddr = 0,
-						 .tos = RT_TOS(iph->tos) },
-				},
-				.proto = 0,
+				.fl4_dst = iph->daddr,
+				.fl4_tos = RT_TOS(iph->tos),
 			};
 			struct in_device *in_dev = __in_dev_get_rcu(dev);
 
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index 3f69ea1..45a434f 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -462,15 +462,12 @@ static struct dst_entry* dccp_v4_route_skb(struct net *net, struct sock *sk,
 {
 	struct rtable *rt;
 	struct flowi fl = { .oif = skb_rtable(skb)->rt_iif,
-			    .nl_u = { .ip4_u =
-				      { .daddr = ip_hdr(skb)->saddr,
-					.saddr = ip_hdr(skb)->daddr,
-					.tos = RT_CONN_FLAGS(sk) } },
+			    .fl4_dst = ip_hdr(skb)->saddr,
+			    .fl4_src = ip_hdr(skb)->daddr,
+			    .fl4_tos = RT_CONN_FLAGS(sk),
 			    .proto = sk->sk_protocol,
-			    .uli_u = { .ports =
-				       { .sport = dccp_hdr(skb)->dccph_dport,
-					 .dport = dccp_hdr(skb)->dccph_sport }
-				     }
+			    .fl_ip_sport = dccp_hdr(skb)->dccph_dport,
+			    .fl_ip_dport = dccp_hdr(skb)->dccph_sport
 			  };
 
 	security_skb_classify_flow(skb, &fl);
diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c
index 474d54d..8280e43 100644
--- a/net/decnet/dn_route.c
+++ b/net/decnet/dn_route.c
@@ -271,10 +271,10 @@ static void dn_dst_link_failure(struct sk_buff *skb)
 
 static inline int compare_keys(struct flowi *fl1, struct flowi *fl2)
 {
-	return ((fl1->nl_u.dn_u.daddr ^ fl2->nl_u.dn_u.daddr) |
-		(fl1->nl_u.dn_u.saddr ^ fl2->nl_u.dn_u.saddr) |
+	return ((fl1->fld_dst ^ fl2->fld_dst) |
+		(fl1->fld_src ^ fl2->fld_src) |
 		(fl1->mark ^ fl2->mark) |
-		(fl1->nl_u.dn_u.scope ^ fl2->nl_u.dn_u.scope) |
+		(fl1->fld_scope ^ fl2->fld_scope) |
 		(fl1->oif ^ fl2->oif) |
 		(fl1->iif ^ fl2->iif)) == 0;
 }
@@ -882,11 +882,9 @@ static inline __le16 dn_fib_rules_map_destination(__le16 daddr, struct dn_fib_re
 
 static int dn_route_output_slow(struct dst_entry **pprt, const struct flowi *oldflp, int try_hard)
 {
-	struct flowi fl = { .nl_u = { .dn_u =
-				      { .daddr = oldflp->fld_dst,
-					.saddr = oldflp->fld_src,
-					.scope = RT_SCOPE_UNIVERSE,
-				     } },
+	struct flowi fl = { .fld_dst = oldflp->fld_dst,
+			    .fld_src = oldflp->fld_src,
+			    .fld_scope = RT_SCOPE_UNIVERSE,
 			    .mark = oldflp->mark,
 			    .iif = init_net.loopback_dev->ifindex,
 			    .oif = oldflp->oif };
@@ -1230,11 +1228,9 @@ static int dn_route_input_slow(struct sk_buff *skb)
 	int flags = 0;
 	__le16 gateway = 0;
 	__le16 local_src = 0;
-	struct flowi fl = { .nl_u = { .dn_u =
-				     { .daddr = cb->dst,
-				       .saddr = cb->src,
-				       .scope = RT_SCOPE_UNIVERSE,
-				    } },
+	struct flowi fl = { .fld_dst = cb->dst,
+			    .fld_src = cb->src,
+			    .fld_scope = RT_SCOPE_UNIVERSE,
 			    .mark = skb->mark,
 			    .iif = skb->dev->ifindex };
 	struct dn_fib_res res = { .fi = NULL, .type = RTN_UNREACHABLE };
diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c
index 48fdf10..6eb91df 100644
--- a/net/decnet/dn_rules.c
+++ b/net/decnet/dn_rules.c
@@ -175,7 +175,7 @@ static int dn_fib_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh,
 
 unsigned dnet_addr_type(__le16 addr)
 {
-	struct flowi fl = { .nl_u = { .dn_u = { .daddr = addr } } };
+	struct flowi fl = { .fld_dst = addr };
 	struct dn_fib_res res;
 	unsigned ret = RTN_UNICAST;
 	struct dn_fib_table *tb = dn_fib_get_table(RT_TABLE_LOCAL, 0);
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index f581f77..f2b6110 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1148,21 +1148,13 @@ int inet_sk_rebuild_header(struct sock *sk)
 	struct flowi fl = {
 		.oif = sk->sk_bound_dev_if,
 		.mark = sk->sk_mark,
-		.nl_u = {
-			.ip4_u = {
-				.daddr	= daddr,
-				.saddr	= inet->inet_saddr,
-				.tos	= RT_CONN_FLAGS(sk),
-			},
-		},
+		.fl4_dst = daddr,
+		.fl4_src = inet->inet_saddr,
+		.fl4_tos = RT_CONN_FLAGS(sk),
 		.proto = sk->sk_protocol,
 		.flags = inet_sk_flowi_flags(sk),
-		.uli_u = {
-			.ports = {
-				.sport = inet->inet_sport,
-				.dport = inet->inet_dport,
-			},
-		},
+		.fl_ip_sport = inet->inet_sport,
+		.fl_ip_dport = inet->inet_dport,
 	};
 
 	security_sk_classify_flow(sk, &fl);
diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index d8e540c..b564b76 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -433,8 +433,8 @@ static int arp_ignore(struct in_device *in_dev, __be32 sip, __be32 tip)
 
 static int arp_filter(__be32 sip, __be32 tip, struct net_device *dev)
 {
-	struct flowi fl = { .nl_u = { .ip4_u = { .daddr = sip,
-						 .saddr = tip } } };
+	struct flowi fl = { .fl4_dst = sip,
+			    .fl4_src = tip };
 	struct rtable *rt;
 	int flag = 0;
 	/*unsigned long now; */
@@ -1061,8 +1061,8 @@ static int arp_req_set(struct net *net, struct arpreq *r,
 	if (r->arp_flags & ATF_PERM)
 		r->arp_flags |= ATF_COM;
 	if (dev == NULL) {
-		struct flowi fl = { .nl_u.ip4_u = { .daddr = ip,
-						    .tos = RTO_ONLINK } };
+		struct flowi fl = { .fl4_dst = ip,
+				    .fl4_tos = RTO_ONLINK };
 		struct rtable *rt;
 		err = ip_route_output_key(net, &rt, &fl);
 		if (err != 0)
@@ -1169,8 +1169,8 @@ static int arp_req_delete(struct net *net, struct arpreq *r,
 
 	ip = ((struct sockaddr_in *)&r->arp_pa)->sin_addr.s_addr;
 	if (dev == NULL) {
-		struct flowi fl = { .nl_u.ip4_u = { .daddr = ip,
-						    .tos = RTO_ONLINK } };
+		struct flowi fl = { .fl4_dst = ip,
+				    .fl4_tos = RTO_ONLINK };
 		struct rtable *rt;
 		err = ip_route_output_key(net, &rt, &fl);
 		if (err != 0)
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index eb6f69a..d3a1112 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -158,11 +158,7 @@ static void fib_flush(struct net *net)
 struct net_device *__ip_dev_find(struct net *net, __be32 addr, bool devref)
 {
 	struct flowi fl = {
-		.nl_u = {
-			.ip4_u = {
-				.daddr = addr
-			}
-		},
+		.fl4_dst = addr,
 		.flags = FLOWI_FLAG_MATCH_ANY_IIF
 	};
 	struct fib_result res = { 0 };
@@ -193,7 +189,7 @@ static inline unsigned __inet_dev_addr_type(struct net *net,
 					    const struct net_device *dev,
 					    __be32 addr)
 {
-	struct flowi		fl = { .nl_u = { .ip4_u = { .daddr = addr } } };
+	struct flowi		fl = { .fl4_dst = addr };
 	struct fib_result	res;
 	unsigned ret = RTN_BROADCAST;
 	struct fib_table *local_table;
@@ -247,13 +243,9 @@ int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
 {
 	struct in_device *in_dev;
 	struct flowi fl = {
-		.nl_u = {
-			.ip4_u = {
-				.daddr = src,
-				.saddr = dst,
-				.tos = tos
-			}
-		},
+		.fl4_dst = src,
+		.fl4_src = dst,
+		.fl4_tos = tos,
 		.mark = mark,
 		.iif = oif
 	};
@@ -853,13 +845,9 @@ static void nl_fib_lookup(struct fib_result_nl *frn, struct fib_table *tb)
 	struct fib_result       res;
 	struct flowi            fl = {
 		.mark = frn->fl_mark,
-		.nl_u = {
-			.ip4_u = {
-				.daddr = frn->fl_addr,
-				.tos = frn->fl_tos,
-				.scope = frn->fl_scope
-			}
-		}
+		.fl4_dst = frn->fl_addr,
+		.fl4_tos = frn->fl_tos,
+		.fl4_scope = frn->fl_scope,
 	};
 
 #ifdef CONFIG_IP_MULTIPLE_TABLES
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 3e0da3e..12d3dc3 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -563,12 +563,8 @@ static int fib_check_nh(struct fib_config *cfg, struct fib_info *fi,
 		rcu_read_lock();
 		{
 			struct flowi fl = {
-				.nl_u = {
-					.ip4_u = {
-						.daddr = nh->nh_gw,
-						.scope = cfg->fc_scope + 1,
-					},
-				},
+				.fl4_dst = nh->nh_gw,
+				.fl4_scope = cfg->fc_scope + 1,
 				.oif = nh->nh_oif,
 			};
 
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index c6e2aff..4daebd1 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -386,10 +386,9 @@ static void icmp_reply(struct icmp_bxm *icmp_param, struct sk_buff *skb)
 			daddr = icmp_param->replyopts.faddr;
 	}
 	{
-		struct flowi fl = { .nl_u = { .ip4_u =
-					      { .daddr = daddr,
-						.saddr = rt->rt_spec_dst,
-						.tos = RT_TOS(ip_hdr(skb)->tos) } },
+		struct flowi fl = { .fl4_dst = daddr,
+				    .fl4_src = rt->rt_spec_dst,
+				    .fl4_tos = RT_TOS(ip_hdr(skb)->tos),
 				    .proto = IPPROTO_ICMP };
 		security_skb_classify_flow(skb, &fl);
 		if (ip_route_output_key(net, &rt, &fl))
@@ -542,22 +541,13 @@ void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info)
 
 	{
 		struct flowi fl = {
-			.nl_u = {
-				.ip4_u = {
-					.daddr = icmp_param.replyopts.srr ?
-						icmp_param.replyopts.faddr :
-						iph->saddr,
-					.saddr = saddr,
-					.tos = RT_TOS(tos)
-				}
-			},
+			.fl4_dst = icmp_param.replyopts.srr ?
+				   icmp_param.replyopts.faddr : iph->saddr,
+			.fl4_src = saddr,
+			.fl4_tos = RT_TOS(tos),
 			.proto = IPPROTO_ICMP,
-			.uli_u = {
-				.icmpt = {
-					.type = type,
-					.code = code
-				}
-			}
+			.fl_icmp_type = type,
+			.fl_icmp_code = code,
 		};
 		int err;
 		struct rtable *rt2;
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 08d0d81..606f92c 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -298,8 +298,7 @@ static struct sk_buff *igmpv3_newpack(struct net_device *dev, int size)
 
 	{
 		struct flowi fl = { .oif = dev->ifindex,
-				    .nl_u = { .ip4_u = {
-				    .daddr = IGMPV3_ALL_MCR } },
+				    .fl4_dst = IGMPV3_ALL_MCR,
 				    .proto = IPPROTO_IGMP };
 		if (ip_route_output_key(net, &rt, &fl)) {
 			kfree_skb(skb);
@@ -644,7 +643,7 @@ static int igmp_send_report(struct in_device *in_dev, struct ip_mc_list *pmc,
 
 	{
 		struct flowi fl = { .oif = dev->ifindex,
-				    .nl_u = { .ip4_u = { .daddr = dst } },
+				    .fl4_dst = dst,
 				    .proto = IPPROTO_IGMP };
 		if (ip_route_output_key(net, &rt, &fl))
 			return -1;
@@ -1421,8 +1420,7 @@ void ip_mc_destroy_dev(struct in_device *in_dev)
 /* RTNL is locked */
 static struct in_device *ip_mc_find_dev(struct net *net, struct ip_mreqn *imr)
 {
-	struct flowi fl = { .nl_u = { .ip4_u =
-				      { .daddr = imr->imr_multiaddr.s_addr } } };
+	struct flowi fl = { .fl4_dst = imr->imr_multiaddr.s_addr };
 	struct rtable *rt;
 	struct net_device *dev = NULL;
 	struct in_device *idev = NULL;
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 7174370..06f5f8f 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -358,17 +358,14 @@ struct dst_entry *inet_csk_route_req(struct sock *sk,
 	struct ip_options *opt = inet_rsk(req)->opt;
 	struct flowi fl = { .oif = sk->sk_bound_dev_if,
 			    .mark = sk->sk_mark,
-			    .nl_u = { .ip4_u =
-				      { .daddr = ((opt && opt->srr) ?
-						  opt->faddr :
-						  ireq->rmt_addr),
-					.saddr = ireq->loc_addr,
-					.tos = RT_CONN_FLAGS(sk) } },
+			    .fl4_dst = ((opt && opt->srr) ?
+					  opt->faddr : ireq->rmt_addr),
+			    .fl4_src = ireq->loc_addr,
+			    .fl4_tos = RT_CONN_FLAGS(sk),
 			    .proto = sk->sk_protocol,
 			    .flags = inet_sk_flowi_flags(sk),
-			    .uli_u = { .ports =
-				       { .sport = inet_sk(sk)->inet_sport,
-					 .dport = ireq->rmt_port } } };
+			    .fl_ip_sport = inet_sk(sk)->inet_sport,
+			    .fl_ip_dport = ireq->rmt_port };
 	struct net *net = sock_net(sk);
 
 	security_req_classify_flow(req, &fl);
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index cab2057..a2e9cfd 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -772,13 +772,9 @@ static netdev_tx_t ipgre_tunnel_xmit(struct sk_buff *skb, struct net_device *dev
 	{
 		struct flowi fl = {
 			.oif = tunnel->parms.link,
-			.nl_u = {
-				.ip4_u = {
-					.daddr = dst,
-					.saddr = tiph->saddr,
-					.tos = RT_TOS(tos)
-				}
-			},
+			.fl4_dst = dst,
+			.fl4_src = tiph->saddr,
+			.fl4_tos = RT_TOS(tos),
 			.proto = IPPROTO_GRE
 		}
 ;
@@ -951,13 +947,9 @@ static int ipgre_tunnel_bind_dev(struct net_device *dev)
 	if (iph->daddr) {
 		struct flowi fl = {
 			.oif = tunnel->parms.link,
-			.nl_u = {
-				.ip4_u = {
-					.daddr = iph->daddr,
-					.saddr = iph->saddr,
-					.tos = RT_TOS(iph->tos)
-				}
-			},
+			.fl4_dst = iph->daddr,
+			.fl4_src = iph->saddr,
+			.fl4_tos = RT_TOS(iph->tos),
 			.proto = IPPROTO_GRE
 		};
 		struct rtable *rt;
@@ -1216,13 +1208,9 @@ static int ipgre_open(struct net_device *dev)
 	if (ipv4_is_multicast(t->parms.iph.daddr)) {
 		struct flowi fl = {
 			.oif = t->parms.link,
-			.nl_u = {
-				.ip4_u = {
-					.daddr = t->parms.iph.daddr,
-					.saddr = t->parms.iph.saddr,
-					.tos = RT_TOS(t->parms.iph.tos)
-				}
-			},
+			.fl4_dst = t->parms.iph.daddr,
+			.fl4_src = t->parms.iph.saddr,
+			.fl4_tos = RT_TOS(t->parms.iph.tos),
 			.proto = IPPROTO_GRE
 		};
 		struct rtable *rt;
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 439d2a3..5090c7f 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -341,15 +341,13 @@ int ip_queue_xmit(struct sk_buff *skb)
 		{
 			struct flowi fl = { .oif = sk->sk_bound_dev_if,
 					    .mark = sk->sk_mark,
-					    .nl_u = { .ip4_u =
-						      { .daddr = daddr,
-							.saddr = inet->inet_saddr,
-							.tos = RT_CONN_FLAGS(sk) } },
+					    .fl4_dst = daddr,
+					    .fl4_src = inet->inet_saddr,
+					    .fl4_tos = RT_CONN_FLAGS(sk),
 					    .proto = sk->sk_protocol,
 					    .flags = inet_sk_flowi_flags(sk),
-					    .uli_u = { .ports =
-						       { .sport = inet->inet_sport,
-							 .dport = inet->inet_dport } } };
+					    .fl_ip_sport = inet->inet_sport,
+					    .fl_ip_dport = inet->inet_dport };
 
 			/* If this fails, retransmit mechanism of transport layer will
 			 * keep trying until route appears or the connection times
@@ -1404,14 +1402,11 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, struct ip_reply_arg *ar
 
 	{
 		struct flowi fl = { .oif = arg->bound_dev_if,
-				    .nl_u = { .ip4_u =
-					      { .daddr = daddr,
-						.saddr = rt->rt_spec_dst,
-						.tos = RT_TOS(ip_hdr(skb)->tos) } },
-				    /* Not quite clean, but right. */
-				    .uli_u = { .ports =
-					       { .sport = tcp_hdr(skb)->dest,
-						 .dport = tcp_hdr(skb)->source } },
+				    .fl4_dst = daddr,
+				    .fl4_src = rt->rt_spec_dst,
+				    .fl4_tos = RT_TOS(ip_hdr(skb)->tos),
+				    .fl_ip_sport = tcp_hdr(skb)->dest,
+				    .fl_ip_dport = tcp_hdr(skb)->source,
 				    .proto = sk->sk_protocol,
 				    .flags = ip_reply_arg_flowi_flags(arg) };
 		security_skb_classify_flow(skb, &fl);
diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index cd300aa..e70ad58 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -463,13 +463,9 @@ static netdev_tx_t ipip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
 	{
 		struct flowi fl = {
 			.oif = tunnel->parms.link,
-			.nl_u = {
-				.ip4_u = {
-					.daddr = dst,
-					.saddr = tiph->saddr,
-					.tos = RT_TOS(tos)
-				}
-			},
+			.fl4_dst = dst,
+			.fl4_src = tiph->saddr,
+			.fl4_tos = RT_TOS(tos),
 			.proto = IPPROTO_IPIP
 		};
 
@@ -589,13 +585,9 @@ static void ipip_tunnel_bind_dev(struct net_device *dev)
 	if (iph->daddr) {
 		struct flowi fl = {
 			.oif = tunnel->parms.link,
-			.nl_u = {
-				.ip4_u = {
-					.daddr = iph->daddr,
-					.saddr = iph->saddr,
-					.tos = RT_TOS(iph->tos)
-				}
-			},
+			.fl4_dst = iph->daddr,
+			.fl4_src = iph->saddr,
+			.fl4_tos = RT_TOS(iph->tos),
 			.proto = IPPROTO_IPIP
 		};
 		struct rtable *rt;
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index ef2b008..92aaa3d 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -1537,13 +1537,9 @@ static void ipmr_queue_xmit(struct net *net, struct mr_table *mrt,
 	if (vif->flags & VIFF_TUNNEL) {
 		struct flowi fl = {
 			.oif = vif->link,
-			.nl_u = {
-				.ip4_u = {
-					.daddr = vif->remote,
-					.saddr = vif->local,
-					.tos = RT_TOS(iph->tos)
-				}
-			},
+			.fl4_dst = vif->remote,
+			.fl4_src = vif->local,
+			.fl4_tos = RT_TOS(iph->tos),
 			.proto = IPPROTO_IPIP
 		};
 
@@ -1553,12 +1549,8 @@ static void ipmr_queue_xmit(struct net *net, struct mr_table *mrt,
 	} else {
 		struct flowi fl = {
 			.oif = vif->link,
-			.nl_u = {
-				.ip4_u = {
-					.daddr = iph->daddr,
-					.tos = RT_TOS(iph->tos)
-				}
-			},
+			.fl4_dst = iph->daddr,
+			.fl4_tos = RT_TOS(iph->tos),
 			.proto = IPPROTO_IPIP
 		};
 
diff --git a/net/ipv4/netfilter.c b/net/ipv4/netfilter.c
index d88a46c..994a1f2 100644
--- a/net/ipv4/netfilter.c
+++ b/net/ipv4/netfilter.c
@@ -31,10 +31,10 @@ int ip_route_me_harder(struct sk_buff *skb, unsigned addr_type)
 	 * packets with foreign saddr to appear on the NF_INET_LOCAL_OUT hook.
 	 */
 	if (addr_type == RTN_LOCAL) {
-		fl.nl_u.ip4_u.daddr = iph->daddr;
+		fl.fl4_dst = iph->daddr;
 		if (type == RTN_LOCAL)
-			fl.nl_u.ip4_u.saddr = iph->saddr;
-		fl.nl_u.ip4_u.tos = RT_TOS(iph->tos);
+			fl.fl4_src = iph->saddr;
+		fl.fl4_tos = RT_TOS(iph->tos);
 		fl.oif = skb->sk ? skb->sk->sk_bound_dev_if : 0;
 		fl.mark = skb->mark;
 		fl.flags = skb->sk ? inet_sk_flowi_flags(skb->sk) : 0;
@@ -47,7 +47,7 @@ int ip_route_me_harder(struct sk_buff *skb, unsigned addr_type)
 	} else {
 		/* non-local src, find valid iif to satisfy
 		 * rp-filter when calling ip_route_input. */
-		fl.nl_u.ip4_u.daddr = iph->saddr;
+		fl.fl4_dst = iph->saddr;
 		if (ip_route_output_key(net, &rt, &fl) != 0)
 			return -1;
 
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 1f85ef2..a3d5ab7 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -549,10 +549,9 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	{
 		struct flowi fl = { .oif = ipc.oif,
 				    .mark = sk->sk_mark,
-				    .nl_u = { .ip4_u =
-					      { .daddr = daddr,
-						.saddr = saddr,
-						.tos = tos } },
+				    .fl4_dst = daddr,
+				    .fl4_src = saddr,
+				    .fl4_tos = tos,
 				    .proto = inet->hdrincl ? IPPROTO_RAW :
 							     sk->sk_protocol,
 				  };
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 66610ea..ec2333f 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -684,17 +684,17 @@ static inline bool rt_caching(const struct net *net)
 static inline bool compare_hash_inputs(const struct flowi *fl1,
 					const struct flowi *fl2)
 {
-	return ((((__force u32)fl1->nl_u.ip4_u.daddr ^ (__force u32)fl2->nl_u.ip4_u.daddr) |
-		((__force u32)fl1->nl_u.ip4_u.saddr ^ (__force u32)fl2->nl_u.ip4_u.saddr) |
+	return ((((__force u32)fl1->fl4_dst ^ (__force u32)fl2->fl4_dst) |
+		((__force u32)fl1->fl4_src ^ (__force u32)fl2->fl4_src) |
 		(fl1->iif ^ fl2->iif)) == 0);
 }
 
 static inline int compare_keys(struct flowi *fl1, struct flowi *fl2)
 {
-	return (((__force u32)fl1->nl_u.ip4_u.daddr ^ (__force u32)fl2->nl_u.ip4_u.daddr) |
-		((__force u32)fl1->nl_u.ip4_u.saddr ^ (__force u32)fl2->nl_u.ip4_u.saddr) |
+	return (((__force u32)fl1->fl4_dst ^ (__force u32)fl2->fl4_dst) |
+		((__force u32)fl1->fl4_src ^ (__force u32)fl2->fl4_src) |
 		(fl1->mark ^ fl2->mark) |
-		(*(u16 *)&fl1->nl_u.ip4_u.tos ^ *(u16 *)&fl2->nl_u.ip4_u.tos) |
+		(*(u16 *)&fl1->fl4_tos ^ *(u16 *)&fl2->fl4_tos) |
 		(fl1->oif ^ fl2->oif) |
 		(fl1->iif ^ fl2->iif)) == 0;
 }
@@ -2089,12 +2089,10 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 {
 	struct fib_result res;
 	struct in_device *in_dev = __in_dev_get_rcu(dev);
-	struct flowi fl = { .nl_u = { .ip4_u =
-				      { .daddr = daddr,
-					.saddr = saddr,
-					.tos = tos,
-					.scope = RT_SCOPE_UNIVERSE,
-				      } },
+	struct flowi fl = { .fl4_dst	= daddr,
+			    .fl4_src	= saddr,
+			    .fl4_tos	= tos,
+			    .fl4_scope	= RT_SCOPE_UNIVERSE,
 			    .mark = skb->mark,
 			    .iif = dev->ifindex };
 	unsigned	flags = 0;
@@ -2480,14 +2478,11 @@ static int ip_route_output_slow(struct net *net, struct rtable **rp,
 				const struct flowi *oldflp)
 {
 	u32 tos	= RT_FL_TOS(oldflp);
-	struct flowi fl = { .nl_u = { .ip4_u =
-				      { .daddr = oldflp->fl4_dst,
-					.saddr = oldflp->fl4_src,
-					.tos = tos & IPTOS_RT_MASK,
-					.scope = ((tos & RTO_ONLINK) ?
-						  RT_SCOPE_LINK :
-						  RT_SCOPE_UNIVERSE),
-				      } },
+	struct flowi fl = { .fl4_dst = oldflp->fl4_dst,
+			    .fl4_src = oldflp->fl4_src,
+			    .fl4_tos = tos & IPTOS_RT_MASK,
+			    .fl4_scope = ((tos & RTO_ONLINK) ?
+					  RT_SCOPE_LINK : RT_SCOPE_UNIVERSE),
 			    .mark = oldflp->mark,
 			    .iif = net->loopback_dev->ifindex,
 			    .oif = oldflp->oif };
@@ -2944,13 +2939,9 @@ static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void
 			err = -rt->dst.error;
 	} else {
 		struct flowi fl = {
-			.nl_u = {
-				.ip4_u = {
-					.daddr = dst,
-					.saddr = src,
-					.tos = rtm->rtm_tos,
-				},
-			},
+			.fl4_dst = dst,
+			.fl4_src = src,
+			.fl4_tos = rtm->rtm_tos,
 			.oif = tb[RTA_OIF] ? nla_get_u32(tb[RTA_OIF]) : 0,
 			.mark = mark,
 		};
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 650cace..4751920 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -346,17 +346,14 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 	 */
 	{
 		struct flowi fl = { .mark = sk->sk_mark,
-				    .nl_u = { .ip4_u =
-					      { .daddr = ((opt && opt->srr) ?
-							  opt->faddr :
-							  ireq->rmt_addr),
-						.saddr = ireq->loc_addr,
-						.tos = RT_CONN_FLAGS(sk) } },
+				    .fl4_dst = ((opt && opt->srr) ?
+						opt->faddr : ireq->rmt_addr),
+				    .fl4_src = ireq->loc_addr,
+				    .fl4_tos = RT_CONN_FLAGS(sk),
 				    .proto = IPPROTO_TCP,
 				    .flags = inet_sk_flowi_flags(sk),
-				    .uli_u = { .ports =
-					       { .sport = th->dest,
-						 .dport = th->source } } };
+				    .fl_ip_sport = th->dest,
+				    .fl_ip_dport = th->source };
 		security_req_classify_flow(req, &fl);
 		if (ip_route_output_key(sock_net(sk), &rt, &fl)) {
 			reqsk_free(req);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 28cb2d7..803887f 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -890,15 +890,13 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	if (rt == NULL) {
 		struct flowi fl = { .oif = ipc.oif,
 				    .mark = sk->sk_mark,
-				    .nl_u = { .ip4_u =
-					      { .daddr = faddr,
-						.saddr = saddr,
-						.tos = tos } },
+				    .fl4_dst = faddr,
+				    .fl4_src = saddr,
+				    .fl4_tos = tos,
 				    .proto = sk->sk_protocol,
 				    .flags = inet_sk_flowi_flags(sk),
-				    .uli_u = { .ports =
-					       { .sport = inet->inet_sport,
-						 .dport = dport } } };
+				    .fl_ip_sport = inet->inet_sport,
+				    .fl_ip_dport = dport };
 		struct net *net = sock_net(sk);
 
 		security_sk_classify_flow(sk, &fl);
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index dd1fd8c..b9e28b9 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -22,12 +22,8 @@ static struct dst_entry *xfrm4_dst_lookup(struct net *net, int tos,
 					  xfrm_address_t *daddr)
 {
 	struct flowi fl = {
-		.nl_u = {
-			.ip4_u = {
-				.tos = tos,
-				.daddr = daddr->a4,
-			},
-		},
+		.fl4_dst = daddr->a4,
+		.fl4_tos = tos,
 	};
 	struct dst_entry *dst;
 	struct rtable *rt;
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index 6f32ffc..9fab274 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -1843,9 +1843,7 @@ static int ip6mr_forward2(struct net *net, struct mr6_table *mrt,
 
 	fl = (struct flowi) {
 		.oif = vif->link,
-		.nl_u = { .ip6_u =
-				{ .daddr = ipv6h->daddr, }
-		}
+		.fl6_dst = ipv6h->daddr,
 	};
 
 	dst = ip6_route_output(net, NULL, &fl);
diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c
index 7155b24..35915e8 100644
--- a/net/ipv6/netfilter.c
+++ b/net/ipv6/netfilter.c
@@ -18,10 +18,8 @@ int ip6_route_me_harder(struct sk_buff *skb)
 	struct flowi fl = {
 		.oif = skb->sk ? skb->sk->sk_bound_dev_if : 0,
 		.mark = skb->mark,
-		.nl_u =
-		{ .ip6_u =
-		  { .daddr = iph->daddr,
-		    .saddr = iph->saddr, } },
+		.fl6_dst = iph->daddr,
+		.fl6_src = iph->saddr,
 	};
 
 	dst = ip6_route_output(net, skb->sk, &fl);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index fc32833..7763663 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -558,11 +558,7 @@ struct rt6_info *rt6_lookup(struct net *net, const struct in6_addr *daddr,
 {
 	struct flowi fl = {
 		.oif = oif,
-		.nl_u = {
-			.ip6_u = {
-				.daddr = *daddr,
-			},
-		},
+		.fl6_dst = *daddr,
 	};
 	struct dst_entry *dst;
 	int flags = strict ? RT6_LOOKUP_F_IFACE : 0;
@@ -778,13 +774,9 @@ void ip6_route_input(struct sk_buff *skb)
 	int flags = RT6_LOOKUP_F_HAS_SADDR;
 	struct flowi fl = {
 		.iif = skb->dev->ifindex,
-		.nl_u = {
-			.ip6_u = {
-				.daddr = iph->daddr,
-				.saddr = iph->saddr,
-				.flowlabel = (* (__be32 *) iph)&IPV6_FLOWINFO_MASK,
-			},
-		},
+		.fl6_dst = iph->daddr,
+		.fl6_src = iph->saddr,
+		.fl6_flowlabel = (* (__be32 *) iph)&IPV6_FLOWINFO_MASK,
 		.mark = skb->mark,
 		.proto = iph->nexthdr,
 	};
@@ -1463,12 +1455,8 @@ static struct rt6_info *ip6_route_redirect(struct in6_addr *dest,
 	struct ip6rd_flowi rdfl = {
 		.fl = {
 			.oif = dev->ifindex,
-			.nl_u = {
-				.ip6_u = {
-					.daddr = *dest,
-					.saddr = *src,
-				},
-			},
+			.fl6_dst = *dest,
+			.fl6_src = *src,
 		},
 	};
 
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index d6bfaec..6e48a80 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -730,10 +730,9 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb,
 	}
 
 	{
-		struct flowi fl = { .nl_u = { .ip4_u =
-					      { .daddr = dst,
-						.saddr = tiph->saddr,
-						.tos = RT_TOS(tos) } },
+		struct flowi fl = { .fl4_dst = dst,
+				    .fl4_src = tiph->saddr,
+				    .fl4_tos = RT_TOS(tos),
 				    .oif = tunnel->parms.link,
 				    .proto = IPPROTO_IPV6 };
 		if (ip_route_output_key(dev_net(dev), &rt, &fl)) {
@@ -855,10 +854,9 @@ static void ipip6_tunnel_bind_dev(struct net_device *dev)
 	iph = &tunnel->parms.iph;
 
 	if (iph->daddr) {
-		struct flowi fl = { .nl_u = { .ip4_u =
-					      { .daddr = iph->daddr,
-						.saddr = iph->saddr,
-						.tos = RT_TOS(iph->tos) } },
+		struct flowi fl = { .fl4_dst = iph->daddr,
+				    .fl4_src = iph->saddr,
+				    .fl4_tos = RT_TOS(iph->tos),
 				    .oif = tunnel->parms.link,
 				    .proto = IPPROTO_IPV6 };
 		struct rtable *rt;
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 0bf6a59..04635e8 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -476,15 +476,13 @@ static int l2tp_ip_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *m
 
 		{
 			struct flowi fl = { .oif = sk->sk_bound_dev_if,
-					    .nl_u = { .ip4_u = {
-							.daddr = daddr,
-							.saddr = inet->inet_saddr,
-							.tos = RT_CONN_FLAGS(sk) } },
+					    .fl4_dst = daddr,
+					    .fl4_src = inet->inet_saddr,
+					    .fl4_tos = RT_CONN_FLAGS(sk),
 					    .proto = sk->sk_protocol,
 					    .flags = inet_sk_flowi_flags(sk),
-					    .uli_u = { .ports = {
-							 .sport = inet->inet_sport,
-							 .dport = inet->inet_dport } } };
+					    .fl_ip_sport = inet->inet_sport,
+					    .fl_ip_dport = inet->inet_dport };
 
 			/* If this fails, retransmit mechanism of transport layer will
 			 * keep trying until route appears or the connection times
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 5f5daa3..c6f2936 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -110,10 +110,8 @@ static int __ip_vs_addr_is_local_v6(const struct in6_addr *addr)
 	struct rt6_info *rt;
 	struct flowi fl = {
 		.oif = 0,
-		.nl_u = {
-			.ip6_u = {
-				.daddr = *addr,
-				.saddr = { .s6_addr32 = {0, 0, 0, 0} }, } },
+		.fl6_dst = *addr,
+		.fl6_src = { .s6_addr32 = {0, 0, 0, 0} },
 	};
 
 	rt = (struct rt6_info *)ip6_route_output(&init_net, NULL, &fl);
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 10bd39c..5325a3f 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -96,12 +96,8 @@ __ip_vs_get_out_rt(struct sk_buff *skb, struct ip_vs_dest *dest,
 		if (!(rt = (struct rtable *)
 		      __ip_vs_dst_check(dest, rtos))) {
 			struct flowi fl = {
-				.oif = 0,
-				.nl_u = {
-					.ip4_u = {
-						.daddr = dest->addr.ip,
-						.saddr = 0,
-						.tos = rtos, } },
+				.fl4_dst = dest->addr.ip,
+				.fl4_tos = rtos,
 			};
 
 			if (ip_route_output_key(net, &rt, &fl)) {
@@ -118,12 +114,8 @@ __ip_vs_get_out_rt(struct sk_buff *skb, struct ip_vs_dest *dest,
 		spin_unlock(&dest->dst_lock);
 	} else {
 		struct flowi fl = {
-			.oif = 0,
-			.nl_u = {
-				.ip4_u = {
-					.daddr = daddr,
-					.saddr = 0,
-					.tos = rtos, } },
+			.fl4_dst = daddr,
+			.fl4_tos = rtos,
 		};
 
 		if (ip_route_output_key(net, &rt, &fl)) {
@@ -178,14 +170,9 @@ __ip_vs_reroute_locally(struct sk_buff *skb)
 		refdst_drop(orefdst);
 	} else {
 		struct flowi fl = {
-			.oif = 0,
-			.nl_u = {
-				.ip4_u = {
-					.daddr = iph->daddr,
-					.saddr = iph->saddr,
-					.tos = RT_TOS(iph->tos),
-				}
-			},
+			.fl4_dst = iph->daddr,
+			.fl4_src = iph->saddr,
+			.fl4_tos = RT_TOS(iph->tos),
 			.mark = skb->mark,
 		};
 		struct rtable *rt;
@@ -216,12 +203,7 @@ __ip_vs_route_output_v6(struct net *net, struct in6_addr *daddr,
 {
 	struct dst_entry *dst;
 	struct flowi fl = {
-		.oif = 0,
-		.nl_u = {
-			.ip6_u = {
-				.daddr = *daddr,
-			},
-		},
+		.fl6_dst = *daddr,
 	};
 
 	dst = ip6_route_output(net, NULL, &fl);
diff --git a/net/netfilter/xt_TEE.c b/net/netfilter/xt_TEE.c
index 22a2d42..231867d 100644
--- a/net/netfilter/xt_TEE.c
+++ b/net/netfilter/xt_TEE.c
@@ -70,9 +70,9 @@ tee_tg_route4(struct sk_buff *skb, const struct xt_tee_tginfo *info)
 			return false;
 		fl.oif = info->priv->oif;
 	}
-	fl.nl_u.ip4_u.daddr = info->gw.ip;
-	fl.nl_u.ip4_u.tos   = RT_TOS(iph->tos);
-	fl.nl_u.ip4_u.scope = RT_SCOPE_UNIVERSE;
+	fl.fl4_dst = info->gw.ip;
+	fl.fl4_tos = RT_TOS(iph->tos);
+	fl.fl4_scope = RT_SCOPE_UNIVERSE;
 	if (ip_route_output_key(net, &rt, &fl) != 0)
 		return false;
 
@@ -150,9 +150,9 @@ tee_tg_route6(struct sk_buff *skb, const struct xt_tee_tginfo *info)
 			return false;
 		fl.oif = info->priv->oif;
 	}
-	fl.nl_u.ip6_u.daddr = info->gw.in6;
-	fl.nl_u.ip6_u.flowlabel = ((iph->flow_lbl[0] & 0xF) << 16) |
-				  (iph->flow_lbl[1] << 8) | iph->flow_lbl[2];
+	fl.fl6_dst = info->gw.in6;
+	fl.fl6_flowlabel = ((iph->flow_lbl[0] & 0xF) << 16) |
+			   (iph->flow_lbl[1] << 8) | iph->flow_lbl[2];
 	dst = ip6_route_output(net, NULL, &fl);
 	if (dst == NULL)
 		return false;
diff --git a/net/rxrpc/ar-peer.c b/net/rxrpc/ar-peer.c
index 9f1729b..a53fb25 100644
--- a/net/rxrpc/ar-peer.c
+++ b/net/rxrpc/ar-peer.c
@@ -47,12 +47,12 @@ static void rxrpc_assess_MTU_size(struct rxrpc_peer *peer)
 	case AF_INET:
 		fl.oif = 0;
 		fl.proto = IPPROTO_UDP,
-		fl.nl_u.ip4_u.saddr = 0;
-		fl.nl_u.ip4_u.daddr = peer->srx.transport.sin.sin_addr.s_addr;
-		fl.nl_u.ip4_u.tos = 0;
+		fl.fl4_dst = peer->srx.transport.sin.sin_addr.s_addr;
+		fl.fl4_src = 0;
+		fl.fl4_tos = 0;
 		/* assume AFS.CM talking to AFS.FS */
-		fl.uli_u.ports.sport = htons(7001);
-		fl.uli_u.ports.dport = htons(7000);
+		fl.fl_ip_sport = htons(7001);
+		fl.fl_ip_dport = htons(7000);
 		break;
 	default:
 		BUG();
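
The hunks above swap nested initializers such as .nl_u.ip4_u.daddr for short names like .fl4_dst. A minimal sketch of how such shorthand can work, assuming the short names are preprocessor macros that expand to the nested member path (demo_flowi and demo_make_flow are hypothetical names, not the kernel's definitions). It also shows why the explicit ".oif = 0" and ".saddr = 0" lines can be dropped from an initializer: members left out of a designated initializer are zeroed.

```c
#include <stdint.h>

/* Toy stand-in for a flow key; layout and names are illustrative only. */
struct demo_flowi {
	int oif;
	uint32_t mark;
	union {
		struct {
			uint32_t daddr;
			uint32_t saddr;
			uint8_t tos;
		} ip4_u;
	} nl_u;
	uint8_t proto;
	union {
		struct {
			uint16_t sport;
			uint16_t dport;
		} ports;
	} uli_u;
};

/* Assumed shape of the shorthand: each name expands to a member path. */
#define fl4_dst		nl_u.ip4_u.daddr
#define fl4_src		nl_u.ip4_u.saddr
#define fl4_tos		nl_u.ip4_u.tos
#define fl_ip_sport	uli_u.ports.sport
#define fl_ip_dport	uli_u.ports.dport

/* Build a key from destination and TOS; every other member stays zero. */
struct demo_flowi demo_make_flow(uint32_t daddr, uint8_t tos)
{
	struct demo_flowi fl = {
		.fl4_dst = daddr,	/* same as .nl_u.ip4_u.daddr = daddr */
		.fl4_tos = tos,		/* same as .nl_u.ip4_u.tos = tos */
	};

	return fl;
}
```

Plain assignments (as in the ar-peer.c hunk) do not get this implicit zeroing, which is presumably why "fl.fl4_src = 0" is kept there.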

^ permalink raw reply related

* Re: possible kernel oops from user MSS
From: David Miller @ 2010-11-12 23:26 UTC (permalink / raw)
  To: mzhang; +Cc: netdev
In-Reply-To: <4CDDC6EE.2010005@mvista.com>

From: Min Zhang <mzhang@mvista.com>
Date: Fri, 12 Nov 2010 14:59:58 -0800

> Regarding commit 7a1abd08d52fdeddb3e9a5a33f2f15cc6a5674d2 ("tcp:
> Increase TCP_MAXSEG socket option minimum"): what is the reason for
> the TCP_MAXSEG minimum being 64? Isn't the exact value 40, which is
> TCPOLEN_MD5SIG_ALIGNED(20) + TCPOLEN_TSTAMP_ALIGNED(12) + 8?
> 
> Or is it better to use TCP_MIN_MSS from tcp.h:
> 
> /* Minimal accepted MSS. It is (60+60+8) - (20+20). */
> #define TCP_MIN_MSS        88U

I suppose TCP_MIN_MSS would be better to use, I'll make that
change, thanks.
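
The arithmetic in this exchange can be checked with a small sketch. The option sizes and the TCP_MIN_MSS formula are the ones quoted above; reading 60 and 20 as maximum and minimum header sizes is my interpretation of the comment, and the DEMO_ names, the upper bound, and the range-check helper are hypothetical, not the kernel's setsockopt code.

```c
/* Values quoted in the thread. */
#define DEMO_TCPOLEN_MD5SIG_ALIGNED	20
#define DEMO_TCPOLEN_TSTAMP_ALIGNED	12
#define DEMO_TCP_MIN_MSS		88U

/* Assumption: an arbitrary upper bound chosen for this sketch. */
#define DEMO_MSS_UPPER_BOUND		32767U

/* The "exact" floor proposed in the question: 20 + 12 + 8. */
unsigned int demo_option_floor(void)
{
	return DEMO_TCPOLEN_MD5SIG_ALIGNED + DEMO_TCPOLEN_TSTAMP_ALIGNED + 8;
}

/* The formula from the tcp.h comment: (60+60+8) - (20+20). */
unsigned int demo_min_mss_from_comment(void)
{
	return (60 + 60 + 8) - (20 + 20);
}

/* Hypothetical range check for a user-supplied MSS value. */
int demo_user_mss_ok(unsigned int val)
{
	return val >= DEMO_TCP_MIN_MSS && val <= DEMO_MSS_UPPER_BOUND;
}
```

With these definitions demo_option_floor() gives 40 and demo_min_mss_from_comment() gives 88, so a check against TCP_MIN_MSS rejects both the old minimum of 8 and the interim minimum of 64.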

^ permalink raw reply

* RE: [RFC PATCH] network: return errors if we know tcp_connect failed
From: Hua Zhong @ 2010-11-12 23:14 UTC (permalink / raw)
  To: 'Patrick McHardy'
  Cc: 'Eric Paris', netdev, linux-kernel, davem, kuznet, pekkas,
	jmorris, yoshfuji
In-Reply-To: <4CDCEE65.3060105@trash.net>

> On 11.11.2010 22:58, Hua Zhong wrote:
> >> Yes, I realize this is little different than if the
> >> SYN was dropped in the first network device, but it is different
> >> because we know what happened!  We know that connect() call failed
> >> and that there isn't anything coming back.
> >
> > I would argue that -j DROP should behave exactly as if the packet were
> > dropped in the network, while -j REJECT should signal the failure to
> > the application as soon as possible (which it doesn't seem to do).
> 
> It sends an ICMP error or TCP reset. Interpretation is up to TCP.

Huh? It's the OUTPUT chain we are talking about. There is no ICMP error or
TCP reset.


^ permalink raw reply

* Re: [PATCH] r8169: fix checksum broken
From: Francois Romieu @ 2010-11-12 23:13 UTC (permalink / raw)
  To: Shan Wei; +Cc: netdev@vger.kernel.org, David Miller
In-Reply-To: <20101112224746.GA6676@electric-eye.fr.zoreil.com>

Francois Romieu <romieu@fr.zoreil.com> :
[...]
> Which kind of device do you use : PCI-E 8168 / 810x or PCI 8169 ?

Wrong page. Forget it.

Acked-by: Francois Romieu <romieu@fr.zoreil.com>

-- 
Ueimor

^ permalink raw reply

* Re: possible kernel oops from user MSS
From: Min Zhang @ 2010-11-12 22:59 UTC (permalink / raw)
  To: netdev
In-Reply-To: <AANLkTin-gXceUQxKvQeP8Nc8oXZDJnyjoFUjYD5x_g_y@mail.gmail.com>

Regarding commit 7a1abd08d52fdeddb3e9a5a33f2f15cc6a5674d2 ("tcp:
Increase TCP_MAXSEG socket option minimum"): what is the reason for
the TCP_MAXSEG minimum being 64? Isn't the exact value 40, which is
TCPOLEN_MD5SIG_ALIGNED(20) + TCPOLEN_TSTAMP_ALIGNED(12) + 8?

Or is it better to use TCP_MIN_MSS from tcp.h:

/* Minimal accepted MSS. It is (60+60+8) - (20+20). */
#define TCP_MIN_MSS        88U

^ permalink raw reply

* Re: [PATCH] r8169: fix checksum broken
From: Francois Romieu @ 2010-11-12 22:47 UTC (permalink / raw)
  To: Shan Wei; +Cc: netdev@vger.kernel.org, David Miller
In-Reply-To: <4CDD13BD.7060109@cn.fujitsu.com>

Shan Wei <shanwei@cn.fujitsu.com> :
> If r8169 receives packets with an invalid sctp/igmp (not tcp, udp) checksum, r8169 sets skb->ip_summed
> to CHECKSUM_UNNECESSARY. This causes the upper protocol to skip checking the checksum field.

...

Which kind of device do you use : PCI-E 8168 / 810x or PCI 8169 ?

Have a nice night.

-- 
Ueimor

^ permalink raw reply

* Re: [PATCH 4/10] Fix leaking of kernel heap addresses in net/
From: David Miller @ 2010-11-12 22:46 UTC (permalink / raw)
  To: adobriyan; +Cc: shemminger, netdev
In-Reply-To: <20101112224026.GA21134@core2.telecom.by>

From: Alexey Dobriyan <adobriyan@gmail.com>
Date: Sat, 13 Nov 2010 00:40:27 +0200

> I think we should not expose kernel pointers in future interfaces,
> but leave existing ones alone.

That seems like the most reasonable and pragmatic position I've seen
thus far :-)

^ permalink raw reply

* Re: [PATCH 4/10] Fix leaking of kernel heap addresses in net/
From: Alexey Dobriyan @ 2010-11-12 22:40 UTC (permalink / raw)
  To: David Miller; +Cc: shemminger, netdev
In-Reply-To: <20101112.123738.179940542.davem@davemloft.net>

On Fri, Nov 12, 2010 at 12:37:38PM -0800, David Miller wrote:
> From: Alexey Dobriyan <adobriyan@gmail.com>
> Date: Fri, 12 Nov 2010 22:18:50 +0200
> 
> > On Fri, Nov 12, 2010 at 08:33:15AM -0800, Stephen Hemminger wrote:
> >> Also, the whole idea needs to be under a config option, so only
> >> the paranoid idiots turn it on.
> > 
> > Would be fun if something will break because ffff8800bcd498c0
> > will become something else. :-)
> 
> Actually, this is not even a joke.
> 
> Take a look at how we track what sockets a user wants dumped via
> the inet_diag netlink facility, the socket pointer is used as
> the identification cookie.

I think we should not expose kernel pointers in future interfaces,
but leave existing ones alone.

^ permalink raw reply

* [GIT] Networking
From: David Miller @ 2010-11-12 22:38 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) mac80211 fix for handling of a scan cancellation, from Brian Cavagnolo

2) Minor overflow in can-bcm, from Oliver Hartkopp

3) cxgb4vf bug fixes from Casey Leedom

4) X.25 facility parsing validation bug fixes from Dan Rosenberg

5) ATM solos bug fixes from David Woodhouse.

6) TCP does state changes on unlocked socket, oops.  Based upon
   a report by Alexey Kuznetsov.

7) Don't allow socket filters to read uninitialized memory.  Based
   upon a patch by Eric Dumazet.

8) TCP_MAXSEG, if set low enough, can result in a divide by zero or
   by a negative number (because we sometimes subtract certain TCP
   option sizes), bump minimum from 8 to 64.

9) MacBookAir3,1(2) bluetooth support, from Edgar Hucek.

10) ip_mc_drop_socket() does wrong refcounting, fix from Eric Dumazet.

11) VLAN regression fixes from Hao Zheng.

12) Link state reporting fix in virtio-net from Jason Wang.

13) ucc_geth bug fixes from Joakim Tjernlund.

14) queue_map can be used uninitialized in pktgen, from Junchang Wang.

15) axnet_cs resume fix from Ken Kawasaki.

16) Don't lose state of ipv6 addresses which will be kept across an
    ifdown/ifup cycle, fix and report from Lorenzo Colitti.

17) af_packet recvmsg() header size check is borked in the GSO case,
    fix from Mariusz Kozlowski.

18) dst cache build fix, missing linux/cache.h include, from Paul Mundt.

19) Fix from Rafael J. Wysocki for device_set_wakeup_enable() being called
    in gianfar driver from unsleepable context.

20) Fragment overlap check in ipv6 and netfilter/ipv6 is buggered, from
    Shan Wei.

21) Fix message size calculation in rtnetlink link messages, from Thomas
    Graf.

22) ax25/packet ->getsockopt() info leak to userland fix from Vasiliy
    Kulikov.

23) r8169 regression fixes from françois romieu

24) Packet scheduler basic classifier doesn't report stats, fix from
    Stephen Hemminger.

Please pull, thanks a lot!

The following changes since commit 5398a64c63a69a0ac33dbae458ea4aab0dc23f14:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6 (2010-11-08 10:55:29 -0800)

are available in the git repository at:

  master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

Ben Greear (2):
      docs: Add neigh/gc_thresh3 and route/max_size documentation.
      ipv6: Warn users if maximum number of routes is reached.

Brian Cavagnolo (1):
      mac80211: unset SDATA_STATE_OFFCHANNEL when cancelling a scan

Casey Leedom (6):
      cxgb4vf: don't implement trivial (and incorrect) ndo_select_queue()
      cxgb4vf: fix bug in Generic Receive Offload
      cxgb4vf: fix some errors in Gather List to skb conversion
      cxgb4vf: flesh out PCI Device ID Table ...
      cxgb4vf: Fail open if link_start() fails.
      cxgb4vf: add call to Firmware to reset VF State.

Christian Lamparter (1):
      carl9170: usbid table updates

Dan Rosenberg (1):
      x25: Prevent crashing when parsing bad X.25 facilities

Daniel Drake (1):
      libertas: terminate scan when stopping interface

David S. Miller (5):
      filter: make sure filters dont read uninitialized memory
      tcp: Increase TCP_MAXSEG socket option minimum.
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6
      Merge git://git.kernel.org/.../kaber/nf-2.6
      tcp: Don't change unlocked socket state in tcp_v4_err().

David Woodhouse (2):
      solos: Add 'Firmware' attribute for Traverse overall firmware version
      solos: Refuse to upgrade firmware with older FPGA. It doesn't work.

Dmitry Torokhov (1):
      NET: pktgen - fix compile warning

Edgar (gimli) Hucek (1):
      Bluetooth: Add MacBookAir3,1(2) support

Eric Dumazet (3):
      inet: fix ip_mc_drop_socket()
      net/dst: dst_dev_event() called after other notifiers
      net: avoid limits overflow

Eric Paris (1):
      netfilter: NF_HOOK_COND has wrong conditional

Felix Fietkau (2):
      cfg80211: fix a crash in dev lookup on dump commands
      ath9k: check old power mode before clearing cycle counters

Frank Blaschka (1):
      qeth: fix race condition during device startup

Guillaume Chazarain (2):
      skge: Remove tx queue stopping in skge_devinit()
      net: Detect and ignore netif_stop_queue() calls before register_netdev()

Gustavo F. Padovan (1):
      Bluetooth: fix endianness conversion in L2CAP

Haitao Zhang (1):
      ath9k_htc: Add support for device ID 3346

Hao Zheng (3):
      vlan: Add function to retrieve EtherType from vlan packets.
      bnx2x: Look inside vlan when determining checksum proto.
      ixgbe: Look inside vlan when determining offload protocol.

Jason Wang (1):
      virtio-net: init link state correctly

Joakim Tjernlund (2):
      ucc_geth: Do not bring the whole IF down when TX failure.
      ucc_geth: Fix deadlock

Johan Hedberg (1):
      Bluetooth: Fix non-SSP auth request for HIGH security level sockets

Junchang Wang (1):
      pktgen: correct uninitialized queue_map

Ken Kawasaki (1):
      axnet_cs: fix resume problem for some Ax88790 chip

Kulikov Vasiliy (1):
      net: tipc: fix information leak to userland

Linus Torvalds (1):
      libipw: fix proc entry removal

Lorenzo Colitti (1):
      ipv6: addrconf: don't remove address state on ifdown if the address is being kept

Luiz Augusto von Dentz (1):
      Bluetooth: fix not setting security level when creating a rfcomm session

Mariusz Kozlowski (1):
      net: Fix header size check for GSO case in recvmsg (af_packet)

Matthew Garrett (1):
      Bluetooth: Enable USB autosuspend by default on btusb

Oliver Hartkopp (1):
      can-bcm: fix minor heap overflow

Paul Mundt (1):
      net dst: need linux/cache.h for ____cacheline_aligned_in_smp.

Pavel Emelyanov (1):
      rds: Fix rds message leak in rds_message_map_pages

Rafael J. Wysocki (1):
      gianfar: Do not call device_set_wakeup_enable() under a spinlock

Rajkumar Manoharan (3):
      ath9k: Avoid HW opmode overridden on monitor mode changes
      ath9k_htc: Fix probe failure if CONFIG_USB_DEBUG enabled
      ath9k_hw: Fix memory leak on ath9k_hw_rf_alloc_ext_banks failure

Randy Dunlap (1):
      Bluetooth: fix hidp kconfig dependency warning

Shan Wei (2):
      ipv6: fix overlap check for fragments
      netfilter: ipv6: fix overlap check for fragments

Thomas Graf (1):
      rtnetlink: Fix message size calculation for link messages

Ursula Braun (1):
      qeth: remove dev_queue_xmit invocation

Vasanthakumar Thiagarajan (1):
      ath9k_hw: Fix AR9280 surprise removal during frequent idle on/off

Vasiliy Kulikov (2):
      net: ax25: fix information leak to userland
      net: packet: fix information leak to userland

Vivek Natarajan (1):
      ath9k: Fix a DMA latency issue for Intel Pinetrail platforms.

Wey-Yi Guy (1):
      iwlwifi: dont use pci_dev before it being assign

françois romieu (2):
      r8169: revert "Handle rxfifo errors on 8168 chips"
      r8169: fix sleeping while holding spinlock.

stephen hemminger (1):
      classifier: report statistics for basic classifier

steven miao (1):
      Bluetooth: fix unaligned access to l2cap conf data

 Documentation/networking/ip-sysctl.txt       |    9 ++
 drivers/atm/solos-attrlist.c                 |    1 +
 drivers/atm/solos-pci.c                      |    8 ++
 drivers/bluetooth/btusb.c                    |    5 +
 drivers/net/bnx2x/bnx2x_cmn.c                |    2 +-
 drivers/net/cxgb4vf/cxgb4vf_main.c           |   42 ++++++---
 drivers/net/cxgb4vf/sge.c                    |  122 ++++++++++++++++----------
 drivers/net/cxgb4vf/t4vf_common.h            |    1 +
 drivers/net/cxgb4vf/t4vf_hw.c                |   19 ++++
 drivers/net/gianfar_ethtool.c                |    5 +-
 drivers/net/ixgbe/ixgbe_main.c               |   60 +++++++------
 drivers/net/pcmcia/axnet_cs.c                |   30 ++++--
 drivers/net/r8169.c                          |    9 +-
 drivers/net/skge.c                           |    1 -
 drivers/net/ucc_geth.c                       |   25 ++++--
 drivers/net/virtio_net.c                     |   12 ++-
 drivers/net/wireless/ath/ath9k/ar9002_hw.c   |    3 +
 drivers/net/wireless/ath/ath9k/ath9k.h       |    1 +
 drivers/net/wireless/ath/ath9k/hif_usb.c     |   31 +++----
 drivers/net/wireless/ath/ath9k/hw.c          |   15 +++-
 drivers/net/wireless/ath/ath9k/hw.h          |    1 +
 drivers/net/wireless/ath/ath9k/init.c        |    8 ++
 drivers/net/wireless/ath/ath9k/main.c        |   29 +++++--
 drivers/net/wireless/ath/ath9k/recv.c        |    4 +-
 drivers/net/wireless/ath/ath9k/reg.h         |    1 +
 drivers/net/wireless/ath/carl9170/usb.c      |    4 +-
 drivers/net/wireless/ipw2x00/libipw_module.c |    9 +-
 drivers/net/wireless/iwlwifi/iwl3945-base.c  |    3 +-
 drivers/net/wireless/libertas/cfg.c          |    5 +-
 drivers/net/wireless/libertas/dev.h          |    1 +
 drivers/net/wireless/libertas/main.c         |    7 ++
 drivers/s390/net/qeth_core.h                 |    9 --
 drivers/s390/net/qeth_core_main.c            |   55 ++----------
 include/linux/if_vlan.h                      |   25 +++++
 include/linux/netdevice.h                    |    5 +
 include/linux/netfilter.h                    |    2 +-
 include/net/dn.h                             |    2 +-
 include/net/dst_ops.h                        |    1 +
 include/net/sock.h                           |    4 +-
 include/net/tcp.h                            |    6 +-
 include/net/udp.h                            |    4 +-
 net/ax25/af_ax25.c                           |    2 +-
 net/bluetooth/hci_event.c                    |    6 ++
 net/bluetooth/hidp/Kconfig                   |    2 +-
 net/bluetooth/l2cap.c                        |    8 +-
 net/bluetooth/rfcomm/core.c                  |   13 ++-
 net/can/bcm.c                                |    2 +-
 net/core/dst.c                               |    1 +
 net/core/filter.c                            |   64 ++++++++------
 net/core/pktgen.c                            |    6 +-
 net/core/rtnetlink.c                         |    9 +-
 net/core/sock.c                              |   14 ++--
 net/decnet/af_decnet.c                       |    2 +-
 net/decnet/sysctl_net_decnet.c               |    4 +-
 net/ipv4/igmp.c                              |    4 +-
 net/ipv4/proc.c                              |    8 +-
 net/ipv4/sysctl_net_ipv4.c                   |    5 +-
 net/ipv4/tcp.c                               |    6 +-
 net/ipv4/tcp_input.c                         |   11 ++-
 net/ipv4/tcp_ipv4.c                          |    8 +-
 net/ipv4/udp.c                               |    4 +-
 net/ipv6/addrconf.c                          |   24 ++----
 net/ipv6/netfilter/nf_conntrack_reasm.c      |    2 +-
 net/ipv6/reassembly.c                        |    2 +-
 net/ipv6/route.c                             |    6 +-
 net/mac80211/iface.c                         |    6 +-
 net/packet/af_packet.c                       |    7 +-
 net/rds/message.c                            |    4 +-
 net/sched/cls_basic.c                        |    4 +
 net/sctp/protocol.c                          |    2 +-
 net/sctp/socket.c                            |    4 +-
 net/sctp/sysctl.c                            |    4 +-
 net/tipc/socket.c                            |    1 +
 net/wireless/nl80211.c                       |    4 +-
 net/x25/x25_facilities.c                     |   12 ++-
 75 files changed, 512 insertions(+), 335 deletions(-)

^ permalink raw reply

* Re: can-bcm: fix minor heap overflow
From: David Miller @ 2010-11-12 22:07 UTC (permalink / raw)
  To: socketcan; +Cc: netdev, drosenberg, torvalds, urs, security
In-Reply-To: <4CDB1856.4040001@hartkopp.net>

From: Oliver Hartkopp <socketcan@hartkopp.net>
Date: Wed, 10 Nov 2010 23:10:30 +0100

> On 64-bit platforms the ASCII representation of a pointer may be up to 17
> bytes long. This patch increases the length of the buffer accordingly.
> 
> http://marc.info/?l=linux-netdev&m=128872251418192&w=2
> 
> Reported-by: Dan Rosenberg <drosenberg@vsecurity.com>
> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
> CC: Linus Torvalds <torvalds@linux-foundation.org>

Patch applied, thanks.
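
The 17-byte figure in the quoted changelog follows from 16 hex digits plus the terminating NUL for an 8-byte pointer. A minimal userspace sketch of that count, assuming the pointer is printed as bare zero-padded hex with no "0x" prefix (the demo_ helpers are hypothetical and are not the can-bcm code):

```c
#include <stdint.h>
#include <stdio.h>

/* Bytes needed for a pointer-sized value as zero-padded hex plus NUL. */
size_t demo_hex_ptr_buf_size(size_t ptr_bytes)
{
	return ptr_bytes * 2 + 1;
}

/*
 * Format a 64-bit value as 16 hex digits. Returns the length snprintf
 * wanted to write (excluding the NUL), so the caller can compare it
 * against len and detect truncation instead of overflowing the buffer.
 */
int demo_format_ptr64(char *buf, size_t len, uint64_t value)
{
	return snprintf(buf, len, "%016llx", (unsigned long long)value);
}
```

For example, demo_hex_ptr_buf_size(sizeof(void *)) is the buffer size to allocate, and a return value from demo_format_ptr64() that is greater than or equal to len means the output was cut short.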

^ permalink raw reply

* Re: [PATCH] gianfar: Do not call device_set_wakeup_enable() under a spinlock
From: David Miller @ 2010-11-12 22:06 UTC (permalink / raw)
  To: rjw; +Cc: daniel.blueman, romieu, linux-kernel, netdev
In-Reply-To: <201011092254.19550.rjw@sisk.pl>

From: "Rafael J. Wysocki" <rjw@sisk.pl>
Date: Tue, 9 Nov 2010 22:54:19 +0100

> The gianfar driver calls device_set_wakeup_enable() under a spinlock,
> which causes a problem to happen after the recent core power
> management changes, because this function can sleep now.  Fix this
> by moving the device_set_wakeup_enable() call out of the
> spinlock-protected area.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

Patch applied, thank you.
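
The pattern described in the quoted changelog, keeping a call that may sleep outside the lock-protected region, can be sketched with a toy model. Everything here is hypothetical: demo_lock only records whether it is held, and demo_set_wakeup_enable() stands in for a call that may sleep. This is not the gianfar patch itself.

```c
#include <stdbool.h>

/* Toy lock that only tracks whether it is held; not a real spinlock. */
struct demo_lock {
	bool held;
};

struct demo_dev {
	struct demo_lock lock;
	bool wol_en;		/* state protected by the lock */
	bool wakeup_enabled;	/* updated by the call that may sleep */
	int sleep_under_lock;	/* counts calls made with the lock held */
};

void demo_lock_acquire(struct demo_lock *l)
{
	l->held = true;
}

void demo_lock_release(struct demo_lock *l)
{
	l->held = false;
}

/* Stand-in for a may-sleep call; records a violation if the lock is held. */
void demo_set_wakeup_enable(struct demo_dev *d, bool enable)
{
	if (d->lock.held)
		d->sleep_under_lock++;
	d->wakeup_enabled = enable;
}

/*
 * Update the protected state under the lock, then make the may-sleep
 * call only after the lock has been released. Returns the number of
 * violations seen so far (0 when the ordering is correct).
 */
int demo_set_wol(struct demo_dev *d, bool enable)
{
	demo_lock_acquire(&d->lock);
	d->wol_en = enable;
	demo_lock_release(&d->lock);

	demo_set_wakeup_enable(d, enable);
	return d->sleep_under_lock;
}
```

Moving the demo_set_wakeup_enable() call between acquire and release would make demo_set_wol() return a non-zero count, which is the situation the patch avoids.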

^ permalink raw reply

* Re: [PATCH v2 2/2] ipv6: Warn users if maximum number of routes is reached.
From: David Miller @ 2010-11-12 22:04 UTC (permalink / raw)
  To: greearb; +Cc: netdev
In-Reply-To: <1289255628-9596-1-git-send-email-greearb@candelatech.com>

From: Ben Greear <greearb@candelatech.com>
Date: Mon,  8 Nov 2010 14:33:48 -0800

> This gives users at least some clue as to what the problem
> might be and how to go about fixing it.
> 
> Signed-off-by: Ben Greear <greearb@candelatech.com>

Applied.

^ permalink raw reply

* Re: [PATCH 1/2] docs: Add neigh/gc_thresh3 and route/max_size documentation.
From: David Miller @ 2010-11-12 22:04 UTC (permalink / raw)
  To: greearb; +Cc: netdev
In-Reply-To: <1289243629-20789-1-git-send-email-greearb@candelatech.com>

From: Ben Greear <greearb@candelatech.com>
Date: Mon,  8 Nov 2010 11:13:48 -0800

> Signed-off-by: Ben Greear <greearb@candelatech.com>

Applied.

> +	with large numbers of directly-connected peers. 

This line has trailing whitespace, please avoid that in the
future so I don't have to correct your patches by hand.

Thanks.

^ permalink raw reply

* Re: [PATCH kernel 2.6.37-rc1] axnet_cs: fix resume problem for some Ax88790 chip
From: David Miller @ 2010-11-12 22:01 UTC (permalink / raw)
  To: ken_kawasaki; +Cc: netdev
In-Reply-To: <20101107001124.7d8ef6c4.ken_kawasaki@spring.nifty.jp>

From: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>
Date: Sun, 7 Nov 2010 00:11:24 +0900

> 
> axnet_cs:
>     Some Ax88790 chips need to reinitialize the CISREG_CCSR register
>     after resume.
> 
> Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>

Applied, thank you.

^ permalink raw reply

