Netdev List


* [PATCH 1/1] net: rtnetlink.h -- only include linux/netdevice.h when used by the kernel
From: Andy Whitcroft @ 2010-11-15 16:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet
  Cc: netdev, linux-kernel, Andy Whitcroft, Tim Gardner
In-Reply-To: <1289836919-19153-1-git-send-email-apw@canonical.com>

The commit below added a new helper dev_ingress_queue to cleanly obtain the
ingress queue pointer.  This necessitated including 'linux/netdevice.h':

  commit 24824a09e35402b8d58dcc5be803a5ad3937bdba
  Author: Eric Dumazet <eric.dumazet@gmail.com>
  Date:   Sat Oct 2 06:11:55 2010 +0000

    net: dynamic ingress_queue allocation

However, this include causes problems for userspace applications that
use the rtnetlink interfaces.  These commonly include both 'net/if.h'
and 'linux/rtnetlink.h', which leads to compiler errors such as:

  In file included from /usr/include/linux/netdevice.h:28:0,
                   from /usr/include/linux/rtnetlink.h:9,
                   from t.c:2:
  /usr/include/linux/if.h:135:8: error: redefinition of ‘struct ifmap’
  /usr/include/net/if.h:112:8: note: originally defined here
  /usr/include/linux/if.h:169:8: error: redefinition of ‘struct ifreq’
  /usr/include/net/if.h:127:8: note: originally defined here
  /usr/include/linux/if.h:218:8: error: redefinition of ‘struct ifconf’
  /usr/include/net/if.h:177:8: note: originally defined here

The new helper is only defined for the kernel and is protected by
__KERNEL__, so we can simply move the include down into the same
protected section.
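
For illustration only, a self-contained sketch of the resulting header
layout.  It does not include the real headers; every demo_* name is
made up, and the comments mark where the real include would sit.  The
point is just that a build without __KERNEL__ never reaches the
kernel-only section, so it cannot see a second definition of the
net/if.h structures.

```c
/*
 * Self-contained sketch of the header layout; it does not include the
 * real headers.  Every demo_* name is hypothetical.
 */

/* Stand-in for a structure that libc's <net/if.h> already defines. */
struct demo_ifmap {
	unsigned long mem_start;
	unsigned long mem_end;
};

/* ---- stand-in for the exported part of linux/rtnetlink.h ---- */
struct demo_rtmsg {
	unsigned char rtm_family;
	unsigned char rtm_type;
};

#ifdef __KERNEL__
/*
 * ---- kernel-only part ----
 * With the patch applied, the netdevice.h include sits here.  A
 * userspace build is compiled without __KERNEL__, so the preprocessor
 * drops this section and never reaches the headers that would define
 * the net/if.h structures a second time.
 */
#define DEMO_KERNEL_SECTION 1
#else
#define DEMO_KERNEL_SECTION 0
#endif

/* Returns 1 if the kernel-only section was compiled in, 0 otherwise. */
int demo_kernel_section_visible(void)
{
	return DEMO_KERNEL_SECTION;
}
```

Built as an ordinary userspace program (no -D__KERNEL__),
demo_kernel_section_visible() returns 0 and the only definition of the
demo structure is the "libc" one.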

Signed-off-by: Andy Whitcroft <apw@canonical.com>
---
 include/linux/rtnetlink.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index d42f274..bbad657 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -6,7 +6,6 @@
 #include <linux/if_link.h>
 #include <linux/if_addr.h>
 #include <linux/neighbour.h>
-#include <linux/netdevice.h>

 /* rtnetlink families. Values up to 127 are reserved for real address
  * families, values above 128 may be used arbitrarily.
@@ -606,6 +605,7 @@ struct tcamsg {
 #ifdef __KERNEL__

 #include <linux/mutex.h>
+#include <linux/netdevice.h>

 static __inline__ int rtattr_strcmp(const struct rtattr *rta, const char *str)
 {
-- 
1.7.0.4

^ permalink raw reply related

* [PATCH 0/1] rtnetlink/netdevice include triggers userspace compiler errors
From: Andy Whitcroft @ 2010-11-15 16:01 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet
  Cc: netdev, linux-kernel, Andy Whitcroft, Tim Gardner

We have seen a number of reports of userspace applications (including
eglibc) which fail to compile when trying to use linux/rtnetlink.h.
It appears that a new helper function has necessitated the inclusion of
linux/netdevice.h, which in turn collides with the libc userspace
header net/if.h.

It appears that this header is not required for the userspace-exported
components of rtnetlink.h.  Following this email is a patch to pull this
include down into the kernel-specific section of this header.  It seems
to both fix this issue for userspace and still compile correctly for
kernel use.

Against v2.6.37-rc1.

-apw

Andy Whitcroft (1):
  net: rtnetlink.h -- only include linux/netdevice.h when used by the
    kernel

 include/linux/rtnetlink.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

^ permalink raw reply

* Re: [RFC PATCH] network: return errors if we know tcp_connect failed
From: Patrick McHardy @ 2010-11-15 16:04 UTC (permalink / raw)
  To: Eric Paris
  Cc: Hua Zhong, netdev, linux-kernel, davem, kuznet, pekkas, jmorris,
	yoshfuji
In-Reply-To: <4CE15885.90003@trash.net>

On 15.11.2010 16:57, Patrick McHardy wrote:
> On 15.11.2010 16:47, Eric Paris wrote:
>>> iptables -A OUTPUT -p tcp -j REJECT --reject-with tcp-reset
>>>
>>> The second one will cause a hard error for the connection.
>>
>> Well I'm (I guess?) surprised that the --reject-with icmp doesn't do
>> anything with a local outgoing connection but --reject-with tcp-reset
>> does something like what I'm looking for.
>>
>> I notice the heavy lifting for this is done in 
>> net/ipv4/netfilter/ipt_REJECT.c::send_reset()
>> (and something very similar for IPv6)
>>
>> I really don't want to duplicate that code into SELinux (for obvious
>> reasons) and I'm wondering if anyone has objections to me making it
>> available outside of netlink and/or suggestions on how to make that code
>> available outside of netfilter (aka what header to expose it, and does
>> it still make logical sense in ipt_REJECT.c or somewhere else?)
> 
> I don't think having SELinux sending packets to handle local
> connections is a very elegant design, it's not a firewall after
> all. What's wrong with reacting only to specific errno codes
> in tcp_connect()? You could f.i. return -ECONNREFUSED from
> SELinux, that one is pretty much guaranteed not to occur in
> the network stack itself and can be returned directly.

One more note: there is also the problem that the RST might never
reach the socket, f.i. because netfilter drops it, or TC actions
reroute it etc. With netfilter, users are expected to make sure the
entire combination of network features does what they expect, but
that's probably not what you want for SELinux.

> That would need minor changes to nf_hook_slow so we can
> encode errno values in the upper 16 bits of the verdict,
> as we already do with the queue number. The added benefit
> is that we don't have to return EPERM anymore when f.i.
> rerouting fails.

^ permalink raw reply

* Re: [PATCH 0/5] bridge: RCU annotation and cleanup
From: Stephen Hemminger @ 2010-11-15 16:19 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: davem, eric.dumazet, netdev, bridge
In-Reply-To: <201011152123.HHB21896.HFOOVSMFOtLJQF@I-love.SAKURA.ne.jp>

On Mon, 15 Nov 2010 21:23:37 +0900
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:

> > +			rp = rcu_dereference(hlist_next_rcu(rp->next));  
> 
> I think this one is hlist_next_rcu(rp).

Yes, you are correct.
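
For anyone following along, a rough userspace sketch of why the
argument has to be the node itself: hlist_next_rcu() already reads
->next of the node it is given, so passing rp->next would step two
nodes ahead (and dereference NULL at the tail).  The macro below only
approximates the kernel one (no __rcu annotation, no barriers), and
demo_rcu_dereference() and demo_count_nodes() are made-up names.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's struct hlist_node. */
struct hlist_node {
	struct hlist_node *next;
	struct hlist_node **pprev;
};

/*
 * Userspace approximation of hlist_next_rcu(): it takes a node and
 * yields that node's next pointer.  The kernel version additionally
 * carries the __rcu annotation, which is dropped here.
 */
#define hlist_next_rcu(node) ((node)->next)

/* Plain load standing in for rcu_dereference(); no barriers here. */
#define demo_rcu_dereference(p) (p)

/*
 * Walks the list by stepping with hlist_next_rcu(rp) and returns the
 * number of nodes visited.  Stepping with hlist_next_rcu(rp->next)
 * instead would read rp->next->next and skip every other node.
 */
int demo_count_nodes(struct hlist_node *first)
{
	struct hlist_node *rp;
	int n = 0;

	for (rp = first; rp; rp = demo_rcu_dereference(hlist_next_rcu(rp)))
		n++;
	return n;
}
```

With a three-node list the walk visits all three nodes, and an empty
list gives zero.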

-- 

^ permalink raw reply

* Re: [PATCH 14/44] drivers/net/ixgbe: Remove unnecessary semicolons
From: Rose, Gregory V @ 2010-11-15 16:24 UTC (permalink / raw)
  To: Joe Perches, Jiri Kosina
  Cc: e1000-devel@lists.sourceforge.net, Allan, Bruce W,
	Brandeburg, Jesse, linux-kernel@vger.kernel.org, Ronciak, John,
	Kirsher, Jeffrey T, netdev@vger.kernel.org
In-Reply-To: <7d2c334daa75c5221946a17d45c9de1901cf06e7.1289789604.git.joe@perches.com>

> -----Original Message-----
> From: Joe Perches [mailto:joe@perches.com]
> Sent: Sunday, November 14, 2010 7:05 PM
> To: Jiri Kosina
> Cc: Kirsher, Jeffrey T; Brandeburg, Jesse; Allan, Bruce W; Wyborny,
> Carolyn; Skidmore, Donald C; Rose, Gregory V; Waskiewicz Jr, Peter P;
> Duyck, Alexander H; Ronciak, John; e1000-devel@lists.sourceforge.net;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [PATCH 14/44] drivers/net/ixgbe: Remove unnecessary semicolons
> 
> Signed-off-by: Joe Perches <joe@perches.com>
> ---
>  drivers/net/ixgbe/ixgbe_sriov.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_sriov.c
> b/drivers/net/ixgbe/ixgbe_sriov.c
> index 5428153..93f40bc 100644
> --- a/drivers/net/ixgbe/ixgbe_sriov.c
> +++ b/drivers/net/ixgbe/ixgbe_sriov.c
> @@ -68,7 +68,7 @@ static int ixgbe_set_vf_multicasts(struct ixgbe_adapter
> *adapter,
>  	 * addresses
>  	 */
>  	for (i = 0; i < entries; i++) {
> -		vfinfo->vf_mc_hashes[i] = hash_list[i];;
> +		vfinfo->vf_mc_hashes[i] = hash_list[i];
>  	}
> 
>  	for (i = 0; i < vfinfo->num_vf_mc_hashes; i++) {
> --
> 1.7.3.1.g432b3.dirty

Acked-by: Greg Rose <Gregory.v.rose@intel.com>



^ permalink raw reply

* Re: [RFC PATCH] network: return errors if we know tcp_connect failed
From: Patrick McHardy @ 2010-11-15 16:36 UTC (permalink / raw)
  To: Eric Paris
  Cc: Hua Zhong, netdev, linux-kernel, davem, kuznet, pekkas, jmorris,
	yoshfuji, Netfilter Development Mailinglist
In-Reply-To: <4CE15885.90003@trash.net>

[-- Attachment #1: Type: text/plain, Size: 1599 bytes --]

On 15.11.2010 16:57, Patrick McHardy wrote:
> On 15.11.2010 16:47, Eric Paris wrote:
>> I notice the heavy lifting for this is done in 
>> net/ipv4/netfilter/ipt_REJECT.c::send_reset()
>> (and something very similar for IPv6)
>>
>> I really don't want to duplicate that code into SELinux (for obvious
>> reasons) and I'm wondering if anyone has objections to me making it
>> available outside of netlink and/or suggestions on how to make that code
>> available outside of netfilter (aka what header to expose it, and does
>> it still make logical sense in ipt_REJECT.c or somewhere else?)
> 
> I don't think having SELinux sending packets to handle local
> connections is a very elegant design, it's not a firewall after
> all. What's wrong with reacting only to specific errno codes
> in tcp_connect()? You could f.i. return -ECONNREFUSED from
> SELinux, that one is pretty much guaranteed not to occur in
> the network stack itself and can be returned directly.
> 
> That would need minor changes to nf_hook_slow so we can
> encode errno values in the upper 16 bits of the verdict,
> as we already do with the queue number. The added benefit
> is that we don't have to return EPERM anymore when f.i.
> rerouting fails.

Patch for demonstration purposes attached. I've modified the
MARK target so it returns NF_DROP with an errno code of
-ECONNREFUSED:

# iptables -A OUTPUT -d 1.2.3.4 -j MARK --set-mark 1
# ping 1.2.3.4
PING 1.2.3.4 (1.2.3.4) 56(84) bytes of data.
ping: sendmsg: Connection refused
# telnet 1.2.3.4
Trying 1.2.3.4...
telnet: Unable to connect to remote host: Connection refused
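
The verdict encoding itself can be tried out in userspace.  The sketch
below is not the attached patch: the DEMO_* constants are local copies
that I am assuming match include/linux/netfilter.h (NF_DROP == 0,
16 verdict bits), and the demo_* helpers are made-up names.  It packs
an errno into the upper 16 bits of a drop verdict and unpacks it again,
falling back to -EPERM when nothing was encoded.

```c
#include <errno.h>

/* Local copies of the netfilter constants (assumed values). */
#define DEMO_NF_DROP		0u
#define DEMO_NF_VERDICT_MASK	0x0000ffffu
#define DEMO_NF_VERDICT_BITS	16

/* Pack a negative errno into the upper 16 bits of a drop verdict. */
unsigned int demo_drop_err(int neg_errno)
{
	return ((unsigned int)(-neg_errno) << DEMO_NF_VERDICT_BITS) |
	       DEMO_NF_DROP;
}

/* True if the low 16 bits of the verdict say "drop". */
int demo_is_drop(unsigned int verdict)
{
	return (verdict & DEMO_NF_VERDICT_MASK) == DEMO_NF_DROP;
}

/*
 * Unpack the errno from a drop verdict.  A plain NF_DROP carries no
 * errno in the upper bits, so it keeps the old behaviour of -EPERM.
 */
int demo_verdict_to_ret(unsigned int verdict)
{
	int ret = -(int)(verdict >> DEMO_NF_VERDICT_BITS);

	return ret ? ret : -EPERM;
}
```

A verdict built with demo_drop_err(-ECONNREFUSED) still tests as a drop
and decodes back to -ECONNREFUSED, while a bare drop decodes to -EPERM.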




[-- Attachment #2: x --]
[-- Type: text/plain, Size: 2252 bytes --]

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 89341c3..ef2af8f 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -33,6 +33,8 @@
 
 #define NF_QUEUE_NR(x) ((((x) << NF_VERDICT_BITS) & NF_VERDICT_QMASK) | NF_QUEUE)
 
+#define NF_DROP_ERR(x) (((-(x)) << NF_VERDICT_BITS) | NF_DROP)
+
 /* only for userspace compatibility */
 #ifndef __KERNEL__
 /* Generic cache responses from hook functions.
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 05b1ecf..bb8f547 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2592,6 +2592,7 @@ int tcp_connect(struct sock *sk)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *buff;
+	int err;
 
 	tcp_connect_init(sk);
 
@@ -2614,7 +2615,9 @@ int tcp_connect(struct sock *sk)
 	sk->sk_wmem_queued += buff->truesize;
 	sk_mem_charge(sk, buff->truesize);
 	tp->packets_out += tcp_skb_pcount(buff);
-	tcp_transmit_skb(sk, buff, 1, sk->sk_allocation);
+	err = tcp_transmit_skb(sk, buff, 1, sk->sk_allocation);
+	if (err == -ECONNREFUSED)
+		return err;
 
 	/* We change tp->snd_nxt after the tcp_transmit_skb() call
 	 * in order to make this packet get counted in tcpOutSegs.
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 85dabb8..32fcbe2 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -173,9 +173,11 @@ next_hook:
 			     outdev, &elem, okfn, hook_thresh);
 	if (verdict == NF_ACCEPT || verdict == NF_STOP) {
 		ret = 1;
-	} else if (verdict == NF_DROP) {
+	} else if ((verdict & NF_VERDICT_MASK) == NF_DROP) {
 		kfree_skb(skb);
-		ret = -EPERM;
+		ret = -(verdict >> NF_VERDICT_BITS);
+		if (ret == 0)
+			ret = -EPERM;
 	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
 		if (!nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
 			      verdict >> NF_VERDICT_BITS))
diff --git a/net/netfilter/xt_mark.c b/net/netfilter/xt_mark.c
index 2334523..185330c 100644
--- a/net/netfilter/xt_mark.c
+++ b/net/netfilter/xt_mark.c
@@ -30,7 +30,7 @@ mark_tg(struct sk_buff *skb, const struct xt_action_param *par)
 	const struct xt_mark_tginfo2 *info = par->targinfo;
 
 	skb->mark = (skb->mark & ~info->mask) ^ info->mark;
-	return XT_CONTINUE;
+	return NF_DROP_ERR(-ECONNREFUSED);
 }
 
 static bool


^ permalink raw reply related

* [PATCH 3/5] netdev: add rcu annotations to receive handler hook
From: Stephen Hemminger @ 2010-11-15 16:38 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20101115163809.215552143@vyatta.com>

[-- Attachment #1: rx_handler_rcu.patch --]
[-- Type: text/plain, Size: 595 bytes --]

Suggested by Eric's bridge RCU changes.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 include/linux/netdevice.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/include/linux/netdevice.h	2010-11-14 11:41:53.224298362 -0800
+++ b/include/linux/netdevice.h	2010-11-14 11:42:42.546359900 -0800
@@ -995,8 +995,8 @@ struct net_device {
 	unsigned int		real_num_rx_queues;
 #endif
 
-	rx_handler_func_t	*rx_handler;
-	void			*rx_handler_data;
+	rx_handler_func_t __rcu	*rx_handler;
+	void __rcu		*rx_handler_data;
 
 	struct netdev_queue __rcu *ingress_queue;
 



^ permalink raw reply

* [PATCH 4/5] bridge: fix RCU races with bridge port
From: Stephen Hemminger @ 2010-11-15 16:38 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20101115163809.215552143@vyatta.com>

[-- Attachment #1: br-port-rcu-race.patch --]
[-- Type: text/plain, Size: 6605 bytes --]

The macro br_port_exists() is not enough protection when only
RCU is being used. There is a tiny race where another CPU has cleared
the port handler hook, but the bridge port flag might still be set.
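
A single-threaded userspace sketch of the lookup pattern this moves
to.  It cannot reproduce the race itself; it only shows that callers
get one pointer back and test that, instead of testing the flag and
then reading the pointer separately.  All demo_* names are made up,
and the plain load stands in for rcu_dereference().

```c
#include <stddef.h>

#define DEMO_IFF_BRIDGE_PORT 0x1u

struct demo_port {
	int id;
};

struct demo_dev {
	unsigned int priv_flags;
	struct demo_port *rx_handler_data;
};

/*
 * Read the pointer once and hand it back only if the flag is set.
 * If the hook was already cleared while the flag is still set, the
 * caller simply sees NULL.
 */
struct demo_port *demo_port_get(const struct demo_dev *dev)
{
	/* plain load; the kernel code uses rcu_dereference() here */
	struct demo_port *port = dev->rx_handler_data;

	return (dev->priv_flags & DEMO_IFF_BRIDGE_PORT) ? port : NULL;
}

/* Caller pattern: one lookup, then a NULL check on the result. */
int demo_port_id(const struct demo_dev *dev)
{
	const struct demo_port *p = demo_port_get(dev);

	return p ? p->id : -1;
}
```

The interesting state is "flag set, pointer already NULL": the old
check-the-flag-then-dereference sequence would have followed a NULL
pointer there, while demo_port_id() just reports -1.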


Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 net/bridge/br_fdb.c             |   15 +++++++++------
 net/bridge/br_if.c              |    5 +----
 net/bridge/br_netfilter.c       |   13 +++++++------
 net/bridge/br_netlink.c         |   10 ++++++----
 net/bridge/br_notify.c          |    2 +-
 net/bridge/br_private.h         |   16 +++++++++++++---
 net/bridge/br_stp_bpdu.c        |    8 ++++----
 net/bridge/netfilter/ebtables.c |   11 +++++------
 8 files changed, 46 insertions(+), 34 deletions(-)

--- a/net/bridge/br_netfilter.c	2010-11-15 08:24:41.696606136 -0800
+++ b/net/bridge/br_netfilter.c	2010-11-15 08:25:50.037644994 -0800
@@ -131,17 +131,18 @@ void br_netfilter_rtable_init(struct net
 
 static inline struct rtable *bridge_parent_rtable(const struct net_device *dev)
 {
-	if (!br_port_exists(dev))
-		return NULL;
-	return &br_port_get_rcu(dev)->br->fake_rtable;
+	struct net_bridge_port *port;
+
+	port = br_port_get_rcu(dev);
+	return port ? &port->br->fake_rtable : NULL;
 }
 
 static inline struct net_device *bridge_parent(const struct net_device *dev)
 {
-	if (!br_port_exists(dev))
-		return NULL;
+	struct net_bridge_port *port;
 
-	return br_port_get_rcu(dev)->br->dev;
+	port = br_port_get_rcu(dev);
+	return port ? port->br->dev : NULL;
 }
 
 static inline struct nf_bridge_info *nf_bridge_alloc(struct sk_buff *skb)
--- a/net/bridge/br_stp_bpdu.c	2010-11-15 08:24:41.696606136 -0800
+++ b/net/bridge/br_stp_bpdu.c	2010-11-15 08:25:50.037644994 -0800
@@ -141,10 +141,6 @@ void br_stp_rcv(const struct stp_proto *
 	struct net_bridge *br;
 	const unsigned char *buf;
 
-	if (!br_port_exists(dev))
-		goto err;
-	p = br_port_get_rcu(dev);
-
 	if (!pskb_may_pull(skb, 4))
 		goto err;
 
@@ -153,6 +149,10 @@ void br_stp_rcv(const struct stp_proto *
 	if (buf[0] != 0 || buf[1] != 0 || buf[2] != 0)
 		goto err;
 
+	p = br_port_get_rcu(dev);
+	if (!p)
+		goto err;
+
 	br = p->br;
 	spin_lock(&br->lock);
 
--- a/net/bridge/netfilter/ebtables.c	2010-11-15 08:24:41.696606136 -0800
+++ b/net/bridge/netfilter/ebtables.c	2010-11-15 08:25:50.041645485 -0800
@@ -128,6 +128,7 @@ ebt_basic_match(const struct ebt_entry *
                 const struct net_device *in, const struct net_device *out)
 {
 	const struct ethhdr *h = eth_hdr(skb);
+	const struct net_bridge_port *p;
 	__be16 ethproto;
 	int verdict, i;
 
@@ -148,13 +149,11 @@ ebt_basic_match(const struct ebt_entry *
 	if (FWINV2(ebt_dev_check(e->out, out), EBT_IOUT))
 		return 1;
 	/* rcu_read_lock()ed by nf_hook_slow */
-	if (in && br_port_exists(in) &&
-	    FWINV2(ebt_dev_check(e->logical_in, br_port_get_rcu(in)->br->dev),
-		   EBT_ILOGICALIN))
+	if (in && (p = br_port_get_rcu(in)) != NULL &&
+	    FWINV2(ebt_dev_check(e->logical_in, p->br->dev), EBT_ILOGICALIN))
 		return 1;
-	if (out && br_port_exists(out) &&
-	    FWINV2(ebt_dev_check(e->logical_out, br_port_get_rcu(out)->br->dev),
-		   EBT_ILOGICALOUT))
+	if (out && (p = br_port_get_rcu(out)) != NULL &&
+	    FWINV2(ebt_dev_check(e->logical_out, p->br->dev), EBT_ILOGICALOUT))
 		return 1;
 
 	if (e->bitmask & EBT_SOURCEMAC) {
--- a/net/bridge/br_fdb.c	2010-11-15 08:20:20.020385965 -0800
+++ b/net/bridge/br_fdb.c	2010-11-15 08:25:50.041645485 -0800
@@ -238,15 +238,18 @@ struct net_bridge_fdb_entry *__br_fdb_ge
 int br_fdb_test_addr(struct net_device *dev, unsigned char *addr)
 {
 	struct net_bridge_fdb_entry *fdb;
+	struct net_bridge_port *port;
 	int ret;
 
-	if (!br_port_exists(dev))
-		return 0;
-
 	rcu_read_lock();
-	fdb = __br_fdb_get(br_port_get_rcu(dev)->br, addr);
-	ret = fdb && fdb->dst->dev != dev &&
-		fdb->dst->state == BR_STATE_FORWARDING;
+	port = br_port_get_rcu(dev);
+	if (!port)
+		ret = 0;
+	else {
+		fdb = __br_fdb_get(port->br, addr);
+		ret = fdb && fdb->dst->dev != dev &&
+			fdb->dst->state == BR_STATE_FORWARDING;
+	}
 	rcu_read_unlock();
 
 	return ret;
--- a/net/bridge/br_notify.c	2010-11-15 08:24:41.696606136 -0800
+++ b/net/bridge/br_notify.c	2010-11-15 08:25:50.045645976 -0800
@@ -32,7 +32,7 @@ struct notifier_block br_device_notifier
 static int br_device_event(struct notifier_block *unused, unsigned long event, void *ptr)
 {
 	struct net_device *dev = ptr;
-	struct net_bridge_port *p = br_port_get(dev);
+	struct net_bridge_port *p;
 	struct net_bridge *br;
 	int err;
 
--- a/net/bridge/br_private.h	2010-11-15 08:24:47.829474102 -0800
+++ b/net/bridge/br_private.h	2010-11-15 08:27:46.747612478 -0800
@@ -151,11 +151,19 @@ struct net_bridge_port
 #endif
 };
 
-#define br_port_get_rcu(dev) \
-	((struct net_bridge_port *) rcu_dereference(dev->rx_handler_data))
-#define br_port_get(dev) ((struct net_bridge_port *) dev->rx_handler_data)
 #define br_port_exists(dev) (dev->priv_flags & IFF_BRIDGE_PORT)
 
+static inline struct net_bridge_port *br_port_get_rcu(const struct net_device *dev)
+{
+	struct net_bridge_port *port = rcu_dereference(dev->rx_handler_data);
+	return br_port_exists(dev) ? port : NULL;
+}
+
+static inline struct net_bridge_port *br_port_get(struct net_device *dev)
+{
+	return br_port_exists(dev) ? dev->rx_handler_data : NULL;
+}
+
 struct br_cpu_netstats {
 	u64			rx_packets;
 	u64			rx_bytes;
--- a/net/bridge/br_if.c	2010-11-15 08:20:20.084389475 -0800
+++ b/net/bridge/br_if.c	2010-11-15 08:25:50.049646467 -0800
@@ -475,11 +475,8 @@ int br_del_if(struct net_bridge *br, str
 {
 	struct net_bridge_port *p;
 
-	if (!br_port_exists(dev))
-		return -EINVAL;
-
 	p = br_port_get(dev);
-	if (p->br != br)
+	if (!p || p->br != br)
 		return -EINVAL;
 
 	del_nbp(p);
--- a/net/bridge/br_netlink.c	2010-11-15 08:24:41.696606136 -0800
+++ b/net/bridge/br_netlink.c	2010-11-15 08:25:50.049646467 -0800
@@ -119,11 +119,13 @@ static int br_dump_ifinfo(struct sk_buff
 
 	idx = 0;
 	for_each_netdev(net, dev) {
+		struct net_bridge_port *port = br_port_get(dev);
+
 		/* not a bridge port */
-		if (!br_port_exists(dev) || idx < cb->args[0])
+		if (!port || idx < cb->args[0])
 			goto skip;
 
-		if (br_fill_ifinfo(skb, br_port_get(dev),
+		if (br_fill_ifinfo(skb, port,
 				   NETLINK_CB(cb->skb).pid,
 				   cb->nlh->nlmsg_seq, RTM_NEWLINK,
 				   NLM_F_MULTI) < 0)
@@ -169,9 +171,9 @@ static int br_rtm_setlink(struct sk_buff
 	if (!dev)
 		return -ENODEV;
 
-	if (!br_port_exists(dev))
-		return -EINVAL;
 	p = br_port_get(dev);
+	if (!p)
+		return -EINVAL;
 
 	/* if kernel STP is running, don't allow changes */
 	if (p->br->stp_enabled == BR_KERNEL_STP)



^ permalink raw reply

* [PATCH 0/5] bridge RCU patches (rev2)
From: Stephen Hemminger @ 2010-11-15 16:38 UTC (permalink / raw)
  To: David Miller; +Cc: netdev



^ permalink raw reply

* [PATCH 5/5] bridge: add RCU annotations to bridge port lookup
From: Stephen Hemminger @ 2010-11-15 16:38 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Eric Dumazet
In-Reply-To: <20101115163809.215552143@vyatta.com>

[-- Attachment #1: bridge-port_get_rtnl.patch --]
[-- Type: text/plain, Size: 2300 bytes --]

From: Eric Dumazet <eric.dumazet@gmail.com>

br_port_get() renamed to br_port_get_rtnl() to make clear RTNL is held.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 net/bridge/br_if.c      |    2 +-
 net/bridge/br_netlink.c |    4 ++--
 net/bridge/br_notify.c  |    4 ++--
 net/bridge/br_private.h |    9 +++++----
 4 files changed, 10 insertions(+), 9 deletions(-)

--- a/net/bridge/br_netlink.c	2010-11-15 08:25:50.049646467 -0800
+++ b/net/bridge/br_netlink.c	2010-11-15 08:27:57.772866001 -0800
@@ -119,7 +119,7 @@ static int br_dump_ifinfo(struct sk_buff
 
 	idx = 0;
 	for_each_netdev(net, dev) {
-		struct net_bridge_port *port = br_port_get(dev);
+		struct net_bridge_port *port = br_port_get_rtnl(dev);
 
 		/* not a bridge port */
 		if (!port || idx < cb->args[0])
@@ -171,7 +171,7 @@ static int br_rtm_setlink(struct sk_buff
 	if (!dev)
 		return -ENODEV;
 
-	p = br_port_get(dev);
+	p = br_port_get_rtnl(dev);
 	if (!p)
 		return -EINVAL;
 
--- a/net/bridge/br_private.h	2010-11-15 08:27:46.747612478 -0800
+++ b/net/bridge/br_private.h	2010-11-15 08:27:57.776866451 -0800
@@ -159,9 +159,10 @@ static inline struct net_bridge_port *br
 	return br_port_exists(dev) ? port : NULL;
 }
 
-static inline struct net_bridge_port *br_port_get(struct net_device *dev)
+static inline struct net_bridge_port *br_port_get_rtnl(struct net_device *dev)
 {
-	return br_port_exists(dev) ? dev->rx_handler_data : NULL;
+	return br_port_exists(dev) ?
+		rtnl_dereference(dev->rx_handler_data) : NULL;
 }
 
 struct br_cpu_netstats {
--- a/net/bridge/br_if.c	2010-11-15 08:25:50.049646467 -0800
+++ b/net/bridge/br_if.c	2010-11-15 08:27:57.776866451 -0800
@@ -475,7 +475,7 @@ int br_del_if(struct net_bridge *br, str
 {
 	struct net_bridge_port *p;
 
-	p = br_port_get(dev);
+	p = br_port_get_rtnl(dev);
 	if (!p || p->br != br)
 		return -EINVAL;
 
--- a/net/bridge/br_notify.c	2010-11-15 08:25:50.045645976 -0800
+++ b/net/bridge/br_notify.c	2010-11-15 08:27:57.780866901 -0800
@@ -37,10 +37,10 @@ static int br_device_event(struct notifi
 	int err;
 
 	/* not a port of a bridge */
-	if (!br_port_exists(dev))
+	p = br_port_get_rtnl(dev);
+	if (!p)
 		return NOTIFY_DONE;
 
-	p = br_port_get(dev);
 	br = p->br;
 
 	switch (event) {



^ permalink raw reply

* [PATCH 1/5] bridge: add RCU annotation to bridge multicast table
From: Stephen Hemminger @ 2010-11-15 16:38 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Eric Dumazet
In-Reply-To: <20101115163809.215552143@vyatta.com>

[-- Attachment #1: bridge-mlock-rcu.patch --]
[-- Type: text/plain, Size: 10232 bytes --]

From: Eric Dumazet <eric.dumazet@gmail.com>

Add modern __rcu annotations to bridge multicast table.
Use newer hlist macros to avoid direct access to hlist internals.
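
As a rough userspace illustration of the update-side pattern (not the
kernel code): a dereference helper that checks the lock is held, used
in a pointer-to-pointer walk over the port group list.  All demo_*
names are made up, the lock is modelled as a plain flag, and assert()
stands in for the lockdep check done by rcu_dereference_protected().

```c
#include <assert.h>
#include <stddef.h>

struct demo_group {
	int port;
	struct demo_group *next;
};

struct demo_bridge {
	int lock_held;			/* models multicast_lock being held */
	struct demo_group *ports;
};

/*
 * Stand-in for mlock_dereference(): check that the caller holds the
 * lock, then do a plain load of the pointer.
 */
#define demo_mlock_dereference(X, br) (assert((br)->lock_held), (X))

/* Unlink the group belonging to 'port'; returns 1 if one was found. */
int demo_del_port(struct demo_bridge *br, int port)
{
	struct demo_group *p;
	struct demo_group **pp;

	for (pp = &br->ports;
	     (p = demo_mlock_dereference(*pp, br)) != NULL;
	     pp = &p->next) {
		if (p->port != port)
			continue;

		/* plain store here; RCU code would publish this update */
		*pp = p->next;
		return 1;
	}
	return 0;
}
```

Walking with pp lets the head and the middle of the list be unlinked
by the same store, which is why the loops in the patch keep that shape
and only wrap the load of *pp.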

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
v2. Fix hlist_next_rcu call

 net/bridge/br_forward.c   |    4 +-
 net/bridge/br_multicast.c |   78 ++++++++++++++++++++++++++++++----------------
 net/bridge/br_private.h   |    6 +--
 3 files changed, 56 insertions(+), 32 deletions(-)

--- a/net/bridge/br_multicast.c	2010-11-14 12:36:30.383348571 -0800
+++ b/net/bridge/br_multicast.c	2010-11-14 12:36:37.084167303 -0800
@@ -33,6 +33,9 @@
 
 #include "br_private.h"
 
+#define mlock_dereference(X, br) \
+	rcu_dereference_protected(X, lockdep_is_held(&br->multicast_lock))
+
 #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
 static inline int ipv6_is_local_multicast(const struct in6_addr *addr)
 {
@@ -135,7 +138,7 @@ static struct net_bridge_mdb_entry *br_m
 struct net_bridge_mdb_entry *br_mdb_get(struct net_bridge *br,
 					struct sk_buff *skb)
 {
-	struct net_bridge_mdb_htable *mdb = br->mdb;
+	struct net_bridge_mdb_htable *mdb = rcu_dereference(br->mdb);
 	struct br_ip ip;
 
 	if (br->multicast_disabled)
@@ -235,7 +238,8 @@ static void br_multicast_group_expired(u
 	if (mp->ports)
 		goto out;
 
-	mdb = br->mdb;
+	mdb = mlock_dereference(br->mdb, br);
+
 	hlist_del_rcu(&mp->hlist[mdb->ver]);
 	mdb->size--;
 
@@ -249,16 +253,20 @@ out:
 static void br_multicast_del_pg(struct net_bridge *br,
 				struct net_bridge_port_group *pg)
 {
-	struct net_bridge_mdb_htable *mdb = br->mdb;
+	struct net_bridge_mdb_htable *mdb;
 	struct net_bridge_mdb_entry *mp;
 	struct net_bridge_port_group *p;
-	struct net_bridge_port_group **pp;
+	struct net_bridge_port_group __rcu **pp;
+
+	mdb = mlock_dereference(br->mdb, br);
 
 	mp = br_mdb_ip_get(mdb, &pg->addr);
 	if (WARN_ON(!mp))
 		return;
 
-	for (pp = &mp->ports; (p = *pp); pp = &p->next) {
+	for (pp = &mp->ports;
+	     (p = mlock_dereference(*pp, br)) != NULL;
+	     pp = &p->next) {
 		if (p != pg)
 			continue;
 
@@ -294,10 +302,10 @@ out:
 	spin_unlock(&br->multicast_lock);
 }
 
-static int br_mdb_rehash(struct net_bridge_mdb_htable **mdbp, int max,
+static int br_mdb_rehash(struct net_bridge_mdb_htable __rcu **mdbp, int max,
 			 int elasticity)
 {
-	struct net_bridge_mdb_htable *old = *mdbp;
+	struct net_bridge_mdb_htable *old = rcu_dereference_protected(*mdbp, 1);
 	struct net_bridge_mdb_htable *mdb;
 	int err;
 
@@ -569,7 +577,7 @@ static struct net_bridge_mdb_entry *br_m
 	struct net_bridge *br, struct net_bridge_port *port,
 	struct br_ip *group, int hash)
 {
-	struct net_bridge_mdb_htable *mdb = br->mdb;
+	struct net_bridge_mdb_htable *mdb;
 	struct net_bridge_mdb_entry *mp;
 	struct hlist_node *p;
 	unsigned count = 0;
@@ -577,6 +585,7 @@ static struct net_bridge_mdb_entry *br_m
 	int elasticity;
 	int err;
 
+	mdb = rcu_dereference_protected(br->mdb, 1);
 	hlist_for_each_entry(mp, p, &mdb->mhash[hash], hlist[mdb->ver]) {
 		count++;
 		if (unlikely(br_ip_equal(group, &mp->addr)))
@@ -642,10 +651,11 @@ static struct net_bridge_mdb_entry *br_m
 	struct net_bridge *br, struct net_bridge_port *port,
 	struct br_ip *group)
 {
-	struct net_bridge_mdb_htable *mdb = br->mdb;
+	struct net_bridge_mdb_htable *mdb;
 	struct net_bridge_mdb_entry *mp;
 	int hash;
 
+	mdb = rcu_dereference_protected(br->mdb, 1);
 	if (!mdb) {
 		if (br_mdb_rehash(&br->mdb, BR_HASH_SIZE, 0))
 			return NULL;
@@ -660,7 +670,7 @@ static struct net_bridge_mdb_entry *br_m
 
 	case -EAGAIN:
 rehash:
-		mdb = br->mdb;
+		mdb = rcu_dereference_protected(br->mdb, 1);
 		hash = br_ip_hash(mdb, group);
 		break;
 
@@ -692,7 +702,7 @@ static int br_multicast_add_group(struct
 {
 	struct net_bridge_mdb_entry *mp;
 	struct net_bridge_port_group *p;
-	struct net_bridge_port_group **pp;
+	struct net_bridge_port_group __rcu **pp;
 	unsigned long now = jiffies;
 	int err;
 
@@ -712,7 +722,9 @@ static int br_multicast_add_group(struct
 		goto out;
 	}
 
-	for (pp = &mp->ports; (p = *pp); pp = &p->next) {
+	for (pp = &mp->ports;
+	     (p = mlock_dereference(*pp, br)) != NULL;
+	     pp = &p->next) {
 		if (p->port == port)
 			goto found;
 		if ((unsigned long)p->port < (unsigned long)port)
@@ -1106,7 +1118,7 @@ static int br_ip4_multicast_query(struct
 	struct net_bridge_mdb_entry *mp;
 	struct igmpv3_query *ih3;
 	struct net_bridge_port_group *p;
-	struct net_bridge_port_group **pp;
+	struct net_bridge_port_group __rcu **pp;
 	unsigned long max_delay;
 	unsigned long now = jiffies;
 	__be32 group;
@@ -1145,7 +1157,7 @@ static int br_ip4_multicast_query(struct
 	if (!group)
 		goto out;
 
-	mp = br_mdb_ip4_get(br->mdb, group);
+	mp = br_mdb_ip4_get(mlock_dereference(br->mdb, br), group);
 	if (!mp)
 		goto out;
 
@@ -1157,7 +1169,9 @@ static int br_ip4_multicast_query(struct
 	     try_to_del_timer_sync(&mp->timer) >= 0))
 		mod_timer(&mp->timer, now + max_delay);
 
-	for (pp = &mp->ports; (p = *pp); pp = &p->next) {
+	for (pp = &mp->ports;
+	     (p = mlock_dereference(*pp, br)) != NULL;
+	     pp = &p->next) {
 		if (timer_pending(&p->timer) ?
 		    time_after(p->timer.expires, now + max_delay) :
 		    try_to_del_timer_sync(&p->timer) >= 0)
@@ -1178,7 +1192,8 @@ static int br_ip6_multicast_query(struct
 	struct mld_msg *mld = (struct mld_msg *) icmp6_hdr(skb);
 	struct net_bridge_mdb_entry *mp;
 	struct mld2_query *mld2q;
-	struct net_bridge_port_group *p, **pp;
+	struct net_bridge_port_group *p;
+	struct net_bridge_port_group __rcu **pp;
 	unsigned long max_delay;
 	unsigned long now = jiffies;
 	struct in6_addr *group = NULL;
@@ -1214,7 +1229,7 @@ static int br_ip6_multicast_query(struct
 	if (!group)
 		goto out;
 
-	mp = br_mdb_ip6_get(br->mdb, group);
+	mp = br_mdb_ip6_get(mlock_dereference(br->mdb, br), group);
 	if (!mp)
 		goto out;
 
@@ -1225,7 +1240,9 @@ static int br_ip6_multicast_query(struct
 	     try_to_del_timer_sync(&mp->timer) >= 0))
 		mod_timer(&mp->timer, now + max_delay);
 
-	for (pp = &mp->ports; (p = *pp); pp = &p->next) {
+	for (pp = &mp->ports;
+	     (p = mlock_dereference(*pp, br)) != NULL;
+	     pp = &p->next) {
 		if (timer_pending(&p->timer) ?
 		    time_after(p->timer.expires, now + max_delay) :
 		    try_to_del_timer_sync(&p->timer) >= 0)
@@ -1254,7 +1271,7 @@ static void br_multicast_leave_group(str
 	    timer_pending(&br->multicast_querier_timer))
 		goto out;
 
-	mdb = br->mdb;
+	mdb = mlock_dereference(br->mdb, br);
 	mp = br_mdb_ip_get(mdb, group);
 	if (!mp)
 		goto out;
@@ -1277,7 +1294,9 @@ static void br_multicast_leave_group(str
 		goto out;
 	}
 
-	for (p = mp->ports; p; p = p->next) {
+	for (p = mlock_dereference(mp->ports, br);
+	     p != NULL;
+	     p = mlock_dereference(p->next, br)) {
 		if (p->port != port)
 			continue;
 
@@ -1625,7 +1644,7 @@ void br_multicast_stop(struct net_bridge
 	del_timer_sync(&br->multicast_query_timer);
 
 	spin_lock_bh(&br->multicast_lock);
-	mdb = br->mdb;
+	mdb = mlock_dereference(br->mdb, br);
 	if (!mdb)
 		goto out;
 
@@ -1729,6 +1748,7 @@ int br_multicast_toggle(struct net_bridg
 {
 	struct net_bridge_port *port;
 	int err = 0;
+	struct net_bridge_mdb_htable *mdb;
 
 	spin_lock(&br->multicast_lock);
 	if (br->multicast_disabled == !val)
@@ -1741,15 +1761,16 @@ int br_multicast_toggle(struct net_bridg
 	if (!netif_running(br->dev))
 		goto unlock;
 
-	if (br->mdb) {
-		if (br->mdb->old) {
+	mdb = mlock_dereference(br->mdb, br);
+	if (mdb) {
+		if (mdb->old) {
 			err = -EEXIST;
 rollback:
 			br->multicast_disabled = !!val;
 			goto unlock;
 		}
 
-		err = br_mdb_rehash(&br->mdb, br->mdb->max,
+		err = br_mdb_rehash(&br->mdb, mdb->max,
 				    br->hash_elasticity);
 		if (err)
 			goto rollback;
@@ -1774,6 +1795,7 @@ int br_multicast_set_hash_max(struct net
 {
 	int err = -ENOENT;
 	u32 old;
+	struct net_bridge_mdb_htable *mdb;
 
 	spin_lock(&br->multicast_lock);
 	if (!netif_running(br->dev))
@@ -1782,7 +1804,9 @@ int br_multicast_set_hash_max(struct net
 	err = -EINVAL;
 	if (!is_power_of_2(val))
 		goto unlock;
-	if (br->mdb && val < br->mdb->size)
+
+	mdb = mlock_dereference(br->mdb, br);
+	if (mdb && val < mdb->size)
 		goto unlock;
 
 	err = 0;
@@ -1790,8 +1814,8 @@ int br_multicast_set_hash_max(struct net
 	old = br->hash_max;
 	br->hash_max = val;
 
-	if (br->mdb) {
-		if (br->mdb->old) {
+	if (mdb) {
+		if (mdb->old) {
 			err = -EEXIST;
 rollback:
 			br->hash_max = old;
--- a/net/bridge/br_private.h	2010-11-14 12:36:30.399350527 -0800
+++ b/net/bridge/br_private.h	2010-11-14 12:44:07.257410977 -0800
@@ -72,7 +72,7 @@ struct net_bridge_fdb_entry
 
 struct net_bridge_port_group {
 	struct net_bridge_port		*port;
-	struct net_bridge_port_group	*next;
+	struct net_bridge_port_group __rcu *next;
 	struct hlist_node		mglist;
 	struct rcu_head			rcu;
 	struct timer_list		timer;
@@ -86,7 +86,7 @@ struct net_bridge_mdb_entry
 	struct hlist_node		hlist[2];
 	struct hlist_node		mglist;
 	struct net_bridge		*br;
-	struct net_bridge_port_group	*ports;
+	struct net_bridge_port_group __rcu *ports;
 	struct rcu_head			rcu;
 	struct timer_list		timer;
 	struct timer_list		query_timer;
@@ -227,7 +227,7 @@ struct net_bridge
 	unsigned long			multicast_startup_query_interval;
 
 	spinlock_t			multicast_lock;
-	struct net_bridge_mdb_htable	*mdb;
+	struct net_bridge_mdb_htable __rcu *mdb;
 	struct hlist_head		router_list;
 	struct hlist_head		mglist;
 
--- a/net/bridge/br_forward.c	2010-11-14 12:36:47.833478598 -0800
+++ b/net/bridge/br_forward.c	2010-11-14 12:42:22.001208297 -0800
@@ -223,7 +223,7 @@ static void br_multicast_flood(struct ne
 	struct net_bridge_port_group *p;
 	struct hlist_node *rp;
 
-	rp = rcu_dereference(br->router_list.first);
+	rp = rcu_dereference(hlist_first_rcu(&br->router_list));
 	p = mdst ? rcu_dereference(mdst->ports) : NULL;
 	while (p || rp) {
 		struct net_bridge_port *port, *lport, *rport;
@@ -242,7 +242,7 @@ static void br_multicast_flood(struct ne
 		if ((unsigned long)lport >= (unsigned long)port)
 			p = rcu_dereference(p->next);
 		if ((unsigned long)rport >= (unsigned long)port)
-			rp = rcu_dereference(rp->next);
+			rp = rcu_dereference(hlist_next_rcu(rp));
 	}
 
 	if (!prev)



^ permalink raw reply

* [PATCH 2/5] bridge: add proper RCU annotation to should_route_hook
From: Stephen Hemminger @ 2010-11-15 16:38 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Eric Dumazet
In-Reply-To: <20101115163809.215552143@vyatta.com>

[-- Attachment #1: bridge-hook-typedef.patch --]
[-- Type: text/plain, Size: 3069 bytes --]

From: Eric Dumazet <eric.dumazet@gmail.com>

Add a br_should_route_hook_t typedef; this is the only way we can
get a clean RCU implementation for a function pointer.

Move br_should_route_hook to br_input.c, where it is used.
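
For illustration only, a user-space sketch of the typedef idea. It
assumes a function-type typedef, so the hook variable is a plain pointer
to function and registration needs no cast. Plain loads and stores stand
in for rcu_assign_pointer() and rcu_dereference(), and hook_sketch,
hook_register(), hook_call() and always_route() are hypothetical names
that do not appear in the patch.

```c
#include <stddef.h>

struct sk_buff;  /* opaque in this sketch; only pointers are passed around */

/* A function type, not a pointer-to-function type. */
typedef int br_should_route_hook_t(struct sk_buff *skb);

/* Pointer to that function type. In the kernel it would carry __rcu. */
static br_should_route_hook_t *hook_sketch;

/* Hypothetical hook that claims every frame. */
static int always_route(struct sk_buff *skb)
{
	(void)skb;
	return 1;
}

/* Stand-in for rcu_assign_pointer(); a function name converts to the
 * pointer type without a cast. */
static void hook_register(br_should_route_hook_t *fn)
{
	hook_sketch = fn;
}

/* Stand-in for the rcu_dereference(), NULL check and call sequence. */
static int hook_call(struct sk_buff *skb)
{
	br_should_route_hook_t *rhook = hook_sketch;

	return rhook ? rhook(skb) : 0;
}
```

With no hook registered hook_call() returns 0; after
hook_register(always_route) it returns whatever the hook returns.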

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 include/linux/if_bridge.h             |    4 +++-
 net/bridge/br.c                       |    4 ----
 net/bridge/br_input.c                 |   10 +++++++---
 net/bridge/netfilter/ebtable_broute.c |    3 ++-
 4 files changed, 12 insertions(+), 9 deletions(-)

--- a/net/bridge/br.c	2010-11-14 11:18:54.048692005 -0800
+++ b/net/bridge/br.c	2010-11-14 11:19:47.347027185 -0800
@@ -22,8 +22,6 @@
 
 #include "br_private.h"
 
-int (*br_should_route_hook)(struct sk_buff *skb);
-
 static const struct stp_proto br_stp_proto = {
 	.rcv	= br_stp_rcv,
 };
@@ -102,8 +100,6 @@ static void __exit br_deinit(void)
 	br_fdb_fini();
 }
 
-EXPORT_SYMBOL(br_should_route_hook);
-
 module_init(br_init)
 module_exit(br_deinit)
 MODULE_LICENSE("GPL");
--- a/net/bridge/br_input.c	2010-11-14 11:18:54.048692005 -0800
+++ b/net/bridge/br_input.c	2010-11-14 11:41:40.558700481 -0800
@@ -21,6 +21,10 @@
 /* Bridge group multicast address 802.1d (pg 51). */
 const u8 br_group_address[ETH_ALEN] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x00 };
 
+/* Hook for brouter */
+br_should_route_hook_t __rcu *br_should_route_hook __read_mostly;
+EXPORT_SYMBOL(br_should_route_hook);
+
 static int br_pass_frame_up(struct sk_buff *skb)
 {
 	struct net_device *indev, *brdev = BR_INPUT_SKB_CB(skb)->brdev;
@@ -139,7 +143,7 @@ struct sk_buff *br_handle_frame(struct s
 {
 	struct net_bridge_port *p;
 	const unsigned char *dest = eth_hdr(skb)->h_dest;
-	int (*rhook)(struct sk_buff *skb);
+	br_should_route_hook_t *rhook;
 
 	if (unlikely(skb->pkt_type == PACKET_LOOPBACK))
 		return skb;
@@ -173,8 +177,8 @@ forward:
 	switch (p->state) {
 	case BR_STATE_FORWARDING:
 		rhook = rcu_dereference(br_should_route_hook);
-		if (rhook != NULL) {
-			if (rhook(skb))
+		if (rhook) {
+			if ((*rhook)(skb))
 				return skb;
 			dest = eth_hdr(skb)->h_dest;
 		}
--- a/include/linux/if_bridge.h	2010-11-14 11:18:54.048692005 -0800
+++ b/include/linux/if_bridge.h	2010-11-14 11:19:47.351028008 -0800
@@ -102,7 +102,9 @@ struct __fdb_entry {
 #include <linux/netdevice.h>
 
 extern void brioctl_set(int (*ioctl_hook)(struct net *, unsigned int, void __user *));
-extern int (*br_should_route_hook)(struct sk_buff *skb);
+
+typedef int br_should_route_hook_t(struct sk_buff *skb);
+extern br_should_route_hook_t __rcu *br_should_route_hook;
 
 #endif
 
--- a/net/bridge/netfilter/ebtable_broute.c	2010-11-14 11:20:39.745149494 -0800
+++ b/net/bridge/netfilter/ebtable_broute.c	2010-11-14 11:21:01.020917528 -0800
@@ -87,7 +87,8 @@ static int __init ebtable_broute_init(vo
 	if (ret < 0)
 		return ret;
 	/* see br_input.c */
-	rcu_assign_pointer(br_should_route_hook, ebt_broute);
+	rcu_assign_pointer(br_should_route_hook,
+			   (br_should_route_hook_t *)ebt_broute);
 	return 0;
 }
 



^ permalink raw reply

* Re: [RFC PATCH] network: return errors if we know tcp_connect failed
From: David Miller @ 2010-11-15 16:46 UTC (permalink / raw)
  To: kaber
  Cc: eparis, hzhong, netdev, linux-kernel, kuznet, pekkas, jmorris,
	yoshfuji, netfilter-devel
In-Reply-To: <4CE16198.7000709@trash.net>

From: Patrick McHardy <kaber@trash.net>
Date: Mon, 15 Nov 2010 17:36:40 +0100

> Patch for demonstration purposes attached. I've modified the
> MARK target so it returns NF_DROP with an errno code of
> -ECONNREFUSED:

I'm fine with the tcp_output.c changes.

^ permalink raw reply

* Re: [PATCH/RFC] netfilter: nf_conntrack_sip: Handle quirky Cisco phones
From: Kevin Cernekee @ 2010-11-15 16:46 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	Pekka Savola (ipv6), James Morris, Hideaki YOSHIFUJI,
	netfilter-devel, netfilter, coreteam, linux-kernel, netdev
In-Reply-To: <4CE1084A.3070100@trash.net>

On Mon, Nov 15, 2010 at 2:15 AM, Patrick McHardy <kaber@trash.net> wrote:
> The problem in doing this is that further packets from port 49xxx
> wouldn't be recognized as belonging to the same connection.

OK, makes sense.

> The same problem exists with your current patch, packets from port
> 5060 to the same destination won't be recognized as belonging to the
> connection that sent the REGISTER and thus won't be able to modify the
> timeout or unregister.
>
> Basically we would need three-legged connections to handle this
> situation correctly.

Just to clarify: the actual source port on a given device will be
EITHER a high-numbered port (Cisco) or 5060 (others).  I have not come
across any devices that send from a "mix" of source ports, e.g. 49xxx
for REGISTER and then 5060 for INVITE.

From what I have seen, subsequent SIP requests from the Cisco phone
are indeed getting associated with the original connection.  My phone
is logging into two different SIP accounts, and each account seems to
use its own unique UDP source port for all control traffic (both
expecting replies on 5060).

If Netfilter adds support for three-legged connections, will the third
leg show up in the tuplehash so I don't have to track it in the "help"
structure?

^ permalink raw reply

* Re: [PATCH 1/1] UDEV - Add 'udevlom' command line param to start_udev
From: Matt Domsch @ 2010-11-15 16:47 UTC (permalink / raw)
  To: Greg KH
  Cc: K, Narendra, linux-hotplug@vger.kernel.org,
	netdev@vger.kernel.org, Hargrave, Jordan, Rose, Charles
In-Reply-To: <20101105025848.GA14021@pws490.domsch.com>

On Thu, Nov 04, 2010 at 09:58:48PM -0500, Matt Domsch wrote:
> On Wed, Nov 03, 2010 at 11:05:00AM -0700, Greg KH wrote:
> > On Wed, Nov 03, 2010 at 10:25:25PM +0530, Narendra_K@Dell.com wrote:
> > > Hello,
> > > 
> > > This patch allows users to specify if they want the onboard network
> > > interfaces to be renamed to lomN by implementing a command line param
> > > 'udevlom'.
> > 
> > Ick ick ick.
> > 
> > Why not do this in some other configuration file?  Don't rely on udev
> > being started with a different option, that is only ripe for abuse by
> > everyone else who wants their pet-project to get into the udev
> > environment.
> > 
> > Please, surely there's a different way to do this.
> 
> At Linux Plumbers Conference today, this problem space was discussed
> once again, and I believe consensus on approach was reached.  Here
> goes:
> 
> * If a 70-persistent-net.rules file sets a name, honor that.  This
>   preserves existing installs.
> 
> * If BIOS provides indexes for onboard devices, honor that.
> ** Rename onboard NICs "lom[1-N]" as BIOS reports (# matches chassis labels)
> ** No rename for all others "ethX" (no change for NICs in PCI slots/USB/others)
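
The precedence in the quoted list can be sketched as a small C function.
This is only an illustration of the ordering, not udev's code:
pick_ifname() and its parameters are hypothetical, and the "lom" prefix
is simply the name under discussion in this thread.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Choose an interface name using the order described above:
 *   1. a name set by a persistent rules file wins;
 *   2. otherwise a BIOS-provided onboard index (1..N) gives "lomN";
 *   3. otherwise the kernel name (for example "eth2") is kept.
 * rule_name may be NULL or empty; bios_index <= 0 means "not reported".
 */
static void pick_ifname(const char *rule_name, int bios_index,
			const char *kernel_name, char *out, size_t outlen)
{
	if (rule_name && rule_name[0] != '\0')
		snprintf(out, outlen, "%s", rule_name);
	else if (bios_index > 0)
		snprintf(out, outlen, "lom%d", bios_index);
	else
		snprintf(out, outlen, "%s", kernel_name);
}
```

Changing the prefix later only touches the second branch, which is one
reason to keep the policy in a single place.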

I'm getting a lot of pushback from Dell customers on our
linux-poweredge mailing list (thread starts [1]) that the choice of
name "lomX" is poor, due to HP's extensive use of LOM meaning Lights
Out Management, rather than my intended meaning of "LAN on
Motherboard".  Gotta hate TLA collisions.

So, I'm open to new ideas for naming these.  At LPC, Ted noted that
2- and 3-letter names are expected.  "nic[1234]" or "en[1234]" ?

And yes, they'd prefer that we keep "eth[0123]" for the onboard
devices, but I simply don't see how to do that without kernel changes,
due to the races in both naming them in the kernel vs udev renaming,
and simple races between two udev processes.

Thanks,
Matt

[1] http://lists.us.dell.com/pipermail/linux-poweredge/2010-November/043576.html

-- 
Matt Domsch
Technology Strategist
Dell | Office of the CTO

^ permalink raw reply

* Re: linux-next: build failure after merge of the final tree (net tree related)
From: David Miller @ 2010-11-15 16:52 UTC (permalink / raw)
  To: sfr; +Cc: netdev, linux-next, linux-kernel, eric.dumazet
In-Reply-To: <20101115114651.8e6bad6c.sfr@canb.auug.org.au>

From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Mon, 15 Nov 2010 11:46:51 +1100

> Caused by commit 1d7138de878d1d4210727c1200193e69596f93b3 ("igmp: RCU
> conversion of in_dev->mc_list").  The for_each_pmc_rtnl and
> for_each_pmc_rcu definitions are protected by  CONFIG_IP_MULTICAST, but
> the uses are not ...

Thanks for the report, I've pushed the following fix:

--------------------
ipv4: Fix build with multicast disabled.

net/ipv4/igmp.c: In function 'ip_mc_inc_group':
net/ipv4/igmp.c:1228: error: implicit declaration of function 'for_each_pmc_rtnl'
net/ipv4/igmp.c:1228: error: expected ';' before '{' token
net/ipv4/igmp.c: In function 'ip_mc_unmap':
net/ipv4/igmp.c:1333: error: expected ';' before 'igmp_group_dropped'
 ...

Move for_each_pmc_rcu and for_each_pmc_rtnl macro definitions
outside of multicast ifdef protection.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/ipv4/igmp.c |   20 ++++++++++----------
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 0f0e0f0..a1bf2f4 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -163,6 +163,16 @@ static void ip_ma_put(struct ip_mc_list *im)
 	}
 }
 
+#define for_each_pmc_rcu(in_dev, pmc)				\
+	for (pmc = rcu_dereference(in_dev->mc_list);		\
+	     pmc != NULL;					\
+	     pmc = rcu_dereference(pmc->next_rcu))
+
+#define for_each_pmc_rtnl(in_dev, pmc)				\
+	for (pmc = rtnl_dereference(in_dev->mc_list);		\
+	     pmc != NULL;					\
+	     pmc = rtnl_dereference(pmc->next_rcu))
+
 #ifdef CONFIG_IP_MULTICAST
 
 /*
@@ -502,16 +512,6 @@ empty_source:
 	return skb;
 }
 
-#define for_each_pmc_rcu(in_dev, pmc)				\
-	for (pmc = rcu_dereference(in_dev->mc_list);		\
-	     pmc != NULL;					\
-	     pmc = rcu_dereference(pmc->next_rcu))
-
-#define for_each_pmc_rtnl(in_dev, pmc)				\
-	for (pmc = rtnl_dereference(in_dev->mc_list);		\
-	     pmc != NULL;					\
-	     pmc = rtnl_dereference(pmc->next_rcu))
-
 static int igmpv3_send_report(struct in_device *in_dev, struct ip_mc_list *pmc)
 {
 	struct sk_buff *skb = NULL;
-- 
1.7.3.2
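
The shape of the fix above, reduced to a user-space sketch: an iteration
macro that is used by code built in every configuration has to be
defined outside the conditional block. The names here (struct item,
for_each_item, CONFIG_FEATURE_SKETCH, sum_items) are hypothetical and
only mirror the structure of the igmp.c change.

```c
#include <stddef.h>

struct item {
	int val;
	struct item *next;
};

/* Defined unconditionally, because sum_items() below is compiled
 * whether or not the optional feature is enabled. */
#define for_each_item(head, it)				\
	for ((it) = (head);				\
	     (it) != NULL;				\
	     (it) = (it)->next)

#ifdef CONFIG_FEATURE_SKETCH
/* Code that only exists when the optional feature is built in. */
static int feature_enabled(void)
{
	return 1;
}
#endif

/* Always built; would fail to compile if the macro lived inside
 * the #ifdef and the feature were disabled. */
static int sum_items(struct item *head)
{
	struct item *it;
	int sum = 0;

	for_each_item(head, it)
		sum += it->val;
	return sum;
}
```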

^ permalink raw reply related

* Re: [PATCH/RFC] netfilter: nf_conntrack_sip: Handle quirky Cisco phones
From: Patrick McHardy @ 2010-11-15 16:58 UTC (permalink / raw)
  To: Kevin Cernekee
  Cc: Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	Pekka Savola (ipv6), James Morris, Hideaki YOSHIFUJI,
	netfilter-devel, netfilter, coreteam, linux-kernel, netdev
In-Reply-To: <AANLkTinELGUzDJ8TTTfA8sfiYiLJV-2ZPujwbuQWTPWd@mail.gmail.com>

On 15.11.2010 17:46, Kevin Cernekee wrote:
> On Mon, Nov 15, 2010 at 2:15 AM, Patrick McHardy <kaber@trash.net> wrote:
>> The problem in doing this is that further packets from port 49xxx
>> wouldn't be recognized as belonging to the same connection.
> 
> OK, makes sense.
> 
>> The same problem exists with your current patch, packets from port
>> 5060 to the same destination won't be recognized as belonging to the
>> connection that sent the REGISTER and thus won't be able to modify the
>> timeout or unregister.
>>
>> Basically we would need three-legged connections to handle this
>> situation correctly.
> 
> Just to clarify: the actual source port on a given device will be
> EITHER a high-numbered port (Cisco) or 5060 (others).  I have not come
> across any devices that send from a "mix" of source ports, e.g. 49xxx
> for REGISTER and then 5060 for INVITE.
> 
> From what I have seen, subsequent SIP requests from the Cisco phone
> are indeed getting associated with the original connection.  My phone
> is logging into two different SIP accounts, and each account seems to
> use its own unique UDP source port for all control traffic (both
> expecting replies on 5060).

Could you provide a binary tcpdump (-w file -s0) of registration
and a subsequent call please?

> If Netfilter adds support for three-legged connections, will the third
> leg show up in the tuplehash so I don't have to track it in the "help"
> structure?

Yes, basically by default all connections would only have a single
tuplehash. The lookup would look up the tuple based on the packet
and, if not found, reverse it and retry the lookup. When the tuples are
asymmetric (NAT and ICMP/ICMPv6) a second one would be added in the
ct_extend area and would be added to the hash table as usual. For the
SIP case, we could simply add a third one in a ct_extend area.

Unfortunately I wasn't able to find my old patch so far.
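
As a rough illustration of the lookup described above (try the packet's
tuple, and if nothing matches, reverse it and retry), here is a
user-space sketch. It is not the missing patch and not conntrack code:
every name is hypothetical, a linear scan stands in for the hash table,
and the extra tuples kept in a ct_extend area are not modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct tuple_sketch {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
};

struct conn_sketch {
	struct tuple_sketch orig;	/* the single stored tuple */
	int id;
};

static bool tuple_equal(const struct tuple_sketch *a,
			const struct tuple_sketch *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Swap source and destination to get the reply-direction tuple. */
static struct tuple_sketch tuple_reverse(const struct tuple_sketch *t)
{
	struct tuple_sketch r = {
		.src_ip = t->dst_ip, .dst_ip = t->src_ip,
		.src_port = t->dst_port, .dst_port = t->src_port,
	};
	return r;
}

/* Linear scan standing in for the hash table lookup. */
static const struct conn_sketch *find_exact(const struct conn_sketch *tbl,
					    size_t n,
					    const struct tuple_sketch *t)
{
	for (size_t i = 0; i < n; i++)
		if (tuple_equal(&tbl[i].orig, t))
			return &tbl[i];
	return NULL;
}

/* Try the packet tuple first; on a miss, reverse it and retry.
 * *is_reply reports which direction matched. */
static const struct conn_sketch *conn_lookup(const struct conn_sketch *tbl,
					     size_t n,
					     const struct tuple_sketch *pkt,
					     bool *is_reply)
{
	const struct conn_sketch *ct = find_exact(tbl, n, pkt);
	struct tuple_sketch rev;

	*is_reply = false;
	if (ct)
		return ct;

	rev = tuple_reverse(pkt);
	ct = find_exact(tbl, n, &rev);
	if (ct)
		*is_reply = true;
	return ct;
}
```

A packet whose reversed tuple does not mirror the stored one (the SIP
case discussed in this thread) misses both lookups here, which is where
an additional stored tuple would come in.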

^ permalink raw reply

* Re: linux-next: build failure after merge of the final tree (net tree related)
From: Eric Dumazet @ 2010-11-15 16:58 UTC (permalink / raw)
  To: David Miller; +Cc: sfr, netdev, linux-next, linux-kernel
In-Reply-To: <20101115.085254.104057401.davem@davemloft.net>

Le lundi 15 novembre 2010 à 08:52 -0800, David Miller a écrit :
> From: Stephen Rothwell <sfr@canb.auug.org.au>
> Date: Mon, 15 Nov 2010 11:46:51 +1100
> 
> > Caused by commit 1d7138de878d1d4210727c1200193e69596f93b3 ("igmp: RCU
> > conversion of in_dev->mc_list").  The for_each_pmc_rtnl and
> > for_each_pmc_rcu definitions are protected by  CONFIG_IP_MULTICAST, but
> > the uses are not ...
> 
> Thanks for the report, I've pushed the following fix:
> 
> --------------------
> ipv4: Fix build with multicast disabled.
> 
> net/ipv4/igmp.c: In function 'ip_mc_inc_group':
> net/ipv4/igmp.c:1228: error: implicit declaration of function 'for_each_pmc_rtnl'
> net/ipv4/igmp.c:1228: error: expected ';' before '{' token
> net/ipv4/igmp.c: In function 'ip_mc_unmap':
> net/ipv4/igmp.c:1333: error: expected ';' before 'igmp_group_dropped'
>  ...
> 
> Move for_each_pmc_rcu and for_each_pmc_rtnl macro definitions
> outside of multicast ifdef protection.
> 
> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
>  

Oops, that's right. Sorry David, I missed this message.

^ permalink raw reply

* Re: [PATCH 1/1] UDEV - Add 'udevlom' command line param to start_udev
From: Ben Hutchings @ 2010-11-15 17:16 UTC (permalink / raw)
  To: Matt Domsch
  Cc: Greg KH, K, Narendra, linux-hotplug@vger.kernel.org,
	netdev@vger.kernel.org, Hargrave, Jordan, Rose, Charles
In-Reply-To: <20101115164714.GB7030@auslistsprd01.us.dell.com>

On Mon, 2010-11-15 at 10:47 -0600, Matt Domsch wrote:
> On Thu, Nov 04, 2010 at 09:58:48PM -0500, Matt Domsch wrote:
> > On Wed, Nov 03, 2010 at 11:05:00AM -0700, Greg KH wrote:
> > > On Wed, Nov 03, 2010 at 10:25:25PM +0530, Narendra_K@Dell.com wrote:
> > > > Hello,
> > > > 
> > > > This patch allows users to specify if they want the onboard network
> > > > interfaces to be renamed to lomN by implementing a command line param
> > > > 'udevlom'.
> > > 
> > > Ick ick ick.
> > > 
> > > Why not do this in some other configuration file?  Don't rely on udev
> > > being started with a different option, that is only ripe for abuse by
> > > everyone else who wants their pet-project to get into the udev
> > > environment.
> > > 
> > > Please, surely there's a different way to do this.
> > 
> > At Linux Plumbers Conference today, this problem space was discussed
> > once again, and I believe consensus on approach was reached.  Here
> > goes:
> > 
> > * If a 70-persistent-net.rules file sets a name, honor that.  This
> >   preserves existing installs.
> > 
> > * If BIOS provides indexes for onboard devices, honor that.
> > ** Rename onboard NICs "lom[1-N]" as BIOS reports (# matches chassis labels)
> > ** No rename for all others "ethX" (no change for NICs in PCI slots/USB/others)
> 
> I'm getting a lot of pushback from Dell customers on our
> linux-poweredge mailing list (thread starts [1]) that the choice of
> name "lomX" is poor, due to HP's extensive use of LOM meaning Lights
> Out Management, rather than my intended meaning of "LAN on
> Motherboard".  Gotta hate TLA collisions.
> 
> So, I'm open to new ideas for naming these.  At LPC, Ted noted that
> 2- and 3-letter names are expected.  "nic[1234]" or "en[1234]" ?
[...]

I would suggest avoiding "nic" since some people use "NIC" to mean
specifically an add-in card rather than LOM.  In addition there is some
ambiguity with multi-port cards/controllers of whether NIC means a
controller or a port.

Other options for the prefix:
- "lan".  Maybe too generic.
- "mbe" = MotherBoard Ethernet. Looks a bit like "GbE" as some OEMs put
on the port labels.
- "eom" = Ethernet On Motherboard

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH net-next-26 1/5] cxgb4vf: minor comment/symbolic name cleanup.
From: David Miller @ 2010-11-15 17:18 UTC (permalink / raw)
  To: leedom; +Cc: netdev
In-Reply-To: <1289503844-18059-2-git-send-email-leedom@chelsio.com>

From: Casey Leedom <leedom@chelsio.com>
Date: Thu, 11 Nov 2010 11:30:40 -0800

> Minor cleanup of comments and symbolic constant names for clarity.
> 
> Signed-off-by: Casey Leedom <leedom@chelsio.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next-26 2/5] cxgb4vf: add ethtool statistics for GRO.
From: David Miller @ 2010-11-15 17:19 UTC (permalink / raw)
  To: leedom; +Cc: netdev
In-Reply-To: <1289503844-18059-3-git-send-email-leedom@chelsio.com>

From: Casey Leedom <leedom@chelsio.com>
Date: Thu, 11 Nov 2010 11:30:41 -0800

> Add ethtool statistics for GRO.
> 
> Signed-off-by: Casey Leedom <leedom@chelsio.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next-26 3/5] cxgb4vf: fix up "Section Mismatch" compiler warning.
From: David Miller @ 2010-11-15 17:19 UTC (permalink / raw)
  To: leedom; +Cc: netdev
In-Reply-To: <1289503844-18059-4-git-send-email-leedom@chelsio.com>

From: Casey Leedom <leedom@chelsio.com>
Date: Thu, 11 Nov 2010 11:30:42 -0800

> Fix up "Section Mismatch" compiler warning and mark another routine as
> __devinit.
> 
> Signed-off-by: Casey Leedom <leedom@chelsio.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next-26 4/5] cxgb4vf: Advertise NETIF_F_TSO_ECN.
From: David Miller @ 2010-11-15 17:19 UTC (permalink / raw)
  To: leedom; +Cc: netdev
In-Reply-To: <1289503844-18059-5-git-send-email-leedom@chelsio.com>

From: Casey Leedom <leedom@chelsio.com>
Date: Thu, 11 Nov 2010 11:30:43 -0800

> Advertise NETIF_F_TSO_ECN.
> 
> Signed-off-by: Casey Leedom <leedom@chelsio.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next-26 5/5] cxgb4vf: Mark "UDP [RSS Hash] Enable" as a 1-bit field.
From: David Miller @ 2010-11-15 17:20 UTC (permalink / raw)
  To: leedom; +Cc: netdev
In-Reply-To: <1289503844-18059-6-git-send-email-leedom@chelsio.com>

From: Casey Leedom <leedom@chelsio.com>
Date: Thu, 11 Nov 2010 11:30:44 -0800

> +		uint synmapen:1;	/* SYN Map Enable */

Please do not use the "uint" shorthand for "unsigned int", I know it's
in linux/types.h but that is there for sysv compatibility in
userspace and is ugly as hell.
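
A minimal sketch of the requested spelling, with the type written out in
full. The struct name, the udpen field and the accessor are hypothetical;
only synmapen comes from the quoted line.

```c
/* 1-bit fields declared with "unsigned int" rather than "uint". */
struct rss_cfg_sketch {
	unsigned int synmapen:1;	/* SYN Map Enable */
	unsigned int udpen:1;		/* hypothetical: UDP hash enable */
};

/* Read back the 1-bit field as an ordinary unsigned int (0 or 1). */
static unsigned int rss_udp_enabled(const struct rss_cfg_sketch *cfg)
{
	return cfg->udpen;
}
```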

^ permalink raw reply
