* Connection tracking and vlan
@ 2009-10-29 15:43 Adayadil Thomas
  2009-10-30 15:20 ` Herbert Xu
  0 siblings, 1 reply; 19+ messages in thread
From: Adayadil Thomas @ 2009-10-29 15:43 UTC
To: netdev

Greetings!

If two connections have the same 5-tuple (src ip, dst ip, src port, dst port,
protocol (tcp/udp)) but are on different vlans (different vlan id), does
conntrack separate them?

I am using kernel version 2.6.20; the conntrack tuple structure does not seem
to have vlan information.

Any information is much appreciated.

Thanks

* Re: Connection tracking and vlan
From: Herbert Xu @ 2009-10-30 15:20 UTC
To: Adayadil Thomas; +Cc: netdev, Patrick McHardy

Adayadil Thomas <adayadil.thomas@gmail.com> wrote:
>
> If two connections have same 5 tuple, src ip, dst ip, src port, dst
> port, protocol(tcp/udp)
> but on different vlans (different vlan id), does the conntrack separate these ?

Probably not.  Patrick, can you confirm this?

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: Connection tracking and vlan
From: Eric Dumazet @ 2009-10-30 15:31 UTC
To: Herbert Xu; +Cc: Adayadil Thomas, netdev, Patrick McHardy

Herbert Xu wrote:
> Adayadil Thomas <adayadil.thomas@gmail.com> wrote:
>> If two connections have same 5 tuple, src ip, dst ip, src port, dst
>> port, protocol(tcp/udp)
>> but on different vlans (different vlan id), does the conntrack separate these ?
>
> Probably not.  Patrick, can you confirm this?
>

Very strange, this question about vlan looks like the discussion we had
yesterday (or the day before...) about interfaces (versus packet defragmentation).

"IP conntracking" is about IP, and the [V]LAN doesn't matter at all at this
protocol level.

Same thing if you have two interfaces, eth0 & eth1: IP conntrack tuples don't
include the interface name/index.

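[Editorial sketch] The point above can be illustrated with a small userspace
model. This is not the kernel's struct ip_conntrack_tuple; the names
flow_tuple, pkt_info and same_connection are hypothetical and exist only to
show that a lookup keyed on the 5-tuple alone cannot tell two VLANs (or two
interfaces) apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a conntrack tuple: the classic
 * 5-tuple only, with no VLAN id and no interface index. */
struct flow_tuple {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  protonum;          /* 6 = TCP, 17 = UDP */
};

/* Hypothetical per-packet summary: layer-2 context plus the tuple. */
struct pkt_info {
    uint16_t vlan_id;           /* known at layer 2, unused below */
    int      ifindex;           /* receiving interface, unused below */
    struct flow_tuple tuple;
};

/* Compare field by field so struct padding never matters. */
static bool flow_tuple_equal(const struct flow_tuple *a,
                             const struct flow_tuple *b)
{
    assert(a != NULL && b != NULL);
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->protonum == b->protonum;
}

/* Two packets map to the "same connection" when their tuples match;
 * vlan_id and ifindex are deliberately ignored, as the thread describes. */
static bool same_connection(const struct pkt_info *p, const struct pkt_info *q)
{
    return flow_tuple_equal(&p->tuple, &q->tuple);
}
```

Under this model, a packet from 10.10.10.1 to 10.10.10.2 on vlan 1 and an
identical one on vlan 2 are treated as one connection, which is the situation
the original question asks about.
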
* Re: Connection tracking and vlan
From: Herbert Xu @ 2009-10-30 15:46 UTC
To: Eric Dumazet; +Cc: Adayadil Thomas, netdev, Patrick McHardy

On Fri, Oct 30, 2009 at 04:31:50PM +0100, Eric Dumazet wrote:
>
> Same thing if you have two interfaces, eth0 & eth1 : IP conntrack tuples don't
> include interface name/index

Indeed, but imagine what happens when eth0 is the LAN and eth1 is
the wild wild Internet.  Do you really want their packets to mix?

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: Connection tracking and vlan
From: Eric Dumazet @ 2009-10-30 16:19 UTC
To: Herbert Xu; +Cc: Adayadil Thomas, netdev, Patrick McHardy

Herbert Xu wrote:
> On Fri, Oct 30, 2009 at 04:31:50PM +0100, Eric Dumazet wrote:
>> Same thing if you have two interfaces, eth0 & eth1 : IP conntrack tuples don't
>> include interface name/index
>
> Indeed, but imagine what happens when eth0 is the LAN and eth1 is
> the wild wild Internet.  Do you really want their packets to mix?
>

No, Adayadil needs firewall rules (or RPF) before entering conntrack.

Allowing spoofed packets to come from the wild Internet would be...
interesting in many aspects.

And since some setups use several links to the LAN and several links to the
Internet, it's a user policy decision.

* Re: Connection tracking and vlan
From: Patrick McHardy @ 2009-10-30 16:27 UTC
To: Eric Dumazet; +Cc: Herbert Xu, Adayadil Thomas, netdev

Eric Dumazet wrote:
> Herbert Xu wrote:
>> On Fri, Oct 30, 2009 at 04:31:50PM +0100, Eric Dumazet wrote:
>>> Same thing if you have two interfaces, eth0 & eth1 : IP conntrack tuples don't
>>> include interface name/index
>> Indeed, but imagine what happens when eth0 is the LAN and eth1 is
>> the wild wild Internet.  Do you really want their packets to mix?
>>
>
> No, Adayadil needs firewall rules (or RPF) before entering conntrack.
>
> Allowing spoofed packets to come from the wild Internet would be...
> interesting in many aspects.
>
> And since some setups use several links to the LAN and several links to the
> Internet, it's a user policy decision.

Correct, users need to take care of this manually.

* Re: Connection tracking and vlan
From: Herbert Xu @ 2009-10-30 16:55 UTC
To: Eric Dumazet; +Cc: Adayadil Thomas, netdev, Patrick McHardy

On Fri, Oct 30, 2009 at 05:19:30PM +0100, Eric Dumazet wrote:
>
> No, Adayadil needs firewall rules (or RPF) before entering conntrack.

rp_filter doesn't help because routing occurs after conntrack, so
IP fragments from the untrusted side may have already been folded
into a fragment from the trusted side.

Yes, netfilter rules pre-conntrack would help, but I bet there are
a lot of folks out there who don't have them.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: Connection tracking and vlan
From: Patrick McHardy @ 2009-10-30 16:26 UTC
To: Eric Dumazet; +Cc: Herbert Xu, Adayadil Thomas, netdev

Eric Dumazet wrote:
> Herbert Xu wrote:
>> Adayadil Thomas <adayadil.thomas@gmail.com> wrote:
>>> If two connections have same 5 tuple, src ip, dst ip, src port, dst
>>> port, protocol(tcp/udp)
>>> but on different vlans (different vlan id), does the conntrack separate these ?
>> Probably not.  Patrick, can you confirm this?

Yes, you are right.

> Very strange, this question about vlan looks like the discussion we had
> yesterday (or the day before...) about interfaces (versus packet defragmentation)

Indeed, we did have that discussion a couple of years ago. IIRC Rusty
also suggested adding the interface to the defragmentation key, to
avoid fragments from different interfaces being reassembled together,
since iptables interface matches will only match on the interface of
the first fragment.

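[Editorial sketch] A minimal userspace model of the suggestion above,
assuming a reassembly key made of source address, destination address, IP id
and protocol, with the receiving interface index as the proposed extra field.
The struct frag_key and the match_iif switch are hypothetical names for
illustration, not the kernel's defragmentation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reassembly key. The first four fields group the fragments
 * of one IPv4 datagram; iif is the proposed additional field. */
struct frag_key {
    uint32_t saddr;
    uint32_t daddr;
    uint16_t id;        /* IP identification field */
    uint8_t  protocol;
    int      iif;       /* index of the receiving interface */
};

/* With match_iif == false this models the behaviour criticised in the
 * thread; with match_iif == true it models the stricter key. */
static bool frag_key_equal(const struct frag_key *a, const struct frag_key *b,
                           bool match_iif)
{
    assert(a != NULL && b != NULL);
    if (a->saddr != b->saddr || a->daddr != b->daddr ||
        a->id != b->id || a->protocol != b->protocol)
        return false;
    return !match_iif || a->iif == b->iif;
}
```

With the interface in the key, a fragment arriving on the untrusted
interface can no longer join a reassembly queue started on the trusted one,
even if every IP header field matches.
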
* Re: Connection tracking and vlan
From: Adayadil Thomas @ 2009-10-30 19:20 UTC
To: Eric Dumazet; +Cc: Herbert Xu, netdev, Patrick McHardy

On Fri, Oct 30, 2009 at 11:31 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Very strange, this question about vlan looks like the discussion we had
> yesterday (or the day before...) about interfaces (versus packet defragmentation)
>
> "IP conntracking" is about IP, and [V]LAN doesn't matter at all at this protocol level.
>
> Same thing if you have two interfaces, eth0 & eth1 : IP conntrack tuples don't
> include interface name/index

I am concerned about the following situation:

The linux device is configured as a bridge and is deployed between the
trunk ports of 2 switches. In this situation the linux device will see
802.1q packets with vlan headers specifying the vlan ids.

It is valid to have an environment where private IP addresses are duplicated
on different virtual LANs, i.e. it is valid to have machine A (10.10.10.1)
talking to machine B (10.10.10.2) on vlan 1, and at the same time machine C
(10.10.10.1) talking to machine D (10.10.10.2) on vlan 2.

Since they are on different LANs (VLANs), there should not be any issues.

Now when VLANs are shared across switches, the trunk port will send 802.1q
tagged packets between the switches. Imagine these packets going through
the linux bridge. The 802.1q header identifies the separate vlans by the
vlan id.

If ip_conntrack does not consider vlans, it is possible that all 5 tuple
fields are the same for both connections, and connection tracking is
affected.

I hope I have described the scenario well. If not I can explain in a
more detailed fashion.

Thanks

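[Editorial sketch] For readers unfamiliar with the 802.1q tag mentioned
above: in a tagged Ethernet frame, bytes 12-13 hold the tag protocol id
0x8100 and bytes 14-15 hold the tag control information, whose low 12 bits
are the vlan id. The helper frame_vlan_id below is a hypothetical userspace
function written for this note; it works on a raw byte buffer and does not
use any kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TPID_8021Q 0x8100u  /* tag protocol id of an 802.1q tag */
#define SKETCH_VID_MASK   0x0FFFu  /* low 12 bits of the TCI are the vlan id */

/* Return the vlan id (0..4095) of a raw Ethernet frame, or -1 when the
 * frame is untagged or too short to contain a tag. */
static int frame_vlan_id(const uint8_t *frame, size_t len)
{
    uint16_t tpid, tci;

    assert(frame != NULL);
    if (len < 16)       /* 6 dst MAC + 6 src MAC + 2 TPID + 2 TCI */
        return -1;

    /* Both fields are big-endian on the wire. */
    tpid = (uint16_t)((frame[12] << 8) | frame[13]);
    if (tpid != SKETCH_TPID_8021Q)
        return -1;

    tci = (uint16_t)((frame[14] << 8) | frame[15]);
    return (int)(tci & SKETCH_VID_MASK);
}
```

A TCI of 0x2002 (priority 1, vlan id 2) yields 2, while a frame whose
bytes 12-13 are 0x0800 (plain IPv4) yields -1. The patch posted later in
this thread uses 8192, a value outside the 12-bit range, as its own
"untagged" marker instead of -1.
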
* Re: Connection tracking and vlan
From: Caitlin Bestler @ 2009-10-30 19:51 UTC
To: Adayadil Thomas; +Cc: Eric Dumazet, Herbert Xu, netdev, Patrick McHardy

Yes, it is legitimate for a Bridge to see two different 10.*.*.*
networks on different VLANs.
A Bridge can even see the same MAC address being used by two
different end stations on different VLANs (especially if the global
bit is not set).

What is not legitimate is presenting both of those 10.*.*.* networks
for local delivery.

If you are only bridging the frames then there are no connections to
track, only frames.

On Fri, Oct 30, 2009 at 12:20 PM, Adayadil Thomas
<adayadil.thomas@gmail.com> wrote:
> On Fri, Oct 30, 2009 at 11:31 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>> Very strange, this question about vlan looks like the discussion we had
>> yesterday (or the day before...) about interfaces (versus packet defragmentation)
>>
>> "IP conntracking" is about IP, and [V]LAN doesn't matter at all at this protocol level.
>>
>> Same thing if you have two interfaces, eth0 & eth1 : IP conntrack tuples don't
>> include interface name/index
>
> I am concerned about the following situation:
>
> The linux device is configured as a bridge and is deployed between the
> trunk ports of 2 switches. In this situation the linux device will see
> 802.1q packets with vlan headers specifying the vlan ids.
>
> It is valid to have an environment where private IP addresses are duplicated
> on different virtual LANs, i.e. it is valid to have machine A (10.10.10.1)
> talking to machine B (10.10.10.2) on vlan 1, and at the same time machine C
> (10.10.10.1) talking to machine D (10.10.10.2) on vlan 2.
>
> Since they are on different LANs (VLANs), there should not be any issues.
>
> Now when VLANs are shared across switches, the trunk port will send 802.1q
> tagged packets between the switches. Imagine these packets going through
> the linux bridge. The 802.1q header identifies the separate vlans by the
> vlan id.
>
> If ip_conntrack does not consider vlans, it is possible that all 5 tuple
> fields are the same for both connections, and connection tracking is
> affected.
>
> I hope I have described the scenario well. If not I can explain in a
> more detailed fashion.
>
> Thanks

* Re: Connection tracking and vlan
From: Adayadil Thomas @ 2009-10-30 20:40 UTC
To: Caitlin Bestler; +Cc: Eric Dumazet, Herbert Xu, netdev, Patrick McHardy

On Fri, Oct 30, 2009 at 3:51 PM, Caitlin Bestler
<caitlin.bestler@gmail.com> wrote:
> Yes, it is legitimate for a Bridge to see two different 10.*.*.*
> networks on different VLANs.
> A Bridge can even see the same MAC address being used by two
> different end stations on different VLANs (especially if the global
> bit is not set).
>
> What is not legitimate is presenting both of those 10.*.*.* networks
> for local delivery.
>

I did not mean this case.

> If you are only bridging the frames then there are no connections to
> track, only frames.
>

This is more like what I was trying to do with the device, but with
stateful firewall functionality, for which I was using iptables/netfilter.

Thanks

* Re: Connection tracking and vlan
From: Eric W. Biederman @ 2009-10-30 23:15 UTC
To: Adayadil Thomas; +Cc: Eric Dumazet, Herbert Xu, netdev, Patrick McHardy

Adayadil Thomas <adayadil.thomas@gmail.com> writes:

> On Fri, Oct 30, 2009 at 11:31 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>> Very strange, this question about vlan looks like the discussion we had
>> yesterday (or the day before...) about interfaces (versus packet defragmentation)
>>
>> "IP conntracking" is about IP, and [V]LAN doesn't matter at all at this protocol level.
>>
>> Same thing if you have two interfaces, eth0 & eth1 : IP conntrack tuples don't
>> include interface name/index
>
> I am concerned about the following situation:
>
> The linux device is configured as a bridge and is deployed between the
> trunk ports of 2 switches. In this situation the linux device will see
> 802.1q packets with vlan headers specifying the vlan ids.
>
> It is valid to have an environment where private IP addresses are duplicated
> on different virtual LANs, i.e. it is valid to have machine A (10.10.10.1)
> talking to machine B (10.10.10.2) on vlan 1, and at the same time machine C
> (10.10.10.1) talking to machine D (10.10.10.2) on vlan 2.
>
> Since they are on different LANs (VLANs), there should not be any issues.
>
> Now when VLANs are shared across switches, the trunk port will send 802.1q
> tagged packets between the switches. Imagine these packets going through
> the linux bridge. The 802.1q header identifies the separate vlans by the
> vlan id.
>
> If ip_conntrack does not consider vlans, it is possible that all 5 tuple
> fields are the same for both connections, and connection tracking is
> affected.
>
> I hope I have described the scenario well. If not I can explain in a
> more detailed fashion.

Unless you have multiple network namespaces, linux assumes all packets are
in the same ip space, and 10.10.10.1 is the same machine no matter
which interface you talk to it on.

Eric

* Re: Connection tracking and vlan
From: Ben Greear @ 2009-10-30 23:25 UTC
To: Eric W. Biederman
Cc: Adayadil Thomas, Eric Dumazet, Herbert Xu, netdev, Patrick McHardy

On 10/30/2009 04:15 PM, Eric W. Biederman wrote:

>> If ip_conntrack does not consider vlans, it is possible that all 5
>> tuple are the same
>> and thus affect the connection tracking.
>>
>> I hope I have described the scenario well. If not I can explain in a
>> more detailed fashion.
>
> Unless you have multiple network namespaces linux assumes all packets are
> in the same ip space.  And 10.10.10.1 is the same machine no matter
> which interface you talk to it on.

It only takes a relatively small patch that lets conntrack hash on a
skb->foo_mark and allows that mark to be set on incoming packets
based on netdevice or whatever (before the conntrack lookup is
done).

This is logically somewhat similar to using multiple routing
tables, and it has been working well for me for several years....

Thanks,
Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

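[Editorial sketch] Ben's patch is not shown in the thread, so the following
is only a guess at the general shape of the idea: a per-packet mark takes
part in both the hash and the key comparison. All names (marked_flow,
marked_flow_hash, marked_flow_equal) are hypothetical, and the mixing
function is a simple FNV-style step chosen for brevity, not the jhash the
kernel uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup key: the 5-tuple plus a mark that was assigned to
 * the packet (for example per incoming device) before the lookup. */
struct marked_flow {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  protonum;
    uint32_t mark;
};

/* One FNV-style mixing step over a 32-bit word. */
static uint32_t mix(uint32_t h, uint32_t v)
{
    return (h ^ v) * 16777619u;
}

/* Hash every key field, including the mark. */
static uint32_t marked_flow_hash(const struct marked_flow *f)
{
    uint32_t h = 2166136261u;

    assert(f != NULL);
    h = mix(h, f->src_ip);
    h = mix(h, f->dst_ip);
    h = mix(h, ((uint32_t)f->src_port << 16) | f->dst_port);
    h = mix(h, f->protonum);
    h = mix(h, f->mark);
    return h;
}

/* Hashing alone only picks a bucket; entries in the same bucket are still
 * told apart by this comparison, so the mark must be compared too. */
static bool marked_flow_equal(const struct marked_flow *a,
                              const struct marked_flow *b)
{
    assert(a != NULL && b != NULL);
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->protonum == b->protonum && a->mark == b->mark;
}
```

Two flows with identical addresses and ports but different marks then stay
separate entries even if they happen to land in the same hash bucket.
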
* Re: Connection tracking and vlan
From: Adayadil Thomas @ 2009-11-02 16:14 UTC
To: Ben Greear
Cc: Eric W. Biederman, Eric Dumazet, Herbert Xu, netdev, Patrick McHardy

[-- Attachment #1: Type: text/plain, Size: 1473 bytes --]

If the vlan id is used only for the hash, it still may not avoid the problem
completely, i.e. in the case of both connections hashing to the same bucket.

I was wondering about your opinion on adding an optional member to the tuple
structure, vid (for vlan id).

I have attached the patch for this change. I would be grateful for any
comments, such as dependencies on the rest of the system.

Thanks much

On Fri, Oct 30, 2009 at 6:25 PM, Ben Greear <greearb@candelatech.com> wrote:
> On 10/30/2009 04:15 PM, Eric W. Biederman wrote:
>
>>> If ip_conntrack does not consider vlans, it is possible that all 5
>>> tuple are the same
>>> and thus affect the connection tracking.
>>>
>>> I hope I have described the scenario well. If not I can explain in a
>>> more detailed fashion.
>>
>> Unless you have multiple network namespaces linux assumes all packets are
>> in the same ip space.  And 10.10.10.1 is the same machine no matter
>> which interface you talk to it on.
>
> It only takes a relatively small patch that lets conn-track hash on a
> skb->foo_mark, and allow that mark to be set on incoming packets
> based on netdevice or whatever, (before the conn-track lookup is
> done).
>
> This is logically somewhat similar to using multiple routing
> tables and has been working well for me for several years....
>
> Thanks,
> Ben
>
> --
> Ben Greear <greearb@candelatech.com>
> Candela Technologies Inc  http://www.candelatech.com
>
>

[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 6804 bytes --]

--- linux-2.6.20.21/include/linux/netfilter_ipv4/ip_conntrack_core.h 2007-10-17 15:31:14.000000000 -0400
+++ linux-2.6.20.21.mod/include/linux/netfilter_ipv4/ip_conntrack_core.h 2009-11-02 09:53:44.000000000 -0500
@@ -24,7 +24,12 @@
                                 const struct sk_buff *skb,
                                 unsigned int dataoff,
                                 struct ip_conntrack_tuple *tuple,
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+                                const struct ip_conntrack_protocol *protocol,
+                                unsigned short vid);
+#else
                                 const struct ip_conntrack_protocol *protocol);
+#endif
 
 extern int ip_ct_invert_tuple(struct ip_conntrack_tuple *inverse,
--- linux-2.6.20.21/include/linux/netfilter_ipv4/ip_conntrack_tuple.h 2007-10-17 15:31:14.000000000 -0400
+++ linux-2.6.20.21.mod/include/linux/netfilter_ipv4/ip_conntrack_tuple.h 2009-11-02 09:56:09.000000000 -0500
@@ -79,6 +79,10 @@
                 /* The direction (for tuplehash) */
                 u_int8_t dir;
         } dst;
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+        __be16 vid;
+#endif
+
 };
 
 /* This is optimized opposed to a memset of the whole structure.  Everything we
@@ -87,6 +91,9 @@
         do {                                                    \
                 (tuple)->src.u.all = 0;                         \
                 (tuple)->dst.u.all = 0;                         \
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+                (tuple)->vid = 0;                               \
+#endif
         } while (0)
 
 #ifdef __KERNEL__
@@ -125,10 +132,22 @@
                 && t1->dst.protonum == t2->dst.protonum;
 }
 
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+static inline int ip_ct_tuple_vlan_equal(const struct ip_conntrack_tuple *t1,
+                                         const struct ip_conntrack_tuple *t2)
+{
+        return t1->vid == t2->vid;
+}
+#endif
+
 static inline int ip_ct_tuple_equal(const struct ip_conntrack_tuple *t1,
                                     const struct ip_conntrack_tuple *t2)
 {
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+        return ip_ct_tuple_src_equal(t1, t2) && ip_ct_tuple_dst_equal(t1, t2) && ip_ct_tuple_vlan_equal(t1, t2);
+#else
         return ip_ct_tuple_src_equal(t1, t2) && ip_ct_tuple_dst_equal(t1, t2);
+#endif
 }
 
 static inline int ip_ct_tuple_mask_cmp(const struct ip_conntrack_tuple *t,
--- linux-2.6.20.21/net/ipv4/netfilter/ip_conntrack_core.c 2009-11-02 10:57:49.000000000 -0500
+++ linux-2.6.20.21.mod/net/ipv4/netfilter/ip_conntrack_core.c 2009-11-02 09:57:57.000000000 -0500
@@ -45,6 +45,7 @@
 #include <linux/percpu.h>
 #include <linux/moduleparam.h>
 #include <linux/notifier.h>
+#include <linux/if_vlan.h>
 
 /* ip_conntrack_lock protects the main hash table, protocol/helper/expected
    registrations, conntrack timers*/
@@ -181,7 +182,8 @@
                 const struct sk_buff *skb,
                 unsigned int dataoff,
                 struct ip_conntrack_tuple *tuple,
-                const struct ip_conntrack_protocol *protocol)
+                const struct ip_conntrack_protocol *protocol,
+                unsigned short vid)
 {
         /* Never happen */
         if (iph->frag_off & htons(IP_OFFSET)) {
@@ -194,6 +196,7 @@
         tuple->dst.ip = iph->daddr;
         tuple->dst.protonum = iph->protocol;
         tuple->dst.dir = IP_CT_DIR_ORIGINAL;
+        tuple->vid = vid;
 
         return protocol->pkt_to_tuple(skb, dataoff, tuple);
 }
@@ -206,6 +209,7 @@
         inverse->src.ip = orig->dst.ip;
         inverse->dst.ip = orig->src.ip;
         inverse->dst.protonum = orig->dst.protonum;
+        inverse->vid = orig->vid;
         inverse->dst.dir = !orig->dst.dir;
 
         return protocol->invert_tuple(inverse, orig);
@@ -779,11 +783,26 @@
         struct ip_conntrack_tuple tuple;
         struct ip_conntrack_tuple_hash *h;
         struct ip_conntrack *ct;
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+        struct vlan_ethhdr *vhdr;
+        unsigned short vlan_TCI, vid = 8192;
+
+        vhdr = vlan_eth_hdr(skb);
+        if (vhdr && vhdr->h_vlan_proto == __constant_htons(ETH_P_8021Q)){
+                vlan_TCI = ntohs(vhdr->h_vlan_TCI);
+                vid = (vlan_TCI & VLAN_VID_MASK);
+        }
+#endif
 
         IP_NF_ASSERT((skb->nh.iph->frag_off & htons(IP_OFFSET)) == 0);
 
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
         if (!ip_ct_get_tuple(skb->nh.iph, skb, skb->nh.iph->ihl*4,
+                                &tuple,proto,vid))
+#else
+        if (!ip_ct_get_tuple(skb->nh.iph, skb, skb->nh.iph->ihl*4,
                                 &tuple,proto))
+#endif
                 return NULL;
 
         /* look for tuple match */
--- linux-2.6.20.21/net/ipv4/netfilter/ip_conntrack_proto_icmp.c 2007-10-17 15:31:14.000000000 -0400
+++ linux-2.6.20.21.mod/net/ipv4/netfilter/ip_conntrack_proto_icmp.c 2009-11-02 09:58:52.000000000 -0500
@@ -20,6 +20,7 @@
 #include <linux/netfilter_ipv4/ip_conntrack.h>
 #include <linux/netfilter_ipv4/ip_conntrack_core.h>
 #include <linux/netfilter_ipv4/ip_conntrack_protocol.h>
+#include <linux/if_vlan.h>
 
 unsigned int ip_ct_icmp_timeout __read_mostly = 30*HZ;
 
@@ -146,6 +147,16 @@
         struct ip_conntrack_protocol *innerproto;
         struct ip_conntrack_tuple_hash *h;
         int dataoff;
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+        struct vlan_ethhdr *vhdr;
+        unsigned short vlan_TCI, vid = 8192;
+
+        vhdr = vlan_eth_hdr(skb);
+        if (vhdr && vhdr->h_vlan_proto == __constant_htons(ETH_P_8021Q)){
+                vlan_TCI = ntohs(vhdr->h_vlan_TCI);
+                vid = (vlan_TCI & VLAN_VID_MASK);
+        }
+#endif
 
         IP_NF_ASSERT(skb->nfct == NULL);
 
@@ -164,7 +175,11 @@
         innerproto = ip_conntrack_proto_find_get(inside->ip.protocol);
         dataoff = skb->nh.iph->ihl*4 + sizeof(inside->icmp) + inside->ip.ihl*4;
         /* Are they talking about one of our connections? */
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+        if (!ip_ct_get_tuple(&inside->ip, skb, dataoff, &origtuple, innerproto, vid)) {
+#else
         if (!ip_ct_get_tuple(&inside->ip, skb, dataoff, &origtuple, innerproto)) {
+#endif
                 DEBUGP("icmp_error: ! get_tuple p=%u", inside->ip.protocol);
                 ip_conntrack_proto_put(innerproto);
                 return -NF_ACCEPT;
--- linux-2.6.20.21/net/ipv4/netfilter/ip_nat_core.c 2009-11-02 10:57:49.000000000 -0500
+++ linux-2.6.20.21.mod/net/ipv4/netfilter/ip_nat_core.c 2009-11-02 10:01:08.000000000 -0500
@@ -23,6 +23,7 @@
 #include <linux/icmp.h>
 #include <linux/udp.h>
 #include <linux/jhash.h>
+#include <linux/if_vlan.h>
 
 #include <linux/netfilter_ipv4/ip_conntrack.h>
 #include <linux/netfilter_ipv4/ip_conntrack_core.h>
@@ -821,6 +822,17 @@
         unsigned long statusbit;
         enum ip_nat_manip_type manip = HOOK2MANIP(hooknum);
 
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+        struct vlan_ethhdr *vhdr;
+        unsigned short vlan_TCI, vid = 8192;
+
+        vhdr = vlan_eth_hdr(*pskb);
+        if (vhdr && vhdr->h_vlan_proto == __constant_htons(ETH_P_8021Q)){
+                vlan_TCI = ntohs(vhdr->h_vlan_TCI);
+                vid = (vlan_TCI & VLAN_VID_MASK);
+        }
+#endif
+
         if (!skb_make_writable(pskb, hdrlen + sizeof(*inside)))
                 return 0;
 
@@ -850,10 +862,14 @@
         DEBUGP("icmp_reply_translation: translating error %p manp %u dir %s\n",
                *pskb, manip, dir == IP_CT_DIR_ORIGINAL ? "ORIG" : "REPLY");
 
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
         if (!ip_ct_get_tuple(&inside->ip, *pskb, (*pskb)->nh.iph->ihl*4 +
                              sizeof(struct icmphdr) + inside->ip.ihl*4,
                              &inner,
+                             __ip_conntrack_proto_find(inside->ip.protocol), vid))
+#else
                              __ip_conntrack_proto_find(inside->ip.protocol)))
+#endif
                 return 0;
 
         /* Change inner back to look like incoming packet.  We do the

* Re: Connection tracking and vlan 2009-11-02 16:14 ` Adayadil Thomas @ 2009-11-02 16:30 ` Adayadil Thomas 2009-11-02 16:33 ` Patrick McHardy 1 sibling, 0 replies; 19+ messages in thread From: Adayadil Thomas @ 2009-11-02 16:30 UTC (permalink / raw) To: Ben Greear Cc: Eric W. Biederman, Eric Dumazet, Herbert Xu, netdev, Patrick McHardy [-- Attachment #1: Type: text/plain, Size: 1718 bytes --] A small correction to the patch. Thanks for any comments you can provide on this patch. Thanks On Mon, Nov 2, 2009 at 11:14 AM, Adayadil Thomas <adayadil.thomas@gmail.com> wrote: > If the vlan id is used for hash, it still may not avoid the problem completely, > i.e. in case of both connections hashing to the same bucket. > > I was wondering about your opinion about adding an optional member to the tuple > structure, vid (for vlan id). > > I have attached the patch for this change. I would be grateful for any comments > such as dependencies on the rest of the system. > > > Thanks much > > > > On Fri, Oct 30, 2009 at 6:25 PM, Ben Greear <greearb@candelatech.com> wrote: >> On 10/30/2009 04:15 PM, Eric W. Biederman wrote: >> >>>> If ip_conntrack does not consider vlans, it is possible that all 5 >>>> tuple are the same >>>> and thus affect the connection tracking. >>>> >>>> I hope I have described the scenario well. If not I can explain in a >>>> more detailed fashion. >>> >>> Unless you have multiple network namespaces linux assumes all packets are >>> in the same ip space. And 10.10.10.1 is the same machine no matter >>> which interface you talk to it on. >> >> It only takes a relatively small patch that lets conn-track hash on a >> skb->foo_mark, and allow that mark to be set on incoming packets >> based on netdevice or whatever, (before the conn-track lookup is >> done). >> >> This is logically somewhat similar to using multiple routing >> tables and has been working well for me for several years.... 
>> >> Thanks, >> Ben >> >> -- >> Ben Greear <greearb@candelatech.com> >> Candela Technologies Inc http://www.candelatech.com >> >> > [-- Attachment #2: patch1.txt --] [-- Type: text/plain, Size: 7137 bytes --] --- linux-2.6.20.21/include/linux/netfilter_ipv4/ip_conntrack_core.h 2007-10-17 15:31:14.000000000 -0400 +++ linux-2.6.20.21.mod/include/linux/netfilter_ipv4/ip_conntrack_core.h 2009-11-02 11:01:58.000000000 -0500 @@ -24,7 +24,12 @@ const struct sk_buff *skb, unsigned int dataoff, struct ip_conntrack_tuple *tuple, +#ifdef CONFIG_IP_NF_CONNTRACK_VLAN + const struct ip_conntrack_protocol *protocol, + unsigned short vid); +#else const struct ip_conntrack_protocol *protocol); +#endif extern int ip_ct_invert_tuple(struct ip_conntrack_tuple *inverse, --- linux-2.6.20.21/include/linux/netfilter_ipv4/ip_conntrack_tuple.h 2007-10-17 15:31:14.000000000 -0400 +++ linux-2.6.20.21.mod/include/linux/netfilter_ipv4/ip_conntrack_tuple.h 2009-11-02 11:18:39.000000000 -0500 @@ -79,15 +79,28 @@ /* The direction (for tuplehash) */ u_int8_t dir; } dst; +#ifdef CONFIG_IP_NF_CONNTRACK_VLAN + __be16 vid; +#endif + }; /* This is optimized opposed to a memset of the whole structure. 
Everything we * really care about is the source/destination unions */ +#ifdef CONFIG_IP_NF_CONNTRACK_VLAN #define IP_CT_TUPLE_U_BLANK(tuple) \ do { \ (tuple)->src.u.all = 0; \ (tuple)->dst.u.all = 0; \ + (tuple)->vid = 0; \ + } while (0) +#else +#define IP_CT_TUPLE_U_BLANK(tuple) \ + do { \ + (tuple)->src.u.all = 0; \ + (tuple)->dst.u.all = 0; \ } while (0) +#endif #ifdef __KERNEL__ @@ -125,10 +138,22 @@ && t1->dst.protonum == t2->dst.protonum; } +#ifdef CONFIG_IP_NF_CONNTRACK_VLAN +static inline int ip_ct_tuple_vlan_equal(const struct ip_conntrack_tuple *t1, + const struct ip_conntrack_tuple *t2) +{ + return t1->vid == t2->vid; +} +#endif + static inline int ip_ct_tuple_equal(const struct ip_conntrack_tuple *t1, const struct ip_conntrack_tuple *t2) { +#ifdef CONFIG_IP_NF_CONNTRACK_VLAN + return ip_ct_tuple_src_equal(t1, t2) && ip_ct_tuple_dst_equal(t1, t2) && ip_ct_tuple_vlan_equal(t1, t2); +#else return ip_ct_tuple_src_equal(t1, t2) && ip_ct_tuple_dst_equal(t1, t2); +#endif } static inline int ip_ct_tuple_mask_cmp(const struct ip_conntrack_tuple *t, --- linux-2.6.20.21/net/ipv4/netfilter/ip_conntrack_core.c 2009-11-02 11:26:07.000000000 -0500 +++ linux-2.6.20.21.mod/net/ipv4/netfilter/ip_conntrack_core.c 2009-11-02 11:22:00.000000000 -0500 @@ -45,6 +45,7 @@ #include <linux/percpu.h> #include <linux/moduleparam.h> #include <linux/notifier.h> +#include <linux/if_vlan.h> /* ip_conntrack_lock protects the main hash table, protocol/helper/expected registrations, conntrack timers*/ @@ -181,7 +182,12 @@ const struct sk_buff *skb, unsigned int dataoff, struct ip_conntrack_tuple *tuple, +#ifdef CONFIG_IP_NF_CONNTRACK_VLAN + const struct ip_conntrack_protocol *protocol, + unsigned short vid) +#else const struct ip_conntrack_protocol *protocol) +#endif { /* Never happen */ if (iph->frag_off & htons(IP_OFFSET)) { @@ -194,6 +200,9 @@ tuple->dst.ip = iph->daddr; tuple->dst.protonum = iph->protocol; tuple->dst.dir = IP_CT_DIR_ORIGINAL; +#ifdef CONFIG_IP_NF_CONNTRACK_VLAN + 
tuple->vid = vid;
+#endif
 
 	return protocol->pkt_to_tuple(skb, dataoff, tuple);
 }
@@ -206,6 +215,9 @@
 	inverse->src.ip = orig->dst.ip;
 	inverse->dst.ip = orig->src.ip;
 	inverse->dst.protonum = orig->dst.protonum;
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+	inverse->vid = orig->vid;
+#endif
 	inverse->dst.dir = !orig->dst.dir;
 
 	return protocol->invert_tuple(inverse, orig);
@@ -779,11 +791,26 @@
 	struct ip_conntrack_tuple tuple;
 	struct ip_conntrack_tuple_hash *h;
 	struct ip_conntrack *ct;
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+	struct vlan_ethhdr *vhdr;
+	unsigned short vlan_TCI, vid = 8192;
+
+	vhdr = vlan_eth_hdr(skb);
+	if (vhdr && vhdr->h_vlan_proto == __constant_htons(ETH_P_8021Q)){
+		vlan_TCI = ntohs(vhdr->h_vlan_TCI);
+		vid = (vlan_TCI & VLAN_VID_MASK);
+	}
+#endif
 
 	IP_NF_ASSERT((skb->nh.iph->frag_off & htons(IP_OFFSET)) == 0);
 
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
 	if (!ip_ct_get_tuple(skb->nh.iph, skb, skb->nh.iph->ihl*4,
+				&tuple,proto,vid))
+#else
+	if (!ip_ct_get_tuple(skb->nh.iph, skb, skb->nh.iph->ihl*4,
 				&tuple,proto))
+#endif
 		return NULL;
 
 	/* look for tuple match */
--- linux-2.6.20.21/net/ipv4/netfilter/ip_conntrack_proto_icmp.c 2007-10-17 15:31:14.000000000 -0400
+++ linux-2.6.20.21.mod/net/ipv4/netfilter/ip_conntrack_proto_icmp.c 2009-11-02 11:01:58.000000000 -0500
@@ -20,6 +20,7 @@
 #include <linux/netfilter_ipv4/ip_conntrack.h>
 #include <linux/netfilter_ipv4/ip_conntrack_core.h>
 #include <linux/netfilter_ipv4/ip_conntrack_protocol.h>
+#include <linux/if_vlan.h>
 
 unsigned int ip_ct_icmp_timeout __read_mostly = 30*HZ;
 
@@ -146,6 +147,16 @@
 	struct ip_conntrack_protocol *innerproto;
 	struct ip_conntrack_tuple_hash *h;
 	int dataoff;
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+	struct vlan_ethhdr *vhdr;
+	unsigned short vlan_TCI, vid = 8192;
+
+	vhdr = vlan_eth_hdr(skb);
+	if (vhdr && vhdr->h_vlan_proto == __constant_htons(ETH_P_8021Q)){
+		vlan_TCI = ntohs(vhdr->h_vlan_TCI);
+		vid = (vlan_TCI & VLAN_VID_MASK);
+	}
+#endif
 
 	IP_NF_ASSERT(skb->nfct == NULL);
 
@@ -164,7 +175,11 @@
 	innerproto = ip_conntrack_proto_find_get(inside->ip.protocol);
 	dataoff = skb->nh.iph->ihl*4 + sizeof(inside->icmp) + inside->ip.ihl*4;
 	/* Are they talking about one of our connections? */
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+	if (!ip_ct_get_tuple(&inside->ip, skb, dataoff, &origtuple, innerproto, vid)) {
+#else
 	if (!ip_ct_get_tuple(&inside->ip, skb, dataoff, &origtuple, innerproto)) {
+#endif
 		DEBUGP("icmp_error: ! get_tuple p=%u", inside->ip.protocol);
 		ip_conntrack_proto_put(innerproto);
 		return -NF_ACCEPT;
--- linux-2.6.20.21/net/ipv4/netfilter/ip_nat_core.c 2009-11-02 11:26:07.000000000 -0500
+++ linux-2.6.20.21.mod/net/ipv4/netfilter/ip_nat_core.c 2009-11-02 11:23:50.000000000 -0500
@@ -23,6 +23,7 @@
 #include <linux/icmp.h>
 #include <linux/udp.h>
 #include <linux/jhash.h>
+#include <linux/if_vlan.h>
 
 #include <linux/netfilter_ipv4/ip_conntrack.h>
 #include <linux/netfilter_ipv4/ip_conntrack_core.h>
@@ -821,6 +822,17 @@
 	unsigned long statusbit;
 	enum ip_nat_manip_type manip = HOOK2MANIP(hooknum);
 
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+	struct vlan_ethhdr *vhdr;
+	unsigned short vlan_TCI, vid = 8192;
+
+	vhdr = vlan_eth_hdr(*pskb);
+	if (vhdr && vhdr->h_vlan_proto == __constant_htons(ETH_P_8021Q)){
+		vlan_TCI = ntohs(vhdr->h_vlan_TCI);
+		vid = (vlan_TCI & VLAN_VID_MASK);
+	}
+#endif
+
 	if (!skb_make_writable(pskb, hdrlen + sizeof(*inside)))
 		return 0;
 
@@ -853,7 +865,11 @@
 	if (!ip_ct_get_tuple(&inside->ip, *pskb, (*pskb)->nh.iph->ihl*4 +
 			     sizeof(struct icmphdr) + inside->ip.ihl*4,
 			     &inner,
+#ifdef CONFIG_IP_NF_CONNTRACK_VLAN
+			     __ip_conntrack_proto_find(inside->ip.protocol), vid))
+#else
 			     __ip_conntrack_proto_find(inside->ip.protocol)))
+#endif
 		return 0;
 
 	/* Change inner back to look like incoming packet. We do the
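The patch repeats the same VLAN ID extraction in each function it touches: check for the 802.1Q ethertype, read the TCI, and mask off the low 12 bits, falling back to 8192 for untagged frames. A minimal userspace sketch of that step is shown here, assuming a raw Ethernet frame in a byte buffer. The names `frame_vid`, `read_be16` and the `EX_` constants are hypothetical stand-ins for illustration; they mirror, but are not, the kernel's `vlan_eth_hdr()`, `ETH_P_8021Q` and `VLAN_VID_MASK`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_ETH_ALEN      6        /* length of one MAC address */
#define EX_ETH_P_8021Q   0x8100u  /* 802.1Q tag protocol identifier */
#define EX_VLAN_VID_MASK 0x0fffu  /* VLAN ID is the low 12 bits of the TCI */
#define EX_VID_UNTAGGED  8192u    /* sentinel from the patch; outside 0..4095 */

/* Read a 16-bit big-endian (network order) value from a byte buffer. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the VLAN ID of a raw Ethernet frame, or EX_VID_UNTAGGED when
 * the frame is too short or carries no 802.1Q tag.
 */
static uint16_t frame_vid(const uint8_t *frame, size_t len)
{
	const size_t tpid_off = 2 * EX_ETH_ALEN;  /* after dst and src MAC */

	assert(frame != NULL);
	if (len < tpid_off + 4)                   /* need TPID + TCI */
		return EX_VID_UNTAGGED;
	if (read_be16(frame + tpid_off) != EX_ETH_P_8021Q)
		return EX_VID_UNTAGGED;
	/* Drop the priority (PCP) and DEI bits, keep the 12-bit VLAN ID. */
	return (uint16_t)(read_be16(frame + tpid_off + 2) & EX_VLAN_VID_MASK);
}
```

Because valid VLAN IDs fit in 12 bits, any value above 4095 can serve as the "no tag" marker, which is presumably why the patch initialises `vid` to 8192.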
* Re: Connection tracking and vlan
From: Patrick McHardy @ 2009-11-02 16:33 UTC
To: Adayadil Thomas
Cc: Ben Greear, Eric W. Biederman, Eric Dumazet, Herbert Xu, Linux Netdev List, Netfilter Development Mailinglist

Adayadil Thomas wrote:
> If the vlan id is used for hash, it still may not avoid the problem completely,
> i.e. in case of both connections hashing to the same bucket.
>
> I was wondering about your opinion about adding an optional member to the tuple
> structure, vid (for vlan id).
>
> I have attached the patch for this change. I would be grateful for any comments
> such as dependencies on the rest of the system.

Absolutely not. Conntrack is not meant to deal with anything below the
network layer, and I don't want to add any hacks for the bridge netfilter
"integration", which has already caused an endless amount of problems.
Additionally, this is just one of many possible identifiers people might
want to use to distinguish similar entries, and it has a number of practical
issues, like breaking asymmetric setups that use different VLANs for each
direction.

I might be willing to consider a generically usable numerical identifier
to distinguish similar entries, something like "conntrack zones". This could
also help with the defragmentation issue discussed earlier: the identifier
would also be added to the defragmentation identifier, and for asymmetric
setups the interfaces would be put in the same "zone".

But it would be preferable if we could do this using network namespaces
somehow.
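To make the "zone" idea concrete, here is a rough sketch of how a generic numerical identifier could separate otherwise identical tuples while still matching replies in an asymmetric setup. Everything in it is hypothetical: `struct ex_tuple`, `ex_zone_for_vid` and the other `ex_` names are simplified illustrations, not the kernel's `struct ip_conntrack_tuple` or any existing API. The toy bucket function deliberately ignores the zone, to show that the equality test is what separates entries sharing a bucket.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified tuple with a policy-assigned zone. */
struct ex_tuple {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
	uint8_t  protonum;
	uint16_t zone;
};

/* Hypothetical policy table mapping a VLAN ID to a zone. */
struct ex_zone_map { uint16_t vid; uint16_t zone; };

static uint16_t ex_zone_for_vid(const struct ex_zone_map *map, size_t n,
				uint16_t vid)
{
	for (size_t i = 0; i < n; i++)
		if (map[i].vid == vid)
			return map[i].zone;
	return 0;	/* default zone for unmapped or untagged traffic */
}

/* Toy bucket selection over the 5-tuple only; the zone is not hashed. */
static uint32_t ex_tuple_bucket(const struct ex_tuple *t, uint32_t nbuckets)
{
	uint32_t h = t->src_ip ^ t->dst_ip;

	assert(nbuckets != 0);
	h ^= ((uint32_t)t->src_port << 16) | t->dst_port;
	h ^= t->protonum;
	return h % nbuckets;
}

/* Two entries are the same connection only if the zone matches too. */
static bool ex_tuple_equal(const struct ex_tuple *a, const struct ex_tuple *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port &&
	       a->protonum == b->protonum && a->zone == b->zone;
}

/* Reply direction: swap addresses and ports, keep the zone. */
static struct ex_tuple ex_tuple_invert(const struct ex_tuple *t)
{
	struct ex_tuple inv = *t;

	inv.src_ip = t->dst_ip;
	inv.dst_ip = t->src_ip;
	inv.src_port = t->dst_port;
	inv.dst_port = t->src_port;
	return inv;
}
```

With a map such as `{100 -> 1, 101 -> 1, 200 -> 2}`, a flow seen on VLAN 100 and its reply on VLAN 101 land in the same zone and match after inversion, whereas keying on the raw VLAN ID would treat them as unrelated. The same 5-tuple on VLAN 200 stays a separate entry even when it falls into the same bucket.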
* Re: Connection tracking and vlan
From: Eric Dumazet @ 2009-11-02 16:41 UTC
To: Patrick McHardy
Cc: Adayadil Thomas, Ben Greear, Eric W. Biederman, Herbert Xu, Linux Netdev List, Netfilter Development Mailinglist

Patrick McHardy wrote:
>
> But it would be preferable if we could do this using network
> namespaces somehow.

E.g. eth0.100 and eth0.101 in namespace 1, eth0.200 and eth0.201 in namespace 2?

Can we do that with the current kernel (different VLANs on a single physical device)?
* Re: Connection tracking and vlan
From: Patrick McHardy @ 2009-11-02 16:48 UTC
To: Eric Dumazet
Cc: Adayadil Thomas, Ben Greear, Eric W. Biederman, Herbert Xu, Linux Netdev List, Netfilter Development Mailinglist

Eric Dumazet wrote:
> Patrick McHardy wrote:
>> But it would be preferable if we could do this using network
>> namespaces somehow.
>
> E.g. eth0.100 and eth0.101 in namespace 1, eth0.200 and eth0.201 in namespace 2?
>
> Can we do that with the current kernel (different VLANs on a single physical device)?

By default the underlying device needs to exist in the same namespace
in which the VLAN device is created. I believe it should be possible
to move the VLAN device to a different namespace after creating it,
but I'm not sure about that.
* Re: Connection tracking and vlan
From: Eric W. Biederman @ 2009-11-02 17:02 UTC
To: Patrick McHardy
Cc: Eric Dumazet, Adayadil Thomas, Ben Greear, Herbert Xu, Linux Netdev List, Netfilter Development Mailinglist

Patrick McHardy <kaber@trash.net> writes:
> Eric Dumazet wrote:
>> Patrick McHardy wrote:
>>> But it would be preferable if we could do this using network
>>> namespaces somehow.
>>
>> E.g. eth0.100 and eth0.101 in namespace 1, eth0.200 and eth0.201 in namespace 2?
>>
>> Can we do that with the current kernel (different VLANs on a single physical device)?
>
> By default the underlying device needs to exist in the same namespace
> in which the VLAN device is created. I believe it should be possible
> to move the VLAN device to a different namespace after creating it,
> but I'm not sure about that.

There should be no problem moving the VLAN device after it is created.

Eric
Thread overview: 19+ messages (end of thread, newest 2009-11-02 17:02 UTC)
2009-10-29 15:43 Connection tracking and vlan Adayadil Thomas
2009-10-30 15:20 ` Herbert Xu
2009-10-30 15:31 ` Eric Dumazet
2009-10-30 15:46 ` Herbert Xu
2009-10-30 16:19 ` Eric Dumazet
2009-10-30 16:27 ` Patrick McHardy
2009-10-30 16:55 ` Herbert Xu
2009-10-30 16:26 ` Patrick McHardy
2009-10-30 19:20 ` Adayadil Thomas
2009-10-30 19:51 ` Caitlin Bestler
2009-10-30 20:40 ` Adayadil Thomas
2009-10-30 23:15 ` Eric W. Biederman
2009-10-30 23:25 ` Ben Greear
2009-11-02 16:14 ` Adayadil Thomas
2009-11-02 16:30 ` Adayadil Thomas
2009-11-02 16:33 ` Patrick McHardy
2009-11-02 16:41 ` Eric Dumazet
2009-11-02 16:48 ` Patrick McHardy
2009-11-02 17:02 ` Eric W. Biederman