netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] bonding: L2L3 xmit doesn't support IPv6
@ 2011-10-08  5:36 Yinglin Sun
  2011-10-10 19:35 ` Yinglin Sun
  2011-10-11 14:33 ` Andy Gospodarek
  0 siblings, 2 replies; 10+ messages in thread
From: Yinglin Sun @ 2011-10-08  5:36 UTC (permalink / raw)
  To: Jay Vosburgh, Andy Gospodarek; +Cc: netdev, Yinglin Sun

Add IPv6 support in L2L3 xmit policy.
L3L4 doesn't support IPv6 either, and I'll try to fix that later.

Signed-off-by: Yinglin Sun <Yinglin.Sun@emc.com>
---
 drivers/net/bonding/bond_main.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 6d79b78..d6fd282 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -41,8 +41,10 @@
 #include <linux/ptrace.h>
 #include <linux/ioport.h>
 #include <linux/in.h>
+#include <linux/in6.h>
 #include <net/ip.h>
 #include <linux/ip.h>
+#include <linux/ipv6.h>
 #include <linux/tcp.h>
 #include <linux/udp.h>
 #include <linux/slab.h>
@@ -3372,10 +3374,15 @@ static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
 {
 	struct ethhdr *data = (struct ethhdr *)skb->data;
 	struct iphdr *iph = ip_hdr(skb);
+	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
 
 	if (skb->protocol == htons(ETH_P_IP)) {
 		return ((ntohl(iph->saddr ^ iph->daddr) & 0xffff) ^
 			(data->h_dest[5] ^ data->h_source[5])) % count;
+	} else if (skb->protocol == htons(ETH_P_IPV6)) {
+		return ((ntohl(ipv6h->saddr.s6_addr32[3] ^
+			       ipv6h->daddr.s6_addr32[3]) & 0xffff) ^
+			(data->h_dest[5] ^ data->h_source[5])) % count;
 	}
 
 	return (data->h_dest[5] ^ data->h_source[5]) % count;
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] bonding: L2L3 xmit doesn't support IPv6
  2011-10-08  5:36 [PATCH] bonding: L2L3 xmit doesn't support IPv6 Yinglin Sun
@ 2011-10-10 19:35 ` Yinglin Sun
  2011-10-11 14:33 ` Andy Gospodarek
  1 sibling, 0 replies; 10+ messages in thread
From: Yinglin Sun @ 2011-10-10 19:35 UTC (permalink / raw)
  To: Jay Vosburgh, Andy Gospodarek; +Cc: netdev

On Fri, Oct 7, 2011 at 10:36 PM, Yinglin Sun <Yinglin.Sun@emc.com> wrote:
> Add IPv6 support in L2L3 xmit policy.
> L3L4 doesn't support IPv6 either, and I'll try to fix that later.
>
> Signed-off-by: Yinglin Sun <Yinglin.Sun@emc.com>
> ---
>  drivers/net/bonding/bond_main.c |    7 +++++++
>  1 files changed, 7 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 6d79b78..d6fd282 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -41,8 +41,10 @@
>  #include <linux/ptrace.h>
>  #include <linux/ioport.h>
>  #include <linux/in.h>
> +#include <linux/in6.h>
>  #include <net/ip.h>
>  #include <linux/ip.h>
> +#include <linux/ipv6.h>
>  #include <linux/tcp.h>
>  #include <linux/udp.h>
>  #include <linux/slab.h>
> @@ -3372,10 +3374,15 @@ static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
>  {
>        struct ethhdr *data = (struct ethhdr *)skb->data;
>        struct iphdr *iph = ip_hdr(skb);
> +       struct ipv6hdr *ipv6h = ipv6_hdr(skb);
>
>        if (skb->protocol == htons(ETH_P_IP)) {
>                return ((ntohl(iph->saddr ^ iph->daddr) & 0xffff) ^
>                        (data->h_dest[5] ^ data->h_source[5])) % count;
> +       } else if (skb->protocol == htons(ETH_P_IPV6)) {
> +               return ((ntohl(ipv6h->saddr.s6_addr32[3] ^
> +                              ipv6h->daddr.s6_addr32[3]) & 0xffff) ^
> +                       (data->h_dest[5] ^ data->h_source[5])) % count;
>        }
>
>        return (data->h_dest[5] ^ data->h_source[5]) % count;
> --
> 1.7.4.1
>
>

Any comment on this patch? Thanks.

Yinglin

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] bonding: L2L3 xmit doesn't support IPv6
  2011-10-08  5:36 [PATCH] bonding: L2L3 xmit doesn't support IPv6 Yinglin Sun
  2011-10-10 19:35 ` Yinglin Sun
@ 2011-10-11 14:33 ` Andy Gospodarek
  2011-10-11 15:58   ` Jay Vosburgh
  2011-10-12  3:27   ` Yinglin Sun
  1 sibling, 2 replies; 10+ messages in thread
From: Andy Gospodarek @ 2011-10-11 14:33 UTC (permalink / raw)
  To: Yinglin Sun; +Cc: Jay Vosburgh, Andy Gospodarek, netdev

On Fri, Oct 07, 2011 at 10:36:45PM -0700, Yinglin Sun wrote:
> Add IPv6 support in L2L3 xmit policy.
> L3L4 doesn't support IPv6 either, and I'll try to fix that later.
> 
> Signed-off-by: Yinglin Sun <Yinglin.Sun@emc.com>
> ---
>  drivers/net/bonding/bond_main.c |    7 +++++++
>  1 files changed, 7 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 6d79b78..d6fd282 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -41,8 +41,10 @@
>  #include <linux/ptrace.h>
>  #include <linux/ioport.h>
>  #include <linux/in.h>
> +#include <linux/in6.h>
>  #include <net/ip.h>
>  #include <linux/ip.h>
> +#include <linux/ipv6.h>
>  #include <linux/tcp.h>
>  #include <linux/udp.h>
>  #include <linux/slab.h>
> @@ -3372,10 +3374,15 @@ static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
>  {
>  	struct ethhdr *data = (struct ethhdr *)skb->data;
>  	struct iphdr *iph = ip_hdr(skb);
> +	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
>  
>  	if (skb->protocol == htons(ETH_P_IP)) {
>  		return ((ntohl(iph->saddr ^ iph->daddr) & 0xffff) ^
>  			(data->h_dest[5] ^ data->h_source[5])) % count;
> +	} else if (skb->protocol == htons(ETH_P_IPV6)) {
> +		return ((ntohl(ipv6h->saddr.s6_addr32[3] ^
> +			       ipv6h->daddr.s6_addr32[3]) & 0xffff) ^
> +			(data->h_dest[5] ^ data->h_source[5])) % count;
>  	}
>  

There have been some attempts to add support for ipv6 hashing this in
the past, but none have been committed.  The best one I had seen was one
that did some extensive testing one a wide variety of ipv6 traffic and
it showed nice traffic distribution.  I'm not sure if it was ever posted
upstream, so I will see if I can dig it up.

Can you quantify how traffic was distributed with this algorithm?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] bonding: L2L3 xmit doesn't support IPv6
  2011-10-11 14:33 ` Andy Gospodarek
@ 2011-10-11 15:58   ` Jay Vosburgh
  2011-10-12  2:51     ` Andy Gospodarek
  2011-10-12  3:30     ` Yinglin Sun
  2011-10-12  3:27   ` Yinglin Sun
  1 sibling, 2 replies; 10+ messages in thread
From: Jay Vosburgh @ 2011-10-11 15:58 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: Yinglin Sun, netdev

Andy Gospodarek <andy@greyhouse.net> wrote:

>On Fri, Oct 07, 2011 at 10:36:45PM -0700, Yinglin Sun wrote:
>> Add IPv6 support in L2L3 xmit policy.
>> L3L4 doesn't support IPv6 either, and I'll try to fix that later.
>> 
>> Signed-off-by: Yinglin Sun <Yinglin.Sun@emc.com>
>> ---
>>  drivers/net/bonding/bond_main.c |    7 +++++++
>>  1 files changed, 7 insertions(+), 0 deletions(-)
>> 
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 6d79b78..d6fd282 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -41,8 +41,10 @@
>>  #include <linux/ptrace.h>
>>  #include <linux/ioport.h>
>>  #include <linux/in.h>
>> +#include <linux/in6.h>
>>  #include <net/ip.h>
>>  #include <linux/ip.h>
>> +#include <linux/ipv6.h>
>>  #include <linux/tcp.h>
>>  #include <linux/udp.h>
>>  #include <linux/slab.h>
>> @@ -3372,10 +3374,15 @@ static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
>>  {
>>  	struct ethhdr *data = (struct ethhdr *)skb->data;
>>  	struct iphdr *iph = ip_hdr(skb);
>> +	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
>>  
>>  	if (skb->protocol == htons(ETH_P_IP)) {
>>  		return ((ntohl(iph->saddr ^ iph->daddr) & 0xffff) ^
>>  			(data->h_dest[5] ^ data->h_source[5])) % count;
>> +	} else if (skb->protocol == htons(ETH_P_IPV6)) {
>> +		return ((ntohl(ipv6h->saddr.s6_addr32[3] ^
>> +			       ipv6h->daddr.s6_addr32[3]) & 0xffff) ^
>> +			(data->h_dest[5] ^ data->h_source[5])) % count;
>>  	}
>>  
>
>There have been some attempts to add support for ipv6 hashing this in
>the past, but none have been committed.  The best one I had seen was one
>that did some extensive testing one a wide variety of ipv6 traffic and
>it showed nice traffic distribution.  I'm not sure if it was ever posted
>upstream, so I will see if I can dig it up.
>
>Can you quantify how traffic was distributed with this algorithm?

	As I recall, the IPv6 issues had to do with the "layer3+4" hash,
because the IPv6 TCP or UDP port numbers can be harder to get at than in
IPv4 (which typically has a fixed size header).  The above is just for
layer 2, so it only hits the IPv6 addresses, which don't move around.

	That said, I believe that many IPv6 addresses are derived from
the MAC address, the autoconf addresses in particular, so s6_addr32[3]
may not show a lot more variation than just the MAC address.  I don't
know for sure though, since I haven't tested it.

	I don't recall seeing the patch you mention, Andy, that checks
ipv6 traffic; can you post it?

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] bonding: L2L3 xmit doesn't support IPv6
  2011-10-11 15:58   ` Jay Vosburgh
@ 2011-10-12  2:51     ` Andy Gospodarek
  2011-10-12  3:39       ` Yinglin Sun
  2011-10-12  3:30     ` Yinglin Sun
  1 sibling, 1 reply; 10+ messages in thread
From: Andy Gospodarek @ 2011-10-12  2:51 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Andy Gospodarek, Yinglin Sun, netdev, John Eaglesham

On Tue, Oct 11, 2011 at 08:58:59AM -0700, Jay Vosburgh wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
[...]
> >
> >There have been some attempts to add support for ipv6 hashing this in
> >the past, but none have been committed.  The best one I had seen was one
> >that did some extensive testing one a wide variety of ipv6 traffic and
> >it showed nice traffic distribution.  I'm not sure if it was ever posted
> >upstream, so I will see if I can dig it up.
> >
> >Can you quantify how traffic was distributed with this algorithm?
> 
> 	As I recall, the IPv6 issues had to do with the "layer3+4" hash,
> because the IPv6 TCP or UDP port numbers can be harder to get at than in
> IPv4 (which typically has a fixed size header).  The above is just for
> layer 2, so it only hits the IPv6 addresses, which don't move around.
> 
> 	That said, I believe that many IPv6 addresses are derived from
> the MAC address, the autoconf addresses in particular, so s6_addr32[3]
> may not show a lot more variation than just the MAC address.  I don't
> know for sure though, since I haven't tested it.
> 
> 	I don't recall seeing the patch you mention, Andy, that checks
> ipv6 traffic; can you post it?
> 

I found the patch, cleaned it up, and compile tested it against
net-next.  I traded some emails with John Eaglesham (cc'd) earlier this
year and though he planned to post it, I never followed up.

His comments about this patch were as follows:

"I've attached my patch for IPv6 transmit hashing for the nic bonding
driver.

"The algorithm I chose is based on 273,913 IPv6 client addresses I
gathered from webservers and ran through a test program that implemented
several algorithms. This algorithm provided the most even distribution
while using the fewest instructions.

"I've tested this on 2.6.39-rc4 and a similar patch to 2.6.18 (from
RHEL5 5.4.3) and it has performed as expected in both cases.

"Please let me know if you have any comments, otherwise I suppose the
next step is to propose the patch to LKML."

I would suggest we use this.  John or I could write an official
changelog and post this in it's own thread if it looks good to others.

---
 drivers/net/bonding/bond_main.c |   30 +++++++++++++++++++++++++-----
 1 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 6191e63..335cb67 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3368,11 +3368,20 @@ static struct notifier_block bond_inetaddr_notifier = {
 static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
 {
 	struct ethhdr *data = (struct ethhdr *)skb->data;
-	struct iphdr *iph = ip_hdr(skb);
 
 	if (skb->protocol == htons(ETH_P_IP)) {
+		struct iphdr *iph = ip_hdr(skb);
 		return ((ntohl(iph->saddr ^ iph->daddr) & 0xffff) ^
 			(data->h_dest[5] ^ data->h_source[5])) % count;
+	} else if (skb->protocol == htons(ETH_P_IPV6)) {
+		struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+		u32 v6hash = (
+			(ipv6h->saddr.s6_addr32[1] ^ ipv6h->daddr.s6_addr32[1]) ^
+			(ipv6h->saddr.s6_addr32[2] ^ ipv6h->daddr.s6_addr32[2]) ^
+			(ipv6h->saddr.s6_addr32[3] ^ ipv6h->daddr.s6_addr32[3])
+		);
+		v6hash = (v6hash >> 16) ^ (v6hash >> 8) ^ v6hash;
+		return (v6hash ^ data->h_dest[5] ^ data->h_source[5]) % count;
 	}
 
 	return (data->h_dest[5] ^ data->h_source[5]) % count;
@@ -3386,11 +3395,11 @@ static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
 static int bond_xmit_hash_policy_l34(struct sk_buff *skb, int count)
 {
 	struct ethhdr *data = (struct ethhdr *)skb->data;
-	struct iphdr *iph = ip_hdr(skb);
-	__be16 *layer4hdr = (__be16 *)((u32 *)iph + iph->ihl);
-	int layer4_xor = 0;
+	u32 layer4_xor = 0;
 
 	if (skb->protocol == htons(ETH_P_IP)) {
+		struct iphdr *iph = ip_hdr(skb);
+		__be16 *layer4hdr = (__be16 *)((u32 *)iph + iph->ihl);
 		if (!ip_is_fragment(iph) &&
 		    (iph->protocol == IPPROTO_TCP ||
 		     iph->protocol == IPPROTO_UDP)) {
@@ -3398,7 +3407,18 @@ static int bond_xmit_hash_policy_l34(struct sk_buff *skb, int count)
 		}
 		return (layer4_xor ^
 			((ntohl(iph->saddr ^ iph->daddr)) & 0xffff)) % count;
-
+	} else if (skb->protocol == htons(ETH_P_IPV6)) {
+		struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+		__be16 *layer4hdrv6 = (__be16 *)((u8 *)ipv6h + sizeof(*ipv6h));
+		if (ipv6h->nexthdr == IPPROTO_TCP || ipv6h->nexthdr == IPPROTO_UDP) {
+			layer4_xor = (*layer4hdrv6 ^ *(layer4hdrv6 + 1));
+		}
+		layer4_xor ^= (
+			(ipv6h->saddr.s6_addr32[1] ^ ipv6h->daddr.s6_addr32[1]) ^
+			(ipv6h->saddr.s6_addr32[2] ^ ipv6h->daddr.s6_addr32[2]) ^
+			(ipv6h->saddr.s6_addr32[3] ^ ipv6h->daddr.s6_addr32[3])
+		);
+		return ((layer4_xor >> 16) ^ (layer4_xor >> 8) ^ layer4_xor) % count;
 	}
 
 	return (data->h_dest[5] ^ data->h_source[5]) % count;

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] bonding: L2L3 xmit doesn't support IPv6
  2011-10-11 14:33 ` Andy Gospodarek
  2011-10-11 15:58   ` Jay Vosburgh
@ 2011-10-12  3:27   ` Yinglin Sun
  1 sibling, 0 replies; 10+ messages in thread
From: Yinglin Sun @ 2011-10-12  3:27 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: Jay Vosburgh, netdev

On Tue, Oct 11, 2011 at 7:33 AM, Andy Gospodarek <andy@greyhouse.net> wrote:
> On Fri, Oct 07, 2011 at 10:36:45PM -0700, Yinglin Sun wrote:
>> Add IPv6 support in L2L3 xmit policy.
>> L3L4 doesn't support IPv6 either, and I'll try to fix that later.
>>
>> Signed-off-by: Yinglin Sun <Yinglin.Sun@emc.com>
>> ---
>>  drivers/net/bonding/bond_main.c |    7 +++++++
>>  1 files changed, 7 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 6d79b78..d6fd282 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -41,8 +41,10 @@
>>  #include <linux/ptrace.h>
>>  #include <linux/ioport.h>
>>  #include <linux/in.h>
>> +#include <linux/in6.h>
>>  #include <net/ip.h>
>>  #include <linux/ip.h>
>> +#include <linux/ipv6.h>
>>  #include <linux/tcp.h>
>>  #include <linux/udp.h>
>>  #include <linux/slab.h>
>> @@ -3372,10 +3374,15 @@ static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
>>  {
>>       struct ethhdr *data = (struct ethhdr *)skb->data;
>>       struct iphdr *iph = ip_hdr(skb);
>> +     struct ipv6hdr *ipv6h = ipv6_hdr(skb);
>>
>>       if (skb->protocol == htons(ETH_P_IP)) {
>>               return ((ntohl(iph->saddr ^ iph->daddr) & 0xffff) ^
>>                       (data->h_dest[5] ^ data->h_source[5])) % count;
>> +     } else if (skb->protocol == htons(ETH_P_IPV6)) {
>> +             return ((ntohl(ipv6h->saddr.s6_addr32[3] ^
>> +                            ipv6h->daddr.s6_addr32[3]) & 0xffff) ^
>> +                     (data->h_dest[5] ^ data->h_source[5])) % count;
>>       }
>>
>
> There have been some attempts to add support for ipv6 hashing this in
> the past, but none have been committed.  The best one I had seen was one
> that did some extensive testing one a wide variety of ipv6 traffic and
> it showed nice traffic distribution.  I'm not sure if it was ever posted
> upstream, so I will see if I can dig it up.
>
> Can you quantify how traffic was distributed with this algorithm?
>

My test was not extensive. I manually set some addresses which vary by
the last 32bit portion, and the traffic was distributed evenly across
slaves.

I agree that the real world IPv6 traffic should be used for more extensive test.

Yinglin

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] bonding: L2L3 xmit doesn't support IPv6
  2011-10-11 15:58   ` Jay Vosburgh
  2011-10-12  2:51     ` Andy Gospodarek
@ 2011-10-12  3:30     ` Yinglin Sun
  1 sibling, 0 replies; 10+ messages in thread
From: Yinglin Sun @ 2011-10-12  3:30 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Andy Gospodarek, netdev

On Tue, Oct 11, 2011 at 8:58 AM, Jay Vosburgh <fubar@us.ibm.com> wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
>
>>On Fri, Oct 07, 2011 at 10:36:45PM -0700, Yinglin Sun wrote:
>>> Add IPv6 support in L2L3 xmit policy.
>>> L3L4 doesn't support IPv6 either, and I'll try to fix that later.
>>>
>>> Signed-off-by: Yinglin Sun <Yinglin.Sun@emc.com>
>>> ---
>>>  drivers/net/bonding/bond_main.c |    7 +++++++
>>>  1 files changed, 7 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>> index 6d79b78..d6fd282 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -41,8 +41,10 @@
>>>  #include <linux/ptrace.h>
>>>  #include <linux/ioport.h>
>>>  #include <linux/in.h>
>>> +#include <linux/in6.h>
>>>  #include <net/ip.h>
>>>  #include <linux/ip.h>
>>> +#include <linux/ipv6.h>
>>>  #include <linux/tcp.h>
>>>  #include <linux/udp.h>
>>>  #include <linux/slab.h>
>>> @@ -3372,10 +3374,15 @@ static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
>>>  {
>>>      struct ethhdr *data = (struct ethhdr *)skb->data;
>>>      struct iphdr *iph = ip_hdr(skb);
>>> +    struct ipv6hdr *ipv6h = ipv6_hdr(skb);
>>>
>>>      if (skb->protocol == htons(ETH_P_IP)) {
>>>              return ((ntohl(iph->saddr ^ iph->daddr) & 0xffff) ^
>>>                      (data->h_dest[5] ^ data->h_source[5])) % count;
>>> +    } else if (skb->protocol == htons(ETH_P_IPV6)) {
>>> +            return ((ntohl(ipv6h->saddr.s6_addr32[3] ^
>>> +                           ipv6h->daddr.s6_addr32[3]) & 0xffff) ^
>>> +                    (data->h_dest[5] ^ data->h_source[5])) % count;
>>>      }
>>>
>>
>>There have been some attempts to add support for ipv6 hashing this in
>>the past, but none have been committed.  The best one I had seen was one
>>that did some extensive testing one a wide variety of ipv6 traffic and
>>it showed nice traffic distribution.  I'm not sure if it was ever posted
>>upstream, so I will see if I can dig it up.
>>
>>Can you quantify how traffic was distributed with this algorithm?
>
>        As I recall, the IPv6 issues had to do with the "layer3+4" hash,
> because the IPv6 TCP or UDP port numbers can be harder to get at than in
> IPv4 (which typically has a fixed size header).  The above is just for
> layer 2, so it only hits the IPv6 addresses, which don't move around.
>
>        That said, I believe that many IPv6 addresses are derived from
> the MAC address, the autoconf addresses in particular, so s6_addr32[3]
> may not show a lot more variation than just the MAC address.  I don't
> know for sure though, since I haven't tested it.
>

This is a good point. The last 32bit portion is not enough for some cases.

Yinglin

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] bonding: L2L3 xmit doesn't support IPv6
  2011-10-12  2:51     ` Andy Gospodarek
@ 2011-10-12  3:39       ` Yinglin Sun
  2011-10-12  4:06         ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Yinglin Sun @ 2011-10-12  3:39 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: Jay Vosburgh, netdev, John Eaglesham

On Tue, Oct 11, 2011 at 7:51 PM, Andy Gospodarek <andy@greyhouse.net> wrote:
> On Tue, Oct 11, 2011 at 08:58:59AM -0700, Jay Vosburgh wrote:
>> Andy Gospodarek <andy@greyhouse.net> wrote:
> [...]
>> >
>> >There have been some attempts to add support for ipv6 hashing this in
>> >the past, but none have been committed.  The best one I had seen was one
>> >that did some extensive testing one a wide variety of ipv6 traffic and
>> >it showed nice traffic distribution.  I'm not sure if it was ever posted
>> >upstream, so I will see if I can dig it up.
>> >
>> >Can you quantify how traffic was distributed with this algorithm?
>>
>>       As I recall, the IPv6 issues had to do with the "layer3+4" hash,
>> because the IPv6 TCP or UDP port numbers can be harder to get at than in
>> IPv4 (which typically has a fixed size header).  The above is just for
>> layer 2, so it only hits the IPv6 addresses, which don't move around.
>>
>>       That said, I believe that many IPv6 addresses are derived from
>> the MAC address, the autoconf addresses in particular, so s6_addr32[3]
>> may not show a lot more variation than just the MAC address.  I don't
>> know for sure though, since I haven't tested it.
>>
>>       I don't recall seeing the patch you mention, Andy, that checks
>> ipv6 traffic; can you post it?
>>
>
> I found the patch, cleaned it up, and compile tested it against
> net-next.  I traded some emails with John Eaglesham (cc'd) earlier this
> year and though he planned to post it, I never followed up.
>
> His comments about this patch were as follows:
>
> "I've attached my patch for IPv6 transmit hashing for the nic bonding
> driver.
>
> "The algorithm I chose is based on 273,913 IPv6 client addresses I
> gathered from webservers and ran through a test program that implemented
> several algorithms. This algorithm provided the most even distribution
> while using the fewest instructions.
>
> "I've tested this on 2.6.39-rc4 and a similar patch to 2.6.18 (from
> RHEL5 5.4.3) and it has performed as expected in both cases.
>
> "Please let me know if you have any comments, otherwise I suppose the
> next step is to propose the patch to LKML."
>
> I would suggest we use this.  John or I could write an official
> changelog and post this in it's own thread if it looks good to others.
>
> ---
>  drivers/net/bonding/bond_main.c |   30 +++++++++++++++++++++++++-----
>  1 files changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 6191e63..335cb67 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -3368,11 +3368,20 @@ static struct notifier_block bond_inetaddr_notifier = {
>  static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
>  {
>        struct ethhdr *data = (struct ethhdr *)skb->data;
> -       struct iphdr *iph = ip_hdr(skb);
>
>        if (skb->protocol == htons(ETH_P_IP)) {
> +               struct iphdr *iph = ip_hdr(skb);
>                return ((ntohl(iph->saddr ^ iph->daddr) & 0xffff) ^
>                        (data->h_dest[5] ^ data->h_source[5])) % count;
> +       } else if (skb->protocol == htons(ETH_P_IPV6)) {
> +               struct ipv6hdr *ipv6h = ipv6_hdr(skb);
> +               u32 v6hash = (
> +                       (ipv6h->saddr.s6_addr32[1] ^ ipv6h->daddr.s6_addr32[1]) ^
> +                       (ipv6h->saddr.s6_addr32[2] ^ ipv6h->daddr.s6_addr32[2]) ^
> +                       (ipv6h->saddr.s6_addr32[3] ^ ipv6h->daddr.s6_addr32[3])
> +               );
> +               v6hash = (v6hash >> 16) ^ (v6hash >> 8) ^ v6hash;
> +               return (v6hash ^ data->h_dest[5] ^ data->h_source[5]) % count;
>        }
>
>        return (data->h_dest[5] ^ data->h_source[5]) % count;
> @@ -3386,11 +3395,11 @@ static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
>  static int bond_xmit_hash_policy_l34(struct sk_buff *skb, int count)
>  {
>        struct ethhdr *data = (struct ethhdr *)skb->data;
> -       struct iphdr *iph = ip_hdr(skb);
> -       __be16 *layer4hdr = (__be16 *)((u32 *)iph + iph->ihl);
> -       int layer4_xor = 0;
> +       u32 layer4_xor = 0;
>
>        if (skb->protocol == htons(ETH_P_IP)) {
> +               struct iphdr *iph = ip_hdr(skb);
> +               __be16 *layer4hdr = (__be16 *)((u32 *)iph + iph->ihl);
>                if (!ip_is_fragment(iph) &&
>                    (iph->protocol == IPPROTO_TCP ||
>                     iph->protocol == IPPROTO_UDP)) {
> @@ -3398,7 +3407,18 @@ static int bond_xmit_hash_policy_l34(struct sk_buff *skb, int count)
>                }
>                return (layer4_xor ^
>                        ((ntohl(iph->saddr ^ iph->daddr)) & 0xffff)) % count;
> -
> +       } else if (skb->protocol == htons(ETH_P_IPV6)) {
> +               struct ipv6hdr *ipv6h = ipv6_hdr(skb);
> +               __be16 *layer4hdrv6 = (__be16 *)((u8 *)ipv6h + sizeof(*ipv6h));
> +               if (ipv6h->nexthdr == IPPROTO_TCP || ipv6h->nexthdr == IPPROTO_UDP) {

Does this work if this is a fragmentation packet? and if
ipv6h->nexthdr is not IPPROTO_TCP/IPPROTO_UDP, it doesn't mean this is
not TCP/UDP packet. We need to go through the extension header chain
and look at the last one. It's likely there are some other extension
headers before L4 header.

Yinglin

> +                       layer4_xor = (*layer4hdrv6 ^ *(layer4hdrv6 + 1));
> +               }
> +               layer4_xor ^= (
> +                       (ipv6h->saddr.s6_addr32[1] ^ ipv6h->daddr.s6_addr32[1]) ^
> +                       (ipv6h->saddr.s6_addr32[2] ^ ipv6h->daddr.s6_addr32[2]) ^
> +                       (ipv6h->saddr.s6_addr32[3] ^ ipv6h->daddr.s6_addr32[3])
> +               );
> +               return ((layer4_xor >> 16) ^ (layer4_xor >> 8) ^ layer4_xor) % count;
>        }
>
>        return (data->h_dest[5] ^ data->h_source[5]) % count;
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] bonding: L2L3 xmit doesn't support IPv6
  2011-10-12  3:39       ` Yinglin Sun
@ 2011-10-12  4:06         ` Eric Dumazet
  2011-10-12 21:15           ` John Eaglesham
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2011-10-12  4:06 UTC (permalink / raw)
  To: Yinglin Sun; +Cc: Andy Gospodarek, Jay Vosburgh, netdev, John Eaglesham

Le mardi 11 octobre 2011 à 20:39 -0700, Yinglin Sun a écrit :
> On Tue, Oct 11, 2011 at 7:51 PM, Andy Gospodarek <andy@greyhouse.net> wrote:
> >
> >        if (skb->protocol == htons(ETH_P_IP)) {
> > +               struct iphdr *iph = ip_hdr(skb);
> > +               __be16 *layer4hdr = (__be16 *)((u32 *)iph + iph->ihl);
> >                if (!ip_is_fragment(iph) &&
> >                    (iph->protocol == IPPROTO_TCP ||
> >                     iph->protocol == IPPROTO_UDP)) {
> > @@ -3398,7 +3407,18 @@ static int bond_xmit_hash_policy_l34(struct sk_buff *skb, int count)
> >                }
> >                return (layer4_xor ^
> >                        ((ntohl(iph->saddr ^ iph->daddr)) & 0xffff)) % count;
> > -
> > +       } else if (skb->protocol == htons(ETH_P_IPV6)) {
> > +               struct ipv6hdr *ipv6h = ipv6_hdr(skb);
> > +               __be16 *layer4hdrv6 = (__be16 *)((u8 *)ipv6h + sizeof(*ipv6h));
> > +               if (ipv6h->nexthdr == IPPROTO_TCP || ipv6h->nexthdr == IPPROTO_UDP) {
> 
> Does this work if this is a fragmentation packet? and if
> ipv6h->nexthdr is not IPPROTO_TCP/IPPROTO_UDP, it doesn't mean this is
> not TCP/UDP packet. We need to go through the extension header chain
> and look at the last one. It's likely there are some other extension
> headers before L4 header.
> 

Its a best effort.

If first header is not IPPROTO_TCP/UDP, I am not sure its wise to spend
time in hope to find layer4 info (missing anyway in fragments)

By the way I see no bound checking on SKB head : malicious packet could
make bond_xmit_hash_policy_l34() access unitialized memory.

We have same 'fastpath' in __skb_get_rxhash()

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] bonding: L2L3 xmit doesn't support IPv6
  2011-10-12  4:06         ` Eric Dumazet
@ 2011-10-12 21:15           ` John Eaglesham
  0 siblings, 0 replies; 10+ messages in thread
From: John Eaglesham @ 2011-10-12 21:15 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Yinglin Sun, Andy Gospodarek, Jay Vosburgh,
	netdev@vger.kernel.org, linux

Eric Dumazet wrote:
> Le mardi 11 octobre 2011 à 20:39 -0700, Yinglin Sun a écrit :
>> On Tue, Oct 11, 2011 at 7:51 PM, Andy Gospodarek <andy@greyhouse.net> wrote:
>>>        if (skb->protocol == htons(ETH_P_IP)) {
>>> +               struct iphdr *iph = ip_hdr(skb);
>>> +               __be16 *layer4hdr = (__be16 *)((u32 *)iph + iph->ihl);
>>>                if (!ip_is_fragment(iph) &&
>>>                    (iph->protocol == IPPROTO_TCP ||
>>>                     iph->protocol == IPPROTO_UDP)) {
>>> @@ -3398,7 +3407,18 @@ static int bond_xmit_hash_policy_l34(struct sk_buff *skb, int count)
>>>                }
>>>                return (layer4_xor ^
>>>                        ((ntohl(iph->saddr ^ iph->daddr)) & 0xffff)) % count;
>>> -
>>> +       } else if (skb->protocol == htons(ETH_P_IPV6)) {
>>> +               struct ipv6hdr *ipv6h = ipv6_hdr(skb);
>>> +               __be16 *layer4hdrv6 = (__be16 *)((u8 *)ipv6h + sizeof(*ipv6h));
>>> +               if (ipv6h->nexthdr == IPPROTO_TCP || ipv6h->nexthdr == IPPROTO_UDP) {
>> Does this work if this is a fragmentation packet? and if
>> ipv6h->nexthdr is not IPPROTO_TCP/IPPROTO_UDP, it doesn't mean this is
>> not TCP/UDP packet. We need to go through the extension header chain
>> and look at the last one. It's likely there are some other extension
>> headers before L4 header.
>>
> 
> Its a best effort.
> 
> If first header is not IPPROTO_TCP/UDP, I am not sure its wise to spend
> time in hope to find layer4 info (missing anyway in fragments)
> 
> By the way I see no bound checking on SKB head : malicious packet could
> make bond_xmit_hash_policy_l34() access unitialized memory.
> 
> We have same 'fastpath' in __skb_get_rxhash()
> 
> 

Thanks for keeping me in the loop on this.

I caught the bounds checking bug and added checks to code I haven't 
submitted in yet (I still need to test that it works before I submit a 
patch).

I don't like the idea of sticking a loop in this part of the code, 
especially one traversing a list of length controlled by outside users 
with no guarantee it is properly formed. I ran some tests gathering 
traffic and I never saw any packets with headers between the IPv6 and 
TCP or UDP, so at least in the real world right now this code should 
behave as expected for the vast majority of traffic. Things might change 
in the future, but if this code is clear/clean/simple and solves the 
problem today, I'd be happier with that than trying to parse a full IPv6 
header.

I'll test and post my revised patch (including bounds checking and 
documentation updates) in a day or so.

John

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2011-10-12 21:28 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-08  5:36 [PATCH] bonding: L2L3 xmit doesn't support IPv6 Yinglin Sun
2011-10-10 19:35 ` Yinglin Sun
2011-10-11 14:33 ` Andy Gospodarek
2011-10-11 15:58   ` Jay Vosburgh
2011-10-12  2:51     ` Andy Gospodarek
2011-10-12  3:39       ` Yinglin Sun
2011-10-12  4:06         ` Eric Dumazet
2011-10-12 21:15           ` John Eaglesham
2011-10-12  3:30     ` Yinglin Sun
2011-10-12  3:27   ` Yinglin Sun

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).