* [PATCH 10/13]: net: Implement simple sw TX hashing.
@ 2008-07-10 10:57 David Miller
2008-07-11 20:49 ` Eric Dumazet
2008-07-17 16:16 ` Stephen Hemminger
0 siblings, 2 replies; 10+ messages in thread
From: David Miller @ 2008-07-10 10:57 UTC (permalink / raw)
To: netdev
It just xor hashes over IPv4/IPv6 addresses and ports of transport.
The only assumption it makes is that skb_network_header() is set
correctly.
Signed-off-by: David S. Miller <davem@davemloft.net>
---
net/core/dev.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 52 insertions(+), 0 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index db95b49..62459ea 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -121,6 +121,9 @@
#include <linux/ctype.h>
#include <linux/if_arp.h>
#include <linux/if_vlan.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/in.h>
#include "net-sysfs.h"
@@ -1665,6 +1668,53 @@ out_kfree_skb:
* --BLG
*/
+static u16 simple_tx_hash(struct net_device *dev, struct sk_buff *skb)
+{
+ u32 *addr, *ports, hash, ihl;
+ u8 ip_proto;
+ int alen;
+
+ switch (skb->protocol) {
+ case __constant_htons(ETH_P_IP):
+ ip_proto = ip_hdr(skb)->protocol;
+ addr = &ip_hdr(skb)->saddr;
+ ihl = ip_hdr(skb)->ihl;
+ alen = 2;
+ break;
+ case __constant_htons(ETH_P_IPV6):
+ ip_proto = ipv6_hdr(skb)->nexthdr;
+ addr = &ipv6_hdr(skb)->saddr.s6_addr32[0];
+ ihl = (40 >> 2);
+ alen = 8;
+ break;
+ default:
+ return 0;
+ }
+
+ ports = (u32 *) (skb_network_header(skb) + (ihl * 4));
+
+ hash = 0;
+ while (alen--)
+ hash ^= *addr++;
+
+ switch (ip_proto) {
+ case IPPROTO_TCP:
+ case IPPROTO_UDP:
+ case IPPROTO_DCCP:
+ case IPPROTO_ESP:
+ case IPPROTO_AH:
+ case IPPROTO_SCTP:
+ case IPPROTO_UDPLITE:
+ hash ^= *ports;
+ break;
+
+ default:
+ break;
+ }
+
+ return hash % dev->real_num_tx_queues;
+}
+
static struct netdev_queue *dev_pick_tx(struct net_device *dev,
struct sk_buff *skb)
{
@@ -1672,6 +1722,8 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
if (dev->select_queue)
queue_index = dev->select_queue(dev, skb);
+ else if (dev->real_num_tx_queues)
+ queue_index = simple_tx_hash(dev, skb);
skb_set_queue_mapping(skb, queue_index);
return netdev_get_tx_queue(dev, queue_index);
--
1.5.6.2.255.gbed62
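For readers who want to experiment outside the kernel, the hashing step in the patch above can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions: struct flow_key, xor_flow_hash() and their fields are hypothetical names, not part of the patch, and the address and port words are passed in directly instead of being read from an skb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow description; the patch reads these from the skb. */
struct flow_key {
	const uint32_t *addr;	/* source, then destination address words */
	size_t alen;		/* 2 words for IPv4, 8 words for IPv6 */
	int has_ports;		/* nonzero for TCP, UDP, SCTP, ... */
	uint32_t ports;		/* first 32 bits of the transport header */
};

/* XOR the address words (and the ports word, if any), then reduce. */
static inline uint16_t xor_flow_hash(const struct flow_key *key,
				     unsigned int num_queues)
{
	uint32_t hash = 0;
	size_t i;

	assert(num_queues > 0);	/* avoid a modulo by zero */

	for (i = 0; i < key->alen; i++)
		hash ^= key->addr[i];
	if (key->has_ports)
		hash ^= key->ports;

	return (uint16_t)(hash % num_queues);
}
```

With a single queue the result is always 0, since any value modulo 1 is 0.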
* Re: [PATCH 10/13]: net: Implement simple sw TX hashing.
2008-07-10 10:57 [PATCH 10/13]: net: Implement simple sw TX hashing David Miller
@ 2008-07-11 20:49 ` Eric Dumazet
2008-07-14 11:33 ` Herbert Xu
2008-07-17 16:16 ` Stephen Hemminger
1 sibling, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2008-07-11 20:49 UTC (permalink / raw)
To: David Miller; +Cc: netdev
David Miller wrote:
> It just xor hashes over IPv4/IPv6 addresses and ports of transport.
>
> The only assumption it makes is that skb_network_header() is set
> correctly.
>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
> net/core/dev.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 files changed, 52 insertions(+), 0 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index db95b49..62459ea 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -121,6 +121,9 @@
> #include <linux/ctype.h>
> #include <linux/if_arp.h>
> #include <linux/if_vlan.h>
> +#include <linux/ip.h>
> +#include <linux/ipv6.h>
> +#include <linux/in.h>
>
> #include "net-sysfs.h"
>
> @@ -1665,6 +1668,53 @@ out_kfree_skb:
> * --BLG
> */
>
> +static u16 simple_tx_hash(struct net_device *dev, struct sk_buff *skb)
> +{
> + u32 *addr, *ports, hash, ihl;
> + u8 ip_proto;
> + int alen;
> +
> + switch (skb->protocol) {
> + case __constant_htons(ETH_P_IP):
> + ip_proto = ip_hdr(skb)->protocol;
> + addr = &ip_hdr(skb)->saddr;
> + ihl = ip_hdr(skb)->ihl;
> + alen = 2;
> + break;
> + case __constant_htons(ETH_P_IPV6):
> + ip_proto = ipv6_hdr(skb)->nexthdr;
> + addr = &ipv6_hdr(skb)->saddr.s6_addr32[0];
> + ihl = (40 >> 2);
> + alen = 8;
> + break;
> + default:
> + return 0;
> + }
> +
> + ports = (u32 *) (skb_network_header(skb) + (ihl * 4));
> +
> + hash = 0;
> + while (alen--)
> + hash ^= *addr++;
> +
> + switch (ip_proto) {
> + case IPPROTO_TCP:
> + case IPPROTO_UDP:
> + case IPPROTO_DCCP:
> + case IPPROTO_ESP:
> + case IPPROTO_AH:
> + case IPPROTO_SCTP:
> + case IPPROTO_UDPLITE:
> + hash ^= *ports;
> + break;
> +
> + default:
> + break;
> + }
> +
> + return hash % dev->real_num_tx_queues;
simple but expensive... but could be changed to use reciprocal_divide() if necessary...
> +}
> +
> static struct netdev_queue *dev_pick_tx(struct net_device *dev,
> struct sk_buff *skb)
> {
> @@ -1672,6 +1722,8 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
>
> if (dev->select_queue)
> queue_index = dev->select_queue(dev, skb);
> + else if (dev->real_num_tx_queues)
I guess your intent was to check that you have at least 2 queues?
"else if (dev->real_num_tx_queues > 1)"
> + queue_index = simple_tx_hash(dev, skb);
>
> skb_set_queue_mapping(skb, queue_index);
> return netdev_get_tx_queue(dev, queue_index);
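The suggested check can be sketched in isolation as below. pick_tx_queue() is a hypothetical helper, not a function in the patch; it only illustrates that with zero or one queue there is nothing to choose, so the modulo can be skipped and index 0 returned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map a flow hash to a queue index. */
static inline uint16_t pick_tx_queue(uint32_t hash, unsigned int num_queues)
{
	/* The index is returned as a u16, so it must fit in 16 bits. */
	assert(num_queues <= UINT16_MAX + 1u);

	if (num_queues > 1)
		return (uint16_t)(hash % num_queues);

	return 0;	/* zero or one queue: always queue 0 */
}
```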
* Re: [PATCH 10/13]: net: Implement simple sw TX hashing.
2008-07-11 20:49 ` Eric Dumazet
@ 2008-07-14 11:33 ` Herbert Xu
2008-07-14 11:58 ` David Miller
0 siblings, 1 reply; 10+ messages in thread
From: Herbert Xu @ 2008-07-14 11:33 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem, netdev
Eric Dumazet <dada1@cosmosbay.com> wrote:
>
>> + return hash % dev->real_num_tx_queues;
>
> simple but expensive... but could be changed to use reciprocal_divide() if necessary...
For the common cases it should be a power of 2 too.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH 10/13]: net: Implement simple sw TX hashing.
2008-07-14 11:33 ` Herbert Xu
@ 2008-07-14 11:58 ` David Miller
2008-07-15 6:49 ` Eric Dumazet
0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2008-07-14 11:58 UTC (permalink / raw)
To: herbert; +Cc: dada1, netdev
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 14 Jul 2008 19:33:02 +0800
> Eric Dumazet <dada1@cosmosbay.com> wrote:
> >
> >> + return hash % dev->real_num_tx_queues;
> >
> > simple but expensive... but could be changed to use reciprocal_divide() if necessary...
>
> For the common cases it should be a power of 2 too.
Unfortunately it is not enforceable for it to be a power of 2
and I fear it will in fact not be very often for several
chips that will use this stuff.
BTW, how can reciprocal_divide() be used to compute a modulus? Are
you going to do the reciprocal divide, re-multiply, then subtract?
:-)
Sorry I haven't replied yet to all of this useful feedback, I'm still
killing myself to finish making the entire generic networking
multiqueue aware :(
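The divide, re-multiply, subtract sequence asked about above would look roughly like this in userspace. recip_value(), recip_divide() and recip_mod() are hypothetical stand-ins written for this sketch, not the kernel's helpers, and the rounded-up reciprocal R = ceil(2^32 / n) is an assumption of the sketch. Because a rounded-up reciprocal can overestimate the quotient by one, the sketch adds a correction step.

```c
#include <assert.h>
#include <stdint.h>

/* R = ceil(2^32 / n); only representable in 32 bits for n >= 2. */
static inline uint32_t recip_value(uint32_t n)
{
	assert(n >= 2);
	return (uint32_t)(((1ULL << 32) + n - 1) / n);
}

/* Quotient estimate: one multiply and one shift, no divide. */
static inline uint32_t recip_divide(uint32_t x, uint32_t r)
{
	return (uint32_t)(((uint64_t)x * r) >> 32);
}

/* Modulus via divide, re-multiply, subtract. */
static inline uint32_t recip_mod(uint32_t x, uint32_t n, uint32_t r)
{
	uint32_t q = recip_divide(x, r);

	/* The estimate is either exact or one too large. */
	if ((uint64_t)q * n > x)
		q--;

	return x - q * n;
}
```

The reciprocal would be computed once per device, so the per-packet cost is two multiplies, a compare and a subtract.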
* Re: [PATCH 10/13]: net: Implement simple sw TX hashing.
2008-07-14 11:58 ` David Miller
@ 2008-07-15 6:49 ` Eric Dumazet
2008-07-15 6:58 ` David Miller
2008-07-15 10:45 ` David Miller
0 siblings, 2 replies; 10+ messages in thread
From: Eric Dumazet @ 2008-07-15 6:49 UTC (permalink / raw)
To: David Miller; +Cc: herbert, netdev
David Miller wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Mon, 14 Jul 2008 19:33:02 +0800
>
>> Eric Dumazet <dada1@cosmosbay.com> wrote:
>>>> + return hash % dev->real_num_tx_queues;
>>> simple but expensive... but could be changed to use reciprocal_divide() if necessary...
>> For the common cases it should be a power of 2 too.
>
> Unfortunately it is not enforceable for it to be a power of 2
> and I fear it will in fact not be very often for several
> chips that will use this stuff.
>
> BTW, how can reciprocal_divide() be used to compute a modulus? Are
> you going to do the reciprocal divide, re-multiply, then subtract?
> :-)
>
Reciprocal divide is the name of the following
transformation of a divide into one multiply, where R is the
precomputed reciprocal of N (about 2^32 / N):
f1(X) = X / N;
-> g1(X) = ((u64)X * R) >> 32;
So you are right, I was wrong to call the following
transformation a reciprocal divide:
f2(X) = X % N;
-> g2(X) = ((u64)X * N) >> 32;
But g2() is quite similar to g1() :)
The f2() and g2() functions are different of course, but they should give
the same hash spreading if X has a uniform distribution in 32-bit space.
simple_tx_hash() in its current form may not have this property; that's hard
to say.
For example, hash_dst() in net/netfilter/xt_hashlimit.c uses the following code:
static u_int32_t
hash_dst(const struct xt_hashlimit_htable *ht, const struct dsthash_dst *dst)
{
	u_int32_t hash = jhash2((const u32 *)dst,
				sizeof(*dst)/sizeof(u32),
				ht->rnd);
	/*
	 * Instead of returning hash % ht->cfg.size (implying a divide)
	 * we return the high 32 bits of the (hash * ht->cfg.size) that will
	 * give results between [0 and cfg.size-1] and same hash distribution,
	 * but using a multiply, less expensive than a divide
	 */
	return ((u64)hash * ht->cfg.size) >> 32;
}
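A small userspace sketch of the two reductions discussed above, for comparison. reduce_mod() and reduce_mulshift() are hypothetical names used only here. Both map a 32-bit hash into [0, n-1], but they generally return different indices for the same input, and the multiply form is driven by the high bits of the hash.

```c
#include <assert.h>
#include <stdint.h>

/* f2: classic modulo reduction, which implies a divide. */
static inline uint32_t reduce_mod(uint32_t hash, uint32_t n)
{
	assert(n > 0);	/* avoid a modulo by zero */
	return hash % n;
}

/* g2: high 32 bits of the 64-bit product hash * n, a single multiply. */
static inline uint32_t reduce_mulshift(uint32_t hash, uint32_t n)
{
	return (uint32_t)(((uint64_t)hash * n) >> 32);
}
```

For example, with n = 5 the input 0x80000000 lands in bucket 3 under the modulo and in bucket 2 under the multiply, so the two are interchangeable only in terms of distribution, not per-packet results.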
* Re: [PATCH 10/13]: net: Implement simple sw TX hashing.
2008-07-15 6:49 ` Eric Dumazet
@ 2008-07-15 6:58 ` David Miller
2008-07-15 10:45 ` David Miller
1 sibling, 0 replies; 10+ messages in thread
From: David Miller @ 2008-07-15 6:58 UTC (permalink / raw)
To: dada1; +Cc: herbert, netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Tue, 15 Jul 2008 08:49:32 +0200
> For example, hash_dst() in net/netfilter/xt_hashlimit.c is using the
> following code :
Aha, now I see what you mean.
Thanks, I'll take this into consideration.
* Re: [PATCH 10/13]: net: Implement simple sw TX hashing.
2008-07-15 6:49 ` Eric Dumazet
2008-07-15 6:58 ` David Miller
@ 2008-07-15 10:45 ` David Miller
1 sibling, 0 replies; 10+ messages in thread
From: David Miller @ 2008-07-15 10:45 UTC (permalink / raw)
To: dada1; +Cc: herbert, netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Tue, 15 Jul 2008 08:49:32 +0200
> f2(X) = X % N ;
> ->g2(X) = ((u64)X * N) >> 32;
>
> But g2() is quite similar to g1() :)
>
> The f2() and g2() functions are different of course, but they should give
> the same hash spreading if X has a uniform distribution in 32-bit space.
I thought about this some more and I'm having my doubts about
this.
Let's say N is 8, this means that all values of X smaller than
0x20000000 will hash to zero.
It means that all of your entropy comes from the top 3 bits of
X.
So like hashlimit we'd need to use jhash or something like that
to spread the bits around some more.
I also wonder if we should add hash randomization like the IPv4
routing cache has. I wonder if the performance hit from an attack is
bad enough to warrant that protection. After all, if you "exploit"
this hash you just end up with the performance we have today.
For now I'm leaving the modulus in there until we understand this
better.
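The concern above can be checked with a few lines of userspace C. In this sketch, mulshift8() is a hypothetical helper for the N = 8 case, and mix32() is the 32-bit finalizer from MurmurHash3, used here only as a stand-in for "jhash or something like that"; it is not what the kernel uses.

```c
#include <assert.h>
#include <stdint.h>

/* ((u64)X * 8) >> 32 keeps only the top 3 bits of X. */
static inline uint32_t mulshift8(uint32_t x)
{
	uint32_t q = (uint32_t)(((uint64_t)x * 8) >> 32);

	assert(q == x >> 29);	/* the same value, written as a shift */
	return q;
}

/* MurmurHash3 fmix32 finalizer: lets low-bit differences reach high bits. */
static inline uint32_t mix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6bu;
	h ^= h >> 13;
	h *= 0xc2b2ae35u;
	h ^= h >> 16;
	return h;
}
```

Passing mix32(X) rather than X to the multiply reduction lets the low bits of X influence the queue index, at the cost of a few extra instructions per packet.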
* Re: [PATCH 10/13]: net: Implement simple sw TX hashing.
2008-07-10 10:57 [PATCH 10/13]: net: Implement simple sw TX hashing David Miller
2008-07-11 20:49 ` Eric Dumazet
@ 2008-07-17 16:16 ` Stephen Hemminger
2008-07-17 16:21 ` Patrick McHardy
1 sibling, 1 reply; 10+ messages in thread
From: Stephen Hemminger @ 2008-07-17 16:16 UTC (permalink / raw)
To: David Miller; +Cc: netdev
On Thu, 10 Jul 2008 03:57:13 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:
>
> It just xor hashes over IPv4/IPv6 addresses and ports of transport.
>
> The only assumption it makes is that skb_network_header() is set
> correctly.
>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
> net/core/dev.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 files changed, 52 insertions(+), 0 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index db95b49..62459ea 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -121,6 +121,9 @@
> #include <linux/ctype.h>
> #include <linux/if_arp.h>
> #include <linux/if_vlan.h>
> +#include <linux/ip.h>
> +#include <linux/ipv6.h>
> +#include <linux/in.h>
>
> #include "net-sysfs.h"
>
> @@ -1665,6 +1668,53 @@ out_kfree_skb:
> * --BLG
> */
>
> +static u16 simple_tx_hash(struct net_device *dev, struct sk_buff *skb)
> +{
> + u32 *addr, *ports, hash, ihl;
> + u8 ip_proto;
> + int alen;
> +
> + switch (skb->protocol) {
> + case __constant_htons(ETH_P_IP):
> + ip_proto = ip_hdr(skb)->protocol;
> + addr = &ip_hdr(skb)->saddr;
> + ihl = ip_hdr(skb)->ihl;
> + alen = 2;
> + break;
> + case __constant_htons(ETH_P_IPV6):
> + ip_proto = ipv6_hdr(skb)->nexthdr;
> + addr = &ipv6_hdr(skb)->saddr.s6_addr32[0];
> + ihl = (40 >> 2);
> + alen = 8;
> + break;
> + default:
> + return 0;
> + }
> +
> + ports = (u32 *) (skb_network_header(skb) + (ihl * 4));
> +
> + hash = 0;
> + while (alen--)
> + hash ^= *addr++;
> +
> + switch (ip_proto) {
> + case IPPROTO_TCP:
> + case IPPROTO_UDP:
> + case IPPROTO_DCCP:
> + case IPPROTO_ESP:
> + case IPPROTO_AH:
> + case IPPROTO_SCTP:
> + case IPPROTO_UDPLITE:
> + hash ^= *ports;
> + break;
> +
> + default:
> + break;
> + }
> +
> + return hash % dev->real_num_tx_queues;
> +}
What about VLANs? And PPPoE?
* Re: [PATCH 10/13]: net: Implement simple sw TX hashing.
2008-07-17 16:16 ` Stephen Hemminger
@ 2008-07-17 16:21 ` Patrick McHardy
2008-07-19 7:26 ` David Miller
0 siblings, 1 reply; 10+ messages in thread
From: Patrick McHardy @ 2008-07-17 16:21 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, netdev
Stephen Hemminger wrote:
> On Thu, 10 Jul 2008 03:57:13 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
>
>> It just xor hashes over IPv4/IPv6 addresses and ports of transport.
>>
>> The only assumption it makes is that skb_network_header() is set
>> correctly.
>>
>> + switch (ip_proto) {
>> + case IPPROTO_TCP:
>> + case IPPROTO_UDP:
>> + case IPPROTO_DCCP:
>> + case IPPROTO_ESP:
>> + case IPPROTO_AH:
>> + case IPPROTO_SCTP:
>> + case IPPROTO_UDPLITE:
>> + hash ^= *ports;
>> + break;
>> +
>> + default:
>> + break;
>> + }
>> +
>> + return hash % dev->real_num_tx_queues;
>> +}
>
> What about VLANs? And PPPoE?
Actually I think we could just make the number of tx queues of
virtual devices match the lower device and use the unencapsulated
packets for queue selection.
Of course that would require not performing queue selection
again on the real device. For VLANs, macvlan etc., which don't
do any locking internally, that would probably make sense.
Not sure about PPPoE.
* Re: [PATCH 10/13]: net: Implement simple sw TX hashing.
2008-07-17 16:21 ` Patrick McHardy
@ 2008-07-19 7:26 ` David Miller
0 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2008-07-19 7:26 UTC (permalink / raw)
To: kaber; +Cc: shemminger, netdev
From: Patrick McHardy <kaber@trash.net>
Date: Thu, 17 Jul 2008 18:21:27 +0200
> Actually I think we could just make the number of tx queues of
> virtual devices match the lower device and use the unencapsulated
> packets for queue selection.
>
> Of course that would require not performing queue selection
> again on the real device. For VLANs, macvlan etc., which don't
> do any locking internally, that would probably make sense.
Agreed.
> Not sure about PPPoE.
PPPoE just sends down to dev_queue_xmit() with the new device
after encapsulating, and the network header is set correctly.
Therefore this case can just be left as-is.