* [PATCH 6/7] mlx4_en: Adding rxhash support
@ 2011-10-17 20:18 Yevgeny Petrilin
2011-10-18 1:48 ` Eric Dumazet
0 siblings, 1 reply; 10+ messages in thread
From: Yevgeny Petrilin @ 2011-10-17 20:18 UTC (permalink / raw)
To: davem; +Cc: netdev, yevgenyp
Moving to Toeplitz function in RSS calculation.
Reporting rxhash in skb.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
---
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 2 +-
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 9 +++++++++
2 files changed, 10 insertions(+), 1 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index c4c4be4..78d776b 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1084,7 +1084,7 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
dev->vlan_features = dev->hw_features;
- dev->hw_features |= NETIF_F_RXCSUM;
+ dev->hw_features |= NETIF_F_RXCSUM | NETIF_F_RXHASH;
dev->features = dev->hw_features | NETIF_F_HIGHDMA |
NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX |
NETIF_F_HW_VLAN_FILTER;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index f42c7a6..ae5feed 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -618,6 +618,9 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
__vlan_hwaccel_put_tag(gro_skb, vid);
}
+ if (dev->features & NETIF_F_RXHASH)
+ gro_skb->rxhash = be32_to_cpu(cqe->immed_rss_invalid);
+
skb_record_rx_queue(gro_skb, cq->ring);
napi_gro_frags(&cq->napi);
@@ -651,6 +654,9 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
skb->protocol = eth_type_trans(skb, dev);
skb_record_rx_queue(skb, cq->ring);
+ if (dev->features & NETIF_F_RXHASH)
+ skb->rxhash = be32_to_cpu(cqe->immed_rss_invalid);
+
if (be32_to_cpu(cqe->vlan_my_qpn) &
MLX4_CQE_VLAN_PRESENT_MASK)
__vlan_hwaccel_put_tag(skb, be16_to_cpu(cqe->sl_vid));
@@ -865,6 +871,9 @@ int mlx4_en_config_rss_steer(struct mlx4_en_priv *priv)
(rss_map->base_qpn));
rss_context->default_qpn = cpu_to_be32(rss_map->base_qpn);
rss_context->flags = rss_mask;
+ rss_context->hash_fn = 1;
+ for (i = 0; i < 10; i++)
+ rss_context->rss_key[i] = random32();
if (priv->mdev->profile.udp_rss)
rss_context->base_qpn_udp = rss_context->default_qpn;
--
1.7.7
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 6/7] mlx4_en: Adding rxhash support
2011-10-17 20:18 [PATCH 6/7] mlx4_en: Adding rxhash support Yevgeny Petrilin
@ 2011-10-18 1:48 ` Eric Dumazet
2011-10-18 7:36 ` Yevgeny Petrilin
0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2011-10-18 1:48 UTC (permalink / raw)
To: Yevgeny Petrilin; +Cc: davem, netdev
Le lundi 17 octobre 2011 à 22:18 +0200, Yevgeny Petrilin a écrit :
> Moving to Toeplitz function in RSS calculation.
> Reporting rxhash in skb.
>
> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
> rss_context->flags = rss_mask;
> + rss_context->hash_fn = 1;
> + for (i = 0; i < 10; i++)
> + rss_context->rss_key[i] = random32();
>
Thats bit of a problem : Two NICS will have different seeds, and thus
provide different rxhash for a given flow. A bonding of two NICS will
not be able to provide a consistent rxhash.
drivers/net/ethernet/intel/igb/igb_main.c uses a static table to avoid
this problem.
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH 6/7] mlx4_en: Adding rxhash support
2011-10-18 1:48 ` Eric Dumazet
@ 2011-10-18 7:36 ` Yevgeny Petrilin
2011-10-18 8:34 ` Eric Dumazet
0 siblings, 1 reply; 10+ messages in thread
From: Yevgeny Petrilin @ 2011-10-18 7:36 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem@davemloft.net, netdev@vger.kernel.org
> > rss_context->flags = rss_mask;
> > + rss_context->hash_fn = 1;
> > + for (i = 0; i < 10; i++)
> > + rss_context->rss_key[i] = random32();
> >
>
> Thats bit of a problem : Two NICS will have different seeds, and thus provide different rxhash for a given flow. A bonding of two NICS will
> not be able to provide a consistent rxhash.
>
> drivers/net/ethernet/intel/igb/igb_main.c uses a static table to avoid this problem.
>
Hello Eric, thanks for your review.
I agree that in this case two ports will have different seeds.
But even if we use static values for the key, what about bonding of 2 NICs from different vendors?
How can we ensure we get same rxhash value for all NICs?
There are also other drivers that use random values as well, for example:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
Thanks,
Yevgeny
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH 6/7] mlx4_en: Adding rxhash support
2011-10-18 7:36 ` Yevgeny Petrilin
@ 2011-10-18 8:34 ` Eric Dumazet
2011-10-18 8:59 ` Yevgeny Petrilin
0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2011-10-18 8:34 UTC (permalink / raw)
To: Yevgeny Petrilin; +Cc: davem@davemloft.net, netdev@vger.kernel.org
Le mardi 18 octobre 2011 à 07:36 +0000, Yevgeny Petrilin a écrit :
> > > rss_context->flags = rss_mask;
> > > + rss_context->hash_fn = 1;
> > > + for (i = 0; i < 10; i++)
> > > + rss_context->rss_key[i] = random32();
> > >
> >
> > Thats bit of a problem : Two NICS will have different seeds, and thus provide different rxhash for a given flow. A bonding of two NICS will
> > not be able to provide a consistent rxhash.
> >
> > drivers/net/ethernet/intel/igb/igb_main.c uses a static table to avoid this problem.
> >
>
> Hello Eric, thanks for your review.
>
> I agree that in this case two ports will have different seeds.
> But even if we use static values for the key, what about bonding of 2 NICs from different vendors?
> How can we ensure we get same rxhash value for all NICs?
>
> There are also other drivers that use random values as well, for example:
> drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
>
What is the gain using random values ?
Usually, we tend to have same hardware in a single machine, or we use
active-backup bonding mode, and an active slave flip can change rxhash
values with litle effect, since this happens not often.
I really prefer not random values, because it allows to have replayable
configurations : For a given tcp flow, the same rxhash value is given
and same cpu target in RPS. Its way easier to tune your machine for some
workloads.
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH 6/7] mlx4_en: Adding rxhash support
2011-10-18 8:34 ` Eric Dumazet
@ 2011-10-18 8:59 ` Yevgeny Petrilin
2011-10-18 15:36 ` Stephen Hemminger
0 siblings, 1 reply; 10+ messages in thread
From: Yevgeny Petrilin @ 2011-10-18 8:59 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem@davemloft.net, netdev@vger.kernel.org
>
> What is the gain using random values ?
>
> Usually, we tend to have same hardware in a single machine, or we use
> active-backup bonding mode, and an active slave flip can change rxhash
> values with litle effect, since this happens not often.
>
> I really prefer not random values, because it allows to have replayable
> configurations : For a given tcp flow, the same rxhash value is given
> and same cpu target in RPS. Its way easier to tune your machine for some workloads.
>
There is no gain in random values,
I'll make the change to have static value for RSS function.
We might consider how to ensure consistency across the different drivers in this aspect.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 6/7] mlx4_en: Adding rxhash support
2011-10-18 8:59 ` Yevgeny Petrilin
@ 2011-10-18 15:36 ` Stephen Hemminger
2011-10-18 18:49 ` Jesse Brandeburg
2011-10-18 19:35 ` Ben Hutchings
0 siblings, 2 replies; 10+ messages in thread
From: Stephen Hemminger @ 2011-10-18 15:36 UTC (permalink / raw)
To: Yevgeny Petrilin
Cc: Eric Dumazet, davem@davemloft.net, netdev@vger.kernel.org
On Tue, 18 Oct 2011 08:59:44 +0000
Yevgeny Petrilin <yevgenyp@mellanox.co.il> wrote:
> >
> > What is the gain using random values ?
> >
> > Usually, we tend to have same hardware in a single machine, or we use
> > active-backup bonding mode, and an active slave flip can change rxhash
> > values with litle effect, since this happens not often.
> >
> > I really prefer not random values, because it allows to have replayable
> > configurations : For a given tcp flow, the same rxhash value is given
> > and same cpu target in RPS. Its way easier to tune your machine for some workloads.
> >
>
> There is no gain in random values,
> I'll make the change to have static value for RSS function.
>
> We might consider how to ensure consistency across the different drivers in this aspect.
The key should be part of the network device core. Almost all hardware just
implements the Microsoft standard, and if all drivers used same key they should
come up with the same hash.
Although using the same key all the time makes testing easier.
The risk of using the same key is that it makes it easier for an attacker to
create a set of addresses that all map to the same CPU which would make a DoS
attack work better. Therefore the key should be randomly generated at boot time.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 6/7] mlx4_en: Adding rxhash support
2011-10-18 15:36 ` Stephen Hemminger
@ 2011-10-18 18:49 ` Jesse Brandeburg
2011-10-18 19:05 ` Eric Dumazet
2011-10-18 19:35 ` Ben Hutchings
1 sibling, 1 reply; 10+ messages in thread
From: Jesse Brandeburg @ 2011-10-18 18:49 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Yevgeny Petrilin, Eric Dumazet, davem@davemloft.net,
netdev@vger.kernel.org
On Tue, Oct 18, 2011 at 8:36 AM, Stephen Hemminger
<shemminger@vyatta.com> wrote:
> On Tue, 18 Oct 2011 08:59:44 +0000
> Yevgeny Petrilin <yevgenyp@mellanox.co.il> wrote:
>> There is no gain in random values,
>> I'll make the change to have static value for RSS function.
>>
>> We might consider how to ensure consistency across the different drivers in this aspect.
>
> The key should be part of the network device core. Almost all hardware just
> implements the Microsoft standard, and if all drivers used same key they should
> come up with the same hash.
>
> Although using the same key all the time makes testing easier.
> The risk of using the same key is that it makes it easier for an attacker to
> create a set of addresses that all map to the same CPU which would make a DoS
> attack work better. Therefore the key should be randomly generated at boot time.
Stephen, I respectfully disagree with your position here. The risk of
using the same key is that a malicious user could target a particular
queue with a DoS attack, but how is that different than any single
queue device? NAPI protects a single queue against (a network
interrupt based) DoS. I do not think we should be generating a random
key at boot time, and because of the way NAPI mitigates load, we are
okay. The gain from from the far simpler setup (and reproducability)
outweighs the risk until someone can show damage due to this
theoretical DoS attack.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 6/7] mlx4_en: Adding rxhash support
2011-10-18 18:49 ` Jesse Brandeburg
@ 2011-10-18 19:05 ` Eric Dumazet
0 siblings, 0 replies; 10+ messages in thread
From: Eric Dumazet @ 2011-10-18 19:05 UTC (permalink / raw)
To: Jesse Brandeburg
Cc: Stephen Hemminger, Yevgeny Petrilin, davem@davemloft.net,
netdev@vger.kernel.org
Le mardi 18 octobre 2011 à 11:49 -0700, Jesse Brandeburg a écrit :
> On Tue, Oct 18, 2011 at 8:36 AM, Stephen Hemminger
> <shemminger@vyatta.com> wrote:
> > On Tue, 18 Oct 2011 08:59:44 +0000
> > Yevgeny Petrilin <yevgenyp@mellanox.co.il> wrote:
> >> There is no gain in random values,
> >> I'll make the change to have static value for RSS function.
> >>
> >> We might consider how to ensure consistency across the different drivers in this aspect.
> >
> > The key should be part of the network device core. Almost all hardware just
> > implements the Microsoft standard, and if all drivers used same key they should
> > come up with the same hash.
> >
> > Although using the same key all the time makes testing easier.
> > The risk of using the same key is that it makes it easier for an attacker to
> > create a set of addresses that all map to the same CPU which would make a DoS
> > attack work better. Therefore the key should be randomly generated at boot time.
>
> Stephen, I respectfully disagree with your position here. The risk of
> using the same key is that a malicious user could target a particular
> queue with a DoS attack, but how is that different than any single
> queue device? NAPI protects a single queue against (a network
> interrupt based) DoS. I do not think we should be generating a random
> key at boot time, and because of the way NAPI mitigates load, we are
> okay. The gain from from the far simpler setup (and reproducability)
> outweighs the risk until someone can show damage due to this
> theoretical DoS attack.
Note : This policy could be up to the admin :
1) We could let admin chose a known hash for reproducability
ethtool .... rss_hash xxxxxxxx:yyyyyyyy:zzzzzzzz:....
2) We could have a 'rss_perturb N ' ethtool option, to randomly reshufle
things every N seconds, for people really afraid ;)
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 6/7] mlx4_en: Adding rxhash support
2011-10-18 15:36 ` Stephen Hemminger
2011-10-18 18:49 ` Jesse Brandeburg
@ 2011-10-18 19:35 ` Ben Hutchings
2011-10-19 14:57 ` Ben Hutchings
1 sibling, 1 reply; 10+ messages in thread
From: Ben Hutchings @ 2011-10-18 19:35 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Yevgeny Petrilin, Eric Dumazet, davem@davemloft.net,
netdev@vger.kernel.org
On Tue, 2011-10-18 at 08:36 -0700, Stephen Hemminger wrote:
> On Tue, 18 Oct 2011 08:59:44 +0000
> Yevgeny Petrilin <yevgenyp@mellanox.co.il> wrote:
>
> > >
> > > What is the gain using random values ?
> > >
> > > Usually, we tend to have same hardware in a single machine, or we use
> > > active-backup bonding mode, and an active slave flip can change rxhash
> > > values with litle effect, since this happens not often.
> > >
> > > I really prefer not random values, because it allows to have replayable
> > > configurations : For a given tcp flow, the same rxhash value is given
> > > and same cpu target in RPS. Its way easier to tune your machine for some workloads.
> > >
> >
> > There is no gain in random values,
> > I'll make the change to have static value for RSS function.
> >
> > We might consider how to ensure consistency across the different drivers in this aspect.
>
> The key should be part of the network device core. Almost all hardware just
> implements the Microsoft standard, and if all drivers used same key they should
> come up with the same hash.
It should, but that's not enough. The core also needs to be responsible
for initialising the hash indirection table, determining how many RX
queues to create, and IRQ affinity hints.
> Although using the same key all the time makes testing easier.
> The risk of using the same key is that it makes it easier for an attacker to
> create a set of addresses that all map to the same CPU which would make a DoS
> attack work better. Therefore the key should be randomly generated at boot time.
If I understand correctly, the core of the Toeplitz hash functions is
(pretending we have a wide enough type called bigint):
u32 toeplitz_hash(bigint input, bigint key, unsigned width)
{
u32 hash = 0;
unsigned i;
for (i = 0; i < width; i++)
if (input & ((bigint)1 << i))
hash ^= key >> (1 + i);
return hash;
}
This is hardly a cryptographic hash! And while the key probably should
be random it should not just be random *bits*. For example, if any 32
consecutive bits of the key are zero then 1 bit of input will have no
effect on the hash at all.
There was also a proposal a while back that we should try to make the
hash symmetric w.r.t. RX and TX addresses, so that both directions of a
flow through a router/bridge are aligned. I think this was to be done
by repeating a 16-bit pattern across the key. Not sure whether that's
worthwhile.
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 6/7] mlx4_en: Adding rxhash support
2011-10-18 19:35 ` Ben Hutchings
@ 2011-10-19 14:57 ` Ben Hutchings
0 siblings, 0 replies; 10+ messages in thread
From: Ben Hutchings @ 2011-10-19 14:57 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Yevgeny Petrilin, Eric Dumazet, davem@davemloft.net,
netdev@vger.kernel.org
On Tue, 2011-10-18 at 20:35 +0100, Ben Hutchings wrote:
[...]
> There was also a proposal a while back that we should try to make the
> hash symmetric w.r.t. RX and TX addresses, so that both directions of a
> flow through a router/bridge are aligned. I think this was to be done
> by repeating a 16-bit pattern across the key. Not sure whether that's
> worthwhile.
That also makes it relatively cheap to calculate in software, which
DragonflyBSD does:
http://gitweb.dragonflybsd.org/dragonfly.git/blob/master:/sys/net/toeplitz.c
http://gitweb.dragonflybsd.org/dragonfly.git/blob/master:/sys/net/toeplitz2.h
(the latter file appears to assume that in_addr_t/in_port_t are byte-
swapped i.e. the host is little-endian).
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2011-10-19 14:57 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-17 20:18 [PATCH 6/7] mlx4_en: Adding rxhash support Yevgeny Petrilin
2011-10-18 1:48 ` Eric Dumazet
2011-10-18 7:36 ` Yevgeny Petrilin
2011-10-18 8:34 ` Eric Dumazet
2011-10-18 8:59 ` Yevgeny Petrilin
2011-10-18 15:36 ` Stephen Hemminger
2011-10-18 18:49 ` Jesse Brandeburg
2011-10-18 19:05 ` Eric Dumazet
2011-10-18 19:35 ` Ben Hutchings
2011-10-19 14:57 ` Ben Hutchings
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).