netdev.vger.kernel.org archive mirror
From: Petr Machata <petrm@nvidia.com>
To: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	<netdev@vger.kernel.org>
Cc: Ido Schimmel <idosch@nvidia.com>, Petr Machata <petrm@nvidia.com>,
	"David Ahern" <dsahern@kernel.org>
Subject: [PATCH net-next 0/4] Allow configuration of multipath hash seed
Date: Wed, 29 May 2024 13:18:40 +0200
Message-ID: <20240529111844.13330-1-petrm@nvidia.com>

Let me quote the commit message of patch #2 here to explain the
motivation and some of the implementation:

    When calculating hashes for the purpose of multipath forwarding,
    both IPv4 and IPv6 code currently fall back on
    flow_hash_from_keys(). That uses a randomly-generated seed. That's a
    fine choice by default, but unfortunately some deployments may need
    tighter control over the seed used.

    In this patchset, make the seed configurable by adding a new sysctl
    key, net.ipv4.fib_multipath_hash_seed. This seed is used
    specifically for multipath forwarding and not for the other
    concerns that flow_hash_from_keys() is used for, such as queue
    selection. Expose the knob as sysctl because other such settings,
    such as headers to hash, are also handled that way.

    Despite being placed in the net.ipv4 namespace, the multipath seed
    sysctl is used for both IPv4 and IPv6, similarly to e.g. a number of
    TCP variables. Like those, the multipath hash seed is a per-netns
    variable.

    The new sysctl is added with permissions 0600 so that the seed is
    only readable and writable by root.

    The seed used by flow_hash_from_keys() is a 128-bit quantity.
    However, it seems that the seed is usually a much more modest value:
    32 bits seem typical (Cisco, Cumulus), and some systems go even lower.
    For that reason, and to decouple the user interface from
    implementation details, go with a 32-bit quantity, which is then
    quadruplicated to form the siphash key.
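
As an illustration of the last paragraph, here is a minimal userspace
sketch of how a 32-bit seed can be repeated four times to fill a 128-bit
key. It is not the kernel code: the function name expand_seed_to_key and
the little-endian byte order are assumptions made for this example only.

```python
import struct


def expand_seed_to_key(seed32: int) -> bytes:
    """Repeat a 32-bit seed four times to fill a 128-bit (16-byte) key.

    Hypothetical helper for illustration; the byte order is an assumption.
    """
    if not 0 <= seed32 <= 0xFFFFFFFF:
        raise ValueError("seed must fit in 32 bits")
    # One 4-byte word, repeated four times: 4 * 32 bits = 128 bits.
    return struct.pack("<I", seed32) * 4


key = expand_seed_to_key(0xDEADBEEF)
print(len(key) * 8)  # 128
```

The user-visible value stays 32 bits wide, so the width of the key used
internally can change later without affecting the interface.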

One use of this interface is avoiding hash polarization,
where two ECMP routers, one behind the other, happen to make consistent
hashing decisions, and as a result, part of the ECMP space of the latter
router is never used. Another is a load balancer where several machines
forward traffic to one of a number of leaves, and the forwarding
decisions need to be made consistently. (This is a case of a desired
hash polarization, mentioned e.g. in chapter 6.3 of [0].)
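
The polarization effect described above can be reproduced with a small
simulation. This is a sketch under stated assumptions, not the kernel
algorithm: keyed BLAKE2b stands in for siphash, the nexthop is chosen by
a plain modulo, and the names pick_nexthop and second_hop_usage are
hypothetical. Router R2 sits behind nexthop 0 of router R1, and each
router has two nexthops.

```python
import hashlib


def pick_nexthop(flow: bytes, seed32: int, n_nexthops: int) -> int:
    """Pick a nexthop index from a keyed hash of the flow (stand-in hash)."""
    key = seed32.to_bytes(4, "little") * 4  # 32-bit seed repeated to 128 bits
    digest = hashlib.blake2b(flow, key=key, digest_size=4).digest()
    return int.from_bytes(digest, "little") % n_nexthops


def second_hop_usage(seed_r1: int, seed_r2: int, n_flows: int = 1000) -> set:
    """Return the set of R2 nexthops used by flows that R1 forwards to R2."""
    used = set()
    for i in range(n_flows):
        flow = f"10.0.0.1:{i}->10.0.1.1:80".encode()
        if pick_nexthop(flow, seed_r1, 2) == 0:  # R1 sends this flow to R2
            used.add(pick_nexthop(flow, seed_r2, 2))
    return used


print(second_hop_usage(1, 1))  # same seed: R2 only ever uses nexthop 0
print(second_hop_usage(1, 2))  # different seeds: both R2 nexthops are used
```

With equal seeds, every flow that reaches R2 has already hashed to
nexthop 0 on R1, so R2 repeats that decision and its second nexthop
stays idle. Giving R2 a different seed makes its choice independent of
R1's again.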

A hash seed control interface has been proposed before [1]. This
patchset uses broadly the same ideas, but limits the externally visible
seed size to 32 bits.

- Patches #1-#2 contain the substance of the work
- Patch #3 is an mlxsw offload
- Patch #4 is a selftest

[0] https://www.usenix.org/system/files/conference/nsdi18/nsdi18-araujo.pdf
[1] https://lore.kernel.org/netdev/YIlVpYMCn%2F8WfE1P@rnd/

Petr Machata (4):
  net: ipv4,ipv6: Pass multipath hash computation through a helper
  net: ipv4: Add a sysctl to set multipath hash seed
  mlxsw: spectrum_router: Apply user-defined multipath hash seed
  selftests: forwarding: router_mpath_hash: Add a new selftest

 Documentation/networking/ip-sysctl.rst        |  10 +
 .../ethernet/mellanox/mlxsw/spectrum_router.c |  14 +-
 include/net/flow_dissector.h                  |   2 +
 include/net/ip_fib.h                          |  24 ++
 include/net/netns/ipv4.h                      |  10 +
 net/core/flow_dissector.c                     |   7 +
 net/ipv4/route.c                              |  12 +-
 net/ipv4/sysctl_net_ipv4.c                    |  82 +++++
 net/ipv6/route.c                              |  12 +-
 .../testing/selftests/net/forwarding/Makefile |   1 +
 .../net/forwarding/router_mpath_seed.sh       | 322 ++++++++++++++++++
 11 files changed, 482 insertions(+), 14 deletions(-)
 create mode 100755 tools/testing/selftests/net/forwarding/router_mpath_seed.sh

-- 
2.45.0



Thread overview: 22+ messages
2024-05-29 11:18 Petr Machata [this message]
2024-05-29 11:18 ` [PATCH net-next 1/4] net: ipv4,ipv6: Pass multipath hash computation through a helper Petr Machata
2024-05-29 11:18 ` [PATCH net-next 2/4] net: ipv4: Add a sysctl to set multipath hash seed Petr Machata
2024-05-31  1:00   ` Jakub Kicinski
2024-06-02 11:15     ` Ido Schimmel
2024-06-03  6:51       ` Nicolas Dichtel
2024-06-03  9:51     ` Petr Machata
2024-06-03 11:37       ` Petr Machata
2024-06-01  8:46   ` Eric Dumazet
2024-06-03  7:29     ` Toke Høiland-Jørgensen
2024-06-03  8:25       ` Eric Dumazet
2024-06-03  8:58         ` Toke Høiland-Jørgensen
2024-06-03 13:53           ` Paul E. McKenney
2024-06-03  9:50     ` Petr Machata
2024-05-29 11:18 ` [PATCH net-next 3/4] mlxsw: spectrum_router: Apply user-defined " Petr Machata
2024-05-29 11:18 ` [PATCH net-next 4/4] selftests: forwarding: router_mpath_hash: Add a new selftest Petr Machata
2024-05-29 19:57 ` [PATCH net-next 0/4] Allow configuration of multipath hash seed Nikolay Aleksandrov
2024-05-30 15:25   ` Petr Machata
2024-05-30 17:27     ` Nikolay Aleksandrov
2024-05-30 18:07       ` Nikolay Aleksandrov
2024-05-30 21:34         ` Nikolay Aleksandrov
2024-06-03  9:21           ` Petr Machata
