From: Hannes Frederic Sowa <hannes@stressinduktion.org>
To: Tom Herbert <tom@herbertland.com>
Cc: Saeed Mahameed <saeedm@dev.mellanox.co.il>,
	Saeed Mahameed <saeedm@mellanox.com>,
	"David S. Miller" <davem@davemloft.net>,
	Linux Netdev List <netdev@vger.kernel.org>
Subject: Re: [pull request][net-next 0/3] Mellanox, mlx5 GRE tunnel offloads
Date: Mon, 04 Sep 2017 15:50:12 +0200
Message-ID: <87pob6bmjv.fsf@stressinduktion.org>
In-Reply-To: <CALx6S356vRYfcsN6FaRmpn2oC0GN9-JwLsF7ybC+w4sEGGsa7w@mail.gmail.com> (Tom Herbert's message of "Sun, 3 Sep 2017 08:43:27 -0700")

Tom Herbert <tom@herbertland.com> writes:

> An encapsulator sets the UDP source port to be the flow entropy of the
> packet being encapsulated. So when the packet traverses the network,
> devices can base their hash just on the canonical 5-tuple, which is
> sufficient for ECMP and RSS. IPv6 flow label is even better since the
> middleboxes don't even need to look at the transport header; a packet
> is steered based on the 3-tuple of addresses and flow label. In the
> Linux stack, udp_flow_src_port is used by UDP encapsulations to set
> the source port. Flow label is similarly set by ip6_make_flowlabel.
> Both of these functions use the skb->hash which is computed by calling
> flow dissector at most once per packet (if the packet was received
> with an L4 HW hash or locally originated on a connection the hash does
> not need to be computed).
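
For readers following along, the idea in the quoted paragraph can be
sketched in a few lines of Python. This is only an illustration of
deriving a UDP source port from a per-flow hash, not the kernel's
udp_flow_src_port code: the CRC32 hash, the function names and the
49152-65535 port range are all assumptions made for the example.

```python
import zlib


def flow_hash(src_ip, dst_ip, proto, src_port, dst_port):
    """Stand-in for skb->hash: a 32-bit hash over the inner 5-tuple.

    CRC32 is an assumption for this sketch; a real stack uses its own hash.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) & 0xFFFFFFFF


def entropy_src_port(hash32, port_min=49152, port_max=65535):
    """Scale a 32-bit flow hash into the half-open range [port_min, port_max).

    The port range is a hypothetical default chosen for the example.
    """
    hash32 &= 0xFFFFFFFF
    # Multiply-and-shift maps the 32-bit hash space onto the port range.
    return ((hash32 * (port_max - port_min)) >> 32) + port_min


# Every packet of the same inner flow gets the same outer source port.
inner = ("10.0.0.1", "10.0.0.2", 6, 40000, 443)
outer_sport = entropy_src_port(flow_hash(*inner))
```

Because the outer source port is a pure function of the inner flow,
devices that hash only the outer 5-tuple still keep a flow on one path.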

This would require the MPLS stack to copy the flow label of IPv6
connections between MPLS routers for their whole lifetime in the MPLS
network. The same holds for MPLS encapsulated inside UDP: the source
port needs to be kept constant. This is very impractical. Interim
routers often cannot recompute the hash for the flow label, because
they might lack knowledge of the upper-layer protocol.

UDP source port entropy still has the problem that network cards do not
use the source port as RSS input by default, because of possible
fragmentation and thus possible reordering of packets. GRE does not
have this problem and is much easier for hardware to identify.

Basically, we need to tell network cards the specific destination UDP
ports for which the source port may be used in the RSS hash
calculation. I don't see how this is any easier than just using GRE
with a defined protocol field. I do like the combination of IPv6
flow label + GRE.
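
To make the point about destination ports concrete, here is a rough
Python sketch of the per-packet decision a NIC would have to make. It
is not any driver's or device's real logic: the OuterHeader type, the
rss_fields function and the port allow-list are hypothetical, and it
assumes fragments and ordinary UDP fall back to address-only hashing.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

IPPROTO_UDP = 17


@dataclass(frozen=True)
class OuterHeader:
    """Hypothetical summary of the outer headers of a received packet."""
    src_ip: str
    dst_ip: str
    proto: int
    src_port: Optional[int] = None
    dst_port: Optional[int] = None
    is_fragment: bool = False


def rss_fields(pkt: OuterHeader,
               tunnel_dst_ports: FrozenSet[int] = frozenset()) -> Tuple:
    """Return the fields that would be fed into the RSS hash."""
    fields = (pkt.src_ip, pkt.dst_ip)
    if pkt.is_fragment:
        # Ports may be missing from fragments, so hash addresses only.
        return fields
    if pkt.proto == IPPROTO_UDP and pkt.dst_port in tunnel_dst_ports:
        # Allow-listed tunnel port: the source port carries flow entropy.
        return fields + (pkt.src_port, pkt.dst_port)
    # Default for everything else in this sketch: addresses only.
    return fields


# 4789 (VXLAN) is used here purely as an example of an allow-listed port.
tunnel = OuterHeader("192.0.2.1", "192.0.2.2", IPPROTO_UDP, 50000, 4789)
print(rss_fields(tunnel, frozenset({4789})))
```

The allow-list is the extra per-device state that has to be configured;
a GRE packet would instead be recognised from the IP protocol number
alone.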

Btw, people are using the GRE key as additional entropy without looking
into the GRE payload.
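
As an illustration of reading the key without touching the payload, a
minimal Python sketch follows. It assumes the GRE header layout of
RFC 2784/2890 (flags and version, protocol type, optional checksum
word, optional key); the function name gre_key is made up for the
example.

```python
import struct

GRE_CSUM = 0x8000  # C bit: a checksum/reserved word follows the base header
GRE_KEY = 0x2000   # K bit: a 32-bit key field is present


def gre_key(gre_header: bytes):
    """Return the 32-bit GRE key, or None if the K bit is not set."""
    if len(gre_header) < 4:
        raise ValueError("truncated GRE header")
    flags, _proto = struct.unpack_from("!HH", gre_header, 0)
    if not flags & GRE_KEY:
        return None
    offset = 4
    if flags & GRE_CSUM:
        offset += 4  # skip the checksum and reserved fields
    if len(gre_header) < offset + 4:
        raise ValueError("truncated GRE key field")
    (key,) = struct.unpack_from("!I", gre_header, offset)
    return key


# Header with only the K bit set, carrying IPv4 (0x0800) and key 0x2A.
header = struct.pack("!HHI", GRE_KEY, 0x0800, 0x2A)
assert gre_key(header) == 0x2A
```

The key sits at a fixed offset that depends only on the flag bits, so
it can be mixed into a hash without parsing the encapsulated packet.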

Bye,
Hannes

