netdev.vger.kernel.org archive mirror
From: Jakub Sitnicki <jkbs@redhat.com>
To: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: netdev@vger.kernel.org, edumazet@google.com, davem@davemloft.net,
	roopa@cumulusnetworks.com, dsa@cumulusnetworks.com
Subject: Re: [PATCH net-next v2] net: ipv4: add support for ECMP hash policy choice
Date: Wed, 08 Mar 2017 17:00:05 +0100
Message-ID: <8760jjep2y.fsf@redhat.com>
In-Reply-To: <929a2609-51f8-c385-a727-3f819cf28b4f@cumulusnetworks.com>

On Wed, Mar 08, 2017 at 12:43 PM GMT, Nikolay Aleksandrov wrote:
> On 08/03/17 14:05, Jakub Sitnicki wrote:
>> On Tue, Mar 07, 2017 at 11:01 AM GMT, Nikolay Aleksandrov wrote:
>>> This patch adds support for ECMP hash policy choice via a new sysctl
>>> called fib_multipath_hash_policy and also adds support for L4 hashes.
>>> The current values for fib_multipath_hash_policy are:
>>>  0 - layer 3 (default)
>>>  1 - layer 4
>>> If there's an skb hash already set and it matches the chosen policy then it
>>> will be used instead of being calculated. The ICMP inner IP addresses use
>>> is removed.
>>>
>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>> ---
>>> v2:
>>>  - removed the output_key_hash as it's not needed anymore
>>>  - reverted to my original/internal patch with L3 as default hash
>> 
>> What about ICMP PTB (Fragmentation Needed) forwarding that makes PMTUD
>> work with ECMP in setups like described in RFC7690 [1]?
>> 
>>   ptb -> router ecmp -> next hop L4/L7 load balancer -> destination
>> 
>>        router --> load balancer 1 --->
>>             \\--> load balancer 2 ---> load-balanced service
>>              \--> load balancer N --->
>> 
>> Removing special treatment of ICMP errors will break it, won't it?
>> 
>
> Yes, I am aware and this decision was made with that in mind.
> We'd like to use the HW hash when available and IIRC that doesn't play well with
> special-casing ICMP errors for anycast as it may not match also. Another thing,
> again if I remember correctly, was that this behaviour is closer to how hardware
> handles ECMP.

OK, I wanted to make sure it is not an oversight that ECMP routing in
the IPv4 stack is being dumbed down to match the hardware behavior. I
thought that handling ICMP errors was an advantage we wanted to keep
over hardware routers. (To be fair, I should mention that we don't have
it in the IPv6 stack ATM.)
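
To illustrate why the special case matters for the RFC 7690 setup, here
is a minimal user-space sketch, not the kernel code. The struct pkt
type, mix2(), l3_hash() and pick_nexthop() are hypothetical names made
up for this example, and the FNV-1a mixing is only a stand-in for
whatever hash the stack really uses. The idea shown is the one under
discussion: for an ICMP error, hash on the addresses of the embedded
(inner) header, swapped, so that the error selects the same next hop as
the flow it refers to.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a forwarded IPv4 packet. */
struct pkt {
	uint32_t saddr, daddr;             /* outer IPv4 header */
	bool     is_icmp_error;            /* e.g. Fragmentation Needed */
	uint32_t inner_saddr, inner_daddr; /* header embedded in the error */
};

/* Stand-in mixing function (FNV-1a over the bytes of two words).
 * Assumption: any deterministic hash works for the illustration. */
static uint32_t mix2(uint32_t a, uint32_t b)
{
	const uint32_t w[2] = { a, b };
	uint32_t h = 2166136261u;          /* FNV-1a offset basis */

	for (int i = 0; i < 2; i++) {
		for (int s = 0; s < 32; s += 8) {
			h ^= (w[i] >> s) & 0xffu;
			h *= 16777619u;            /* FNV-1a prime */
		}
	}
	return h;
}

/* L3 hash. With icmp_aware set, an ICMP error is hashed on the inner
 * header with source and destination swapped, which reproduces the
 * (saddr, daddr) pair of the original forward-direction flow. */
static uint32_t l3_hash(const struct pkt *p, bool icmp_aware)
{
	if (icmp_aware && p->is_icmp_error)
		return mix2(p->inner_daddr, p->inner_saddr);
	return mix2(p->saddr, p->daddr);
}

/* Map a hash onto one of n equal-cost next hops. */
static unsigned int pick_nexthop(uint32_t hash, unsigned int n)
{
	return hash % n;
}
```

With a client flow C -> S and a PTB sent by some router R to S that
embeds the S -> C header, l3_hash(&flow, true) and l3_hash(&ptb, true)
are equal by construction, so both reach the same load balancer. With
icmp_aware false, the PTB is hashed on (R, S) and will generally land on
a different next hop.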

>
> One thing we can do is leave the current L3 behaviour with ICMP error handling
> and add a new L3 mode that tries to use the skb hash when available and doesn't
> care about the packet type.
>
> What do you think ?

Sounds good to me. It would be good to hear other opinions as well.

Thanks,
Jakub

Thread overview: 36+ messages
2017-03-06 14:59 [PATCH net-next] net: ipv4: add support for ECMP hash policy choice Nikolay Aleksandrov
2017-03-06 16:24 ` David Ahern
2017-03-06 16:52   ` Nikolay Aleksandrov
2017-03-07  6:16 ` Roopa Prabhu
2017-03-07 11:01 ` [PATCH net-next v2] " Nikolay Aleksandrov
2017-03-08 12:05   ` Jakub Sitnicki
2017-03-08 12:43     ` Nikolay Aleksandrov
2017-03-08 16:00       ` Jakub Sitnicki [this message]
2017-03-13  2:23         ` David Miller
2017-03-14 15:36   ` [PATCH net-next v3] " Nikolay Aleksandrov
2017-03-14 15:55     ` Stephen Hemminger
2017-03-14 15:58       ` Nikolay Aleksandrov
2017-03-14 18:48         ` David Miller
2017-03-14 20:25           ` Stephen Hemminger
2017-03-14 21:10             ` Roopa Prabhu
2017-03-14 21:42               ` Stephen Hemminger
2017-03-14 22:38                 ` Roopa Prabhu
2017-03-14 23:27                   ` Stephen Hemminger
2017-03-14 23:45                     ` David Ahern
2017-03-15  9:17                       ` Nicolas Dichtel
2017-03-15 10:46                         ` Nikolay Aleksandrov
2017-03-15 11:18                           ` Nicolas Dichtel
2017-03-15 11:27                             ` Nikolay Aleksandrov
2017-03-15 15:01                         ` David Ahern
2017-03-15 15:20                           ` Stephen Hemminger
2017-03-15  0:24             ` David Miller
2017-03-15  2:30               ` Tom Herbert
2017-03-17  3:36                 ` David Miller
2017-03-14 18:55     ` Nikolay Aleksandrov
2017-03-15 11:32     ` Jakub Sitnicki
2017-03-15 12:10       ` Nikolay Aleksandrov
2017-03-16 13:28   ` [PATCH net-next v4] " Nikolay Aleksandrov
2017-03-16 16:41     ` Stephen Hemminger
2017-03-16 16:49       ` Nikolay Aleksandrov
2017-03-17 10:06         ` Nikolay Aleksandrov
2017-03-21 22:28     ` David Miller
