From: David Ahern <dsahern@gmail.com>
To: Ido Schimmel <idosch@idosch.org>
Cc: "Yi Yang (杨燚)-云服务集团" <yangyi01@inspur.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"nikolay@cumulusnetworks.com" <nikolay@cumulusnetworks.com>
Subject: Re: 答复: [PATCH] can current ECMP implementation support consistent hashing for next hop?
Date: Thu, 6 Aug 2020 10:45:52 -0600
Message-ID: <3c965294-fe7d-3893-e9d9-3354ff508731@gmail.com>
In-Reply-To: <20200802144959.GA2483264@shredder>

On 8/2/20 8:49 AM, Ido Schimmel wrote:
> On Thu, Jun 11, 2020 at 10:36:59PM -0600, David Ahern wrote:
>> On 6/11/20 6:32 PM, Yi Yang (杨燚)-云服务集团 wrote:
>>> David, thank you so much for confirming that it can't. I did read your Cumulus document before; resilient hashing is fine when a next hop is removed, but it still has the same issue when a new next hop is added. I know most of the kernel code in Cumulus Linux is already in the upstream kernel, so I'm wondering why you didn't push resilient hashing upstream.
>>>
>>> I think consistent hashing is a must-have for a commercial load balancing solution; otherwise it is basically nonsense. Does Cumulus Linux have a consistent hashing solution?
>>>
>>> Is "- replacing nexthop entries as LB's come and go" the stuff that https://docs.cumulusnetworks.com/cumulus-linux/Layer-3/Equal-Cost-Multipath-Load-Sharing-Hardware-ECMP/#resilient-hashing is showing? It can't ensure that a flow is distributed to the right backend server if a new next hop is added.
>>
>> I do not believe it is a problem to be solved in the kernel.
>>
>> If you follow the *intent* of the Cumulus document: what is the maximum
>> number of load balancers you expect to have? 16? 32? 64? Define an ECMP
>> route with that number of nexthops and fill in the weighting that meets
>> your needs. When an LB is added or removed, you decide what the new set
>> of paths is that maintains N-total paths with the distribution that
>> meets your needs.
> 
> I recently started looking into consistent hashing and I wonder if it
> can be done with the new nexthop API while keeping all the logic in user
> space (e.g., FRR).
> 
> The only extension that might be required from the kernel is a new
> nexthop attribute that indicates when a nexthop was last used.

The only potential problem that comes to mind is that a nexthop can be
used by multiple prefixes.

But I'm not sure I follow what the last-used indicator gives you for
maintaining flows as a group is updated.

> User space can then use it to understand which nexthops to replace when
> a new nexthop is added and when to perform the replacement. In case the
> nexthops are offloaded, it is possible for the driver to periodically
> update the nexthop code about their activity.
> 
> Below is a script that demonstrates the concept with the example in the
> Cumulus documentation. I chose to replace the individual nexthops
> instead of creating new ones and then replacing the group.

That is one of the features ... a group points to individual nexthops
and those can be atomically updated without affecting the group.
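
To illustrate the indirection, here is a toy Python model. It is not
kernel code and not the script from the thread; the table layout, the
addresses and the helper names are all made up for the example. The
group only stores nexthop IDs, so changing what an ID resolves to
leaves the group itself untouched (loosely what replacing a single
nexthop object by ID does).

```python
# Toy model (hypothetical names and addresses, not kernel data structures).
# A group holds nexthop IDs; each ID resolves through a separate table.
nexthops = {1: "192.0.2.1", 2: "192.0.2.2", 3: "192.0.2.3"}  # id -> gateway
group = [1, 2, 3, 1, 2, 3]                                    # bucket -> nexthop id


def replace_nexthop(nh_id, gateway):
    """Change what an existing nexthop ID resolves to; the group is not edited."""
    if nh_id not in nexthops:
        raise KeyError(f"unknown nexthop id {nh_id}")
    nexthops[nh_id] = gateway


def lookup(flow_hash):
    """Pick a bucket by hash, then resolve its nexthop ID to a gateway."""
    return nexthops[group[flow_hash % len(group)]]
```

In this model, flows hashing to buckets that hold ID 2 follow the new
gateway after replace_nexthop(2, ...), while every other bucket keeps
resolving exactly as before.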

> 
> It is obviously possible to create larger groups to reduce the impact on
> existing flows when a new nexthop is added.
> 
> WDYT?

This is in line with my earlier responses, and your script shows an
example of how to manage it. Combine it with the active-backup patch set
and you handle device events too (avoid disrupting the size of the group
on device events).
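
For reference, the "keep N total paths and decide the new set in user
space" idea could be sketched roughly as below. This is only a minimal
Python sketch under my own assumptions: equal weights, backends named
by plain strings, and a hypothetical rebalance() helper. It keeps the
bucket count fixed and rewrites only the buckets it has to.

```python
from collections import Counter


def rebalance(buckets, backends):
    """Return a same-length bucket list over `backends`, changing few slots.

    Assumes equal weights; a weighted variant would only change `target`.
    """
    if not backends:
        raise ValueError("need at least one backend")
    base, extra = divmod(len(buckets), len(backends))
    counts = Counter(b for b in buckets if b in backends)
    # Backends that already own the most buckets get any leftover slots,
    # so fewer existing buckets have to be taken away from them.
    order = sorted(backends, key=lambda b: -counts[b])
    target = {b: base + (1 if i < extra else 0) for i, b in enumerate(order)}

    new, kept, free = list(buckets), Counter(), []
    for i, b in enumerate(buckets):
        if b in target and kept[b] < target[b]:
            kept[b] += 1       # bucket keeps its current backend
        else:
            free.append(i)     # owner is gone or over its share
    # Hand the freed buckets to backends that are below their share.
    need = [b for b in order for _ in range(target[b] - kept[b])]
    for i, b in zip(free, need):
        new[i] = b
    return new
```

With 12 buckets spread evenly over A, B, C and D, removing C rewrites
only C's three buckets, and adding a new backend E afterwards moves
three buckets to E. Flows in those moved buckets are still disrupted;
a larger group just makes each moved bucket a smaller share of traffic.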

