From: Jay Vosburgh <fubar@us.ibm.com>
To: Nicolas de Pesloüan <nicolas.2p.debian@gmail.com>
Cc: "Oleg V. Ukhno" <olegu@yandex-team.ru>,
	John Fastabend <john.r.fastabend@intel.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [PATCH] bonding: added 802.3ad round-robin hashing policy for single TCP session balancing
Date: Wed, 02 Feb 2011 09:57:33 -0800
Message-ID: <32505.1296669453@death>
In-Reply-To: <4D4929BB.2000403@gmail.com>

Nicolas de Pesloüan <nicolas.2p.debian@gmail.com> wrote:

>On 29/01/2011 03:28, Jay Vosburgh wrote:
>> 	I've thought about this whole thing, and here's what I view as
>> the proper way to do this.
>>
>> 	In my mind, this proposal is two separate pieces:
>>
>> 	First, a piece to make round-robin a selectable hash for
>> xmit_hash_policy.  The documentation for this should follow the pattern
>> of the "layer3+4" hash policy, in particular noting that the new
>> algorithm violates the 802.3ad standard in exciting ways, will result in
>> out of order delivery, and that other 802.3ad implementations may or may
>> not tolerate this.
>>
>> 	Second, a piece to make certain transmitted packets use the
>> source MAC of the sending slave instead of the bond's MAC.  This should
>> be a separate option from the round-robin hash policy.  I'd call it
>> something like "mac_select" with two values: "default" (what we do now)
>> and "slave_src_mac" to use the slave's real MAC for certain types of
>> traffic (I'm open to better names; that's just what I came up with while
>> writing this).  I believe that "certain types" means "everything but
>> ARP," but might be "only IP and IPv6."  Structuring the option in this
>> manner leaves the option open for additional selections in the future,
>> which a simple "on/off" option wouldn't.  This option should probably
>> only affect a subset of modes; I'm thinking anything except balance-tlb
>> or -alb (because they do funky MAC things already) and active-backup (it
>> doesn't balance traffic, and already uses fail_over_mac to control
>> this).  I think this option also needs a whole new section down in the
>> bottom explaining how to exploit it (the "pick special MACs on slaves to
>> trick switch hash" business).
>>
>> 	Comments?
>
>Looks really sensible to me.
>
>I just propose the following option and option values: "src_mac_select"
>(instead of mac_select), with "default" and "slave_mac" (instead of
>slave_src_mac) as possible values. In the future, we might need a
>"dst_mac_select" option... :-)

	I originally thought of using the nomenclature you propose; I
ended up with a single option in order to minimize the number of tunable
knobs that bonding has (so, dst_mac would be a setting for mac_select).
That works as long as there aren't a lot of settings that would be
turned on simultaneously; otherwise each combination would have to be a
separate value, or the options parser would have to handle multiple
settings (e.g., mac_select=src+dst or something like that).
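
	To illustrate the "parser handles multiple settings" alternative,
here is a minimal userspace sketch of parsing a '+'-separated value into
a bitmask.  Everything in it is hypothetical: the flag names, the "src"
and "dst" tokens, and parse_mac_select() are made up for illustration and
are not part of the bonding driver or its options code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits; none of these names exist in the bonding driver. */
#define MAC_SELECT_DEFAULT 0u
#define MAC_SELECT_SRC     (1u << 0)
#define MAC_SELECT_DST     (1u << 1)

/*
 * Parse a '+'-separated value such as "src+dst" into a bitmask.
 * Returns 0 on success and stores the mask in *out; returns -1 on an
 * unknown or empty token, or if "default" is combined with anything else.
 */
static int parse_mac_select(const char *value, unsigned int *out)
{
	unsigned int mask = 0;
	int saw_default = 0;
	int ntokens = 0;
	const char *p = value;

	while (1) {
		size_t len = strcspn(p, "+");	/* length of the current token */

		if (len == 3 && strncmp(p, "src", 3) == 0)
			mask |= MAC_SELECT_SRC;
		else if (len == 3 && strncmp(p, "dst", 3) == 0)
			mask |= MAC_SELECT_DST;
		else if (len == 7 && strncmp(p, "default", 7) == 0)
			saw_default = 1;
		else
			return -1;	/* unknown or empty token */

		ntokens++;
		if (p[len] == '\0')
			break;
		p += len + 1;	/* skip past the '+' */
	}

	if (saw_default && ntokens > 1)
		return -1;	/* "default" cannot be combined */

	*out = mask;
	return 0;
}
```

	With this sketch, "src+dst" yields both bits set, "default" yields
an empty mask, and inputs such as "default+src" or "src+" are rejected.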

	Anyway, after thinking about it some more, in the long run it's
probably safer to separate these two, so, Oleg, use the above naming
("src_mac_select" with "default" and "slave_mac").

>Also, are there any risks that this kind of session load-balancing won't
>properly cooperate with multiqueue (as explained in "Overriding
>Configuration for Special Cases" in Documentation/networking/bonding.txt)?
>I think it is important to ensure we keep the ability to fine tune the
>egress path selection.

	I think the logic for the mac_select (or src_mac_select or
whatever) just has to be done last, after the slave selection is done by
the multiqueue stuff.  That's probably a good tidbit to put in the
documentation as well.
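
	As a rough sketch of that ordering, assuming heavily simplified
stand-ins for the driver's structures: the slave is chosen first (with a
per-queue override taking precedence over the hash), and only then is the
source MAC set from whichever slave was chosen.  The struct layouts, the
queue_id-to-slave mapping, and the function names below are assumptions
for illustration only, not the real bonding code; the "everything but
ARP" test is just one of the readings mentioned earlier in the thread.

```c
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct slave {
	unsigned char mac[ETH_ALEN];
};

struct bond {
	unsigned char mac[ETH_ALEN];
	struct slave *slaves;
	int nslaves;
	int slave_mac;		/* nonzero: src_mac_select=slave_mac */
};

struct pkt {
	unsigned char src[ETH_ALEN];
	int queue_id;		/* 0: no per-queue override requested */
	unsigned int hash;	/* result of the xmit hash policy */
	int is_arp;
};

/*
 * Step 1: pick the slave.  In this sketch a nonzero queue_id N simply
 * maps to slave index N - 1 and overrides the hash.
 */
static int select_slave(const struct bond *b, const struct pkt *p)
{
	if (p->queue_id > 0 && p->queue_id <= b->nslaves)
		return p->queue_id - 1;
	return (int)(p->hash % (unsigned int)b->nslaves);
}

/* Step 2, done last: set the source MAC based on the chosen slave. */
static int bond_xmit(const struct bond *b, struct pkt *p)
{
	int idx = select_slave(b, p);

	if (b->slave_mac && !p->is_arp)
		memcpy(p->src, b->slaves[idx].mac, ETH_ALEN);
	else
		memcpy(p->src, b->mac, ETH_ALEN);
	return idx;
}
```

	Because the MAC is written only after select_slave() returns, a
queue override changes both the egress slave and the source MAC that
goes with it, so the two features can't disagree.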

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
