From: Jay Vosburgh <fubar@us.ibm.com>
To: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Cc: netdev@vger.kernel.org, bonding-devel@lists.sourceforge.net,
	davem@davemloft.net, Jiri Pirko <jpirko@redhat.com>
Subject: Re: [Bonding-devel] [PATCH net-next-2.6] bonding: introduce primary_lazy option
Date: Tue, 25 Aug 2009 11:41:21 -0700
Message-ID: <16215.1251225681@death.nxdomain.ibm.com>
In-Reply-To: <4A941FD9.6050304@free.fr>

Nicolas de Pesloüan <nicolas.2p.debian@free.fr> wrote:

>Jiri Pirko wrote:
>> Mon, Aug 24, 2009 at 07:35:17PM CEST, fubar@us.ibm.com wrote:
>>> 	I'm still unclear as to why it's better to add another special
>>> case option to bonding instead of changing this in user space, other
>>> than it'd be a change to user space (initscripts / sysconfig).
>>>
>>> 	The way I see it, this patch is adding a mechanism that says,
>>> effectively, "make slave X the active slave, but do it only once."
>>> There is already a way to do that in bonding (sysfs, as above, or
>>> ifenslave -c); I am reluctant to add another without good reason.
>> 
>> Hello Jay.
>> 
>> As I already replied once, it's not only about selecting a slave at the
>> start. It's also about the following:
>> 
>> Imagine you have bond with 3 slaves:
>> eth0            eth1            eth2
>> UP(curr)        UP              UP
>> DOWN            UP(curr)        UP
>> UP              UP(curr)        UP
>> UP              DOWN            UP(curr)
>> 
>> eth2 ends up being current active but we prefer eth0 (as primary interface).
>> This is not desirable and is solved by primary_lazy option.
>> 
>> Jirka
>> 
>>> 	I'm not necessarily against the "weight" business in general.
>>> For the purposes of this discussion, however, it's a big complex
>>> solution to a pretty simple problem, and the "weight" system still has
>>> to have special sauce added to it to handle this special case.
>>>
>>> 	Last, presuming for the moment that this goes forward as an
>>> option to bonding, I think this should be named something along the
>>> lines of "make_active" (or perhaps "make_active_once", but that's a bit
>>> long).  The option has the effect of making the specified slave the
>>> active slave one time, then the option setting is cleared.
>
>Hi Jay,
>
>From what I understand of Jirka's needs, the exact expected behaviors are:
>
>1/ If a slave is active, keep it active, even if the primary comes back up.
>2/ If the current slave just failed, choose the new active slave, giving 
>priority to the primary.
>
>Selecting the active slave at startup (by using ifenslave -c or writing into 
>/sys/class/net/bond0/bonding/active_slave) would solve 1, but not 2.

	Yah, I had missed step 2.  I'd still call it something other
than "lazy," though; "passive" sounds better to me.
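
	The two rules quoted above can be sketched as a small
simulation.  This is only an illustration of the behavior under
discussion, not the code from the patch; the function name
select_active and the dictionary of link states are assumptions made
up for the example.  It replays the three-slave sequence from Jiri's
table with eth0 as the primary.

```python
def select_active(current, link_up, primary):
    """Pick the active slave under the proposed rules (hypothetical helper).

    current -- name of the slave that is active now, or None
    link_up -- dict mapping slave name to True (link up) / False (link down)
    primary -- name of the preferred slave
    """
    # Rule 1: a working active slave stays active, even if the primary
    # has come back up in the meantime.
    if current is not None and link_up.get(current, False):
        return current
    # Rule 2: the active slave just failed, so give priority to the primary.
    if link_up.get(primary, False):
        return primary
    # Primary is down too: fall back to the first slave that has link.
    for name, up in link_up.items():
        if up:
            return name
    return None


primary = "eth0"
# Link states taken row by row from the table in the quoted message.
steps = [
    {"eth0": True, "eth1": True, "eth2": True},
    {"eth0": False, "eth1": True, "eth2": True},
    {"eth0": True, "eth1": True, "eth2": True},
    {"eth0": True, "eth1": False, "eth2": True},
]

current = None
history = []
for link_up in steps:
    current = select_active(current, link_up, primary)
    history.append(current)

print(history)
```

	In the last step eth1 fails while eth0 is up again, so the
sketch returns to eth0 rather than moving on to eth2, which is the
difference from the sequence shown in the table.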

>Also, I suggested changing 1 this way:
>
>1/ If a slave is active, keep it active, even if the primary comes back up, 
>*except if the speed of the primary is better than the speed of the active slave*.
>
>Thinking about all that, I am starting to feel that some sort of user space system to 
>select the "best" slave would be better. If we can design a NETLINK interface to 
>report events (slave up, slave down...) to user space, then any user space 
>daemon would be able to tell bonding what to do. Only if no process registers to 
>receive those events would bonding use the normal slave selection rules.

	This has been discussed more than once in the past, but hasn't
ever really gotten anywhere.  I suspect the main impediment is the lack
of a suitable API.

>Designing such a NETLINK interface would replace my proposed weight option (at 
>least for best slave selection in active-backup mode and for best aggregator 
>selection in 802.3ad mode). It would also solve the problem reported by Jirka 
>and so replace the proposed primary_lazy option.

	Yes, a lot of the decision making at failover could be moved
into a user space daemon.  The daemon, I think, should be optional; if
the basic selection policies are sufficient, then there's no need for a
trip to user space and back.
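
	The "optional daemon" idea can be sketched as a dispatcher that
consults a registered listener if there is one and otherwise applies
the built-in policy.  Everything here is hypothetical: no such netlink
API exists in this thread, and the names FailoverDispatcher,
builtin_policy, register and on_failover are invented purely to show
the fallback logic.

```python
from typing import Callable, Dict, Optional

# A policy receives the link state of every slave and names the new active one.
Policy = Callable[[Dict[str, bool]], Optional[str]]


def builtin_policy(link_up: Dict[str, bool]) -> Optional[str]:
    """Stand-in for the normal selection rules: first slave with link."""
    for name, up in link_up.items():
        if up:
            return name
    return None


class FailoverDispatcher:
    """Hypothetical model of 'ask user space only if someone is listening'."""

    def __init__(self) -> None:
        self._listener: Optional[Policy] = None

    def register(self, policy: Policy) -> None:
        self._listener = policy

    def unregister(self) -> None:
        self._listener = None

    def on_failover(self, link_up: Dict[str, bool]) -> Optional[str]:
        # No listener registered: no trip to user space, use the basic rules.
        if self._listener is None:
            return builtin_policy(link_up)
        choice = self._listener(link_up)
        # Accept the listener's answer only if it names a slave with link.
        if choice is not None and link_up.get(choice, False):
            return choice
        return builtin_policy(link_up)


links = {"eth0": False, "eth1": True, "eth2": True}
dispatcher = FailoverDispatcher()

no_daemon = dispatcher.on_failover(links)

dispatcher.register(lambda state: "eth2")
with_daemon = dispatcher.on_failover(links)

dispatcher.register(lambda state: "eth0")  # names a slave that is down
bad_answer = dispatcher.on_failover(links)

print(no_daemon, with_daemon, bad_answer)
```

	Falling back to the built-in rules when the listener gives an
unusable answer is a design choice of this sketch, not something
settled in the discussion.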

>Anyway, NETLINK is something that is supposed to come into bonding at some 
>point, because we know that the sysfs purists hate the sysfs bonding stuff and 
>that NETLINK is the target for setting up networking.

	I'm not a big fan of the sysfs API, either; it seemed like a
good idea at the time.  It's certainly better than ifenslave in terms of
features, but some of it is pretty convoluted, and there are things that
just can't be done from within sysfs.

	I recall seeing a note from Stephen Hemminger not too long ago
(a month or two ago) that he was working on a netlink API for bonding,
but I don't know how far that ever got.

	One question is, if a netlink API is implemented, whether to
convert ifenslave, or deprecate ifenslave and put the various bonding
functions into ip.

	If a netlink API is on the relatively near horizon (say, within
a few months), then I'm less inclined to put in the "lazy" option, since
it would just become baggage carried forward for the next several years
(until the sysfs API could be deprecated and removed).

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
