From: Jay Vosburgh <fubar@us.ibm.com>
To: netdev@vger.kernel.org
Cc: Andy Gospodarek <andy@greyhouse.net>
Subject: [RFC v2 PATCH 0/2] bonding: generic netlink, multi-link mode
Date: Thu, 16 Dec 2010 14:53:21 -0800 [thread overview]
Message-ID: <1292540003-9465-1-git-send-email-fubar@us.ibm.com> (raw)
[ New and Improved, includes all the files this time... -J ]
These patches add support to bonding for generic netlink and a new
multi-link mode. At the moment, I'm looking primarily for discussion
about the generic netlink and implementation of multi-link.
First, in patch 1, is a generic netlink infrastructure for
bonding. This patch provides a "get mode" command and a "slave link state
change" asychnronous notification via a netlink multicast group. One long
term goal is to have bonding be controlled via netlink, both for
administrative purposes (add / remove slaves, etc) and policy (slave A is
better than slave B). I'd appreciate feedback from netlink savvy folks as
to whether this is the appropriate starting point.
Second, in patch 2, is the multi-link kernel code itself, which is
at present a work in progress. Here, I'm primarily looking for comments
regarding the control interface for this mode.
As implemented, this is a new mode to bonding, controlled via
generic netlink commands from a user space daemon. Slave assignment for
outgoing traffic is handled directly by bonding (the mapping table used by
multi-link is within bonding itself, and the usual transmit hash policy is
applied to the set of slaves allowable for a given destination).
In some private discussion with Andy, he suggested that this would
be better if it utilized the recently added queue mapping facility within
bonding, and then having the queue (and thus slave) assignments performed
at the qdisc level (via a tc filter) instead of within bonding itself.
This, I believe, would require a new tc filter that implements the ability
to set a skb queue_mapping in a hash (of protocol data in the packet) or
round robin fashion. In this case, the tc filter would also incorporate
all of the netlink functionality for communicating with the user space
daemon (to permit the mappings to be updated).
Thoughts?
Lastly, a description of the multi-link system itself. This is a
reimplementation of a load balancing scheme that has been available on AIX
for some time. It operates essentially as a load balancer by subnet, with
a UDP-based protocol to exchange multi-link topology information between
participating systems. Hosts participating in multi-link have IP
addresses in a separate subnet. Interfaces enslaved to multi-link do not
lose their assigned IP address information, and may also operate
separately from multi-link.
One notable feature is that multi-link provides load balancing
facilities for network devices that cannot change their MAC address, such
as Infiniband.
For example, given two systems as follows:
host A:
bond0 10.88.0.1/16
slave eth0 10.0.0.1/16
slave eth1 10.1.0.1/16
slave eth2 10.2.0.1/16
host B:
bond0 10.88.0.2/16
slave eth0 10.0.0.2/16
slave eth1 10.1.0.2/16
slave eth2 10.2.0.2/16
in this case, host A's bond0 running multi-link would load balance
traffic from 10.88.0.1 to 10.88.0.2 across eth0, eth1 and eth2. The user
space daemon negotiates the link set to use with other participating
hosts, and communicates that to the multi-link implementation.
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
next reply other threads:[~2010-12-16 22:53 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-12-16 22:53 Jay Vosburgh [this message]
2010-12-16 22:53 ` [PATCH v2 1/2] bonding: generic netlink infrastructure Jay Vosburgh
2010-12-16 23:12 ` Stephen Hemminger
2010-12-16 22:53 ` [PATCH v2 2/2] bonding: add multi-link mode Jay Vosburgh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1292540003-9465-1-git-send-email-fubar@us.ibm.com \
--to=fubar@us.ibm.com \
--cc=andy@greyhouse.net \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).