From: Jay Vosburgh <fubar@us.ibm.com>
To: Veaceslav Falico <vfalico@redhat.com>
Cc: netdev@vger.kernel.org, Andy Gospodarek <andy@greyhouse.net>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used
Date: Wed, 15 Jan 2014 21:09:57 -0800 [thread overview]
Message-ID: <15764.1389848997@death.nxdomain> (raw)
In-Reply-To: <1389837916-5377-1-git-send-email-vfalico@redhat.com>
Veaceslav Falico <vfalico@redhat.com> wrote:
>Currently, if arp_validate is off (0), slave_last_rx() returns the
>slave->dev->last_rx, which is always updated on *any* packet received by
>slave, and not only arps. This means that, if the validation of arps is
>off, we're treating *any* incoming packet as a proof of slave being up, and
>not only arps.
The "any incoming packet" part is intentional.
>This might seem logical at the first glance, however it can cause a lot of
>troubles and false-positives, one example would be:
>
>The arp_ip_target is NOT accessible, however someone in the broadcast domain
>spams with any broadcast traffic. This way bonding will be tricked that the
>slave is still up (as in - can access arp_ip_target), while it's not.
This type of situation is why arp_validate was added.
The specific situation was when multiple hosts using bonding
with the ARP monitor were set up behind a common gateway (in the same
Ethernet broadcast domain). The arp_ip_target is unreachable for
whatever reason. In that case, the various bonding instances on the
different hosts will each issue broadcast ARP requests, and (in the
absence of arp_validate) those requests would trick the other bonds into
believing that they are up.
I don't think this patch set will resolve that problem, since
you explicitly permit any incoming ARP to count.
>The documentation for arp_validate also states that *ARPs* will (not) be
>validated if it's on/off, and that the arp monitoring works on arps as
>traffic generators.
I wrote most of that text in the documentation, and the intent
was not to imply that only ARPs should count for "up-ness" even without
arp_validate enabled. The intent was to distinguish it from
"non-validate," in which any incoming traffic counted for "up-ness."
The main reason for preserving the non-validate behavior (any
traffic counts) is for the loadbalance (xor and rr) modes. In those
modes, the switch decides which slave receives the incoming traffic, and
so it's to our advantage to permit any incoming traffic to count for
"up-ness." The arp_validate option is not allowed in these modes
because it won't work.
With these changes, I suspect that the loadbalance ARP monitor
will be less reliable with these changes (granted that it's already a
bit dodgy in its dependence on the switch to hit all slaves with
incoming packets regularly). Particularly if the switch ports are
configured into an Etherchannel ("static link aggregation") group, in
which case only one slave will receive any given frame (broadcast /
multicast traffic will not be duplicated across all slaves).
I'm not sure that this change (the "only count ARPs even without
arp_validate" bit) won't break existing configurations. Did you test
the -rr and -xor modes with ARP monitor after your changes (with and
without configuring a channel group on the switch ports)?
>Also, the net_device->last_rx is already used in a lot of drivers (even
>though the comment states to NOT do it :)), and it's also ugly to modify it
>from bonding.
I didn't check, but I suspect those are mostly leftovers from
the distant past, when the drivers were expected to update last_rx, or
perhaps drivers using it for their own purposes.
I don't really see an issue in decoupling bonding from the
net_device->last_rx; it's pretty much the same thing that was done for
trans_start some time ago.
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
>So, to fix this, remove the last_rx from bonding, *always* call
>bond_arp_rcv() in slave's rx_handler (bond_handle_frame), and if we spot an
>arp there - update the slave->last_arp_rx - and use it instead of
>net_device->last_rx. Finally, rename slave_last_rx() to slave_last_arp_rx()
>to reflect the changes.
>
>As the changes touch really sensitive parts, I've tried to split them as
>much as possible, for easier debugging/bisecting.
>
>CC: Jay Vosburgh <fubar@us.ibm.com>
>CC: Andy Gospodarek <andy@greyhouse.net>
>CC: "David S. Miller" <davem@davemloft.net>
>CC: netdev@vger.kernel.org
>Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
>
>---
> drivers/net/bonding/bond_main.c | 18 ++++++++----------
> drivers/net/bonding/bond_options.c | 12 ++----------
> drivers/net/bonding/bonding.h | 16 ++++++----------
> include/linux/netdevice.h | 8 +-------
> 4 files changed, 17 insertions(+), 37 deletions(-)
>
next prev parent reply other threads:[~2014-01-16 5:10 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-01-16 2:05 [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 1/6] bonding: always update last_arp_rx on arp recieve Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 2/6] bonding: always set recv_probe to bond_arp_rcv in arp monitor Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 3/6] bonding: use last_arp_rx in slave_last_rx() Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 4/6] bonding: rename slave_last_rx() to slave_last_arp_rx() Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 5/6] bonding: use last_arp_rx in bond_loadbalance_arp_mon() Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 6/6] bonding: remove useless updating of slave->dev->last_rx Veaceslav Falico
2014-01-16 5:09 ` Jay Vosburgh [this message]
2014-01-16 6:01 ` [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used David Miller
2014-01-17 8:02 ` Veaceslav Falico
2014-01-16 8:41 ` Veaceslav Falico
2014-01-16 22:38 ` Jay Vosburgh
2014-01-17 6:57 ` Veaceslav Falico
2014-01-17 17:07 ` Veaceslav Falico
2014-01-16 5:53 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=15764.1389848997@death.nxdomain \
--to=fubar@us.ibm.com \
--cc=andy@greyhouse.net \
--cc=davem@davemloft.net \
--cc=netdev@vger.kernel.org \
--cc=vfalico@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).