All of lore.kernel.org
 help / color / mirror / Atom feed
From: Joe Eykholt <jre@nuovasystems.com>
To: netdev@vger.kernel.org
Subject: Re: [RFC PATCH] [BONDING] Let non-IP L2 protocols receive on inactive slaves.
Date: Tue, 11 Mar 2008 18:39:43 -0700	[thread overview]
Message-ID: <47D7345F.9000900@nuovasystems.com> (raw)
In-Reply-To: <47D31973.7060905@nuovasystems.com>

Joe Eykholt wrote:
> Hi all,
> 
> I'd like to get some comments on a proposed patch to assist the
> use of L2 protocols in the presence of bonding.
> 
> This is for when active/backup mode is used to bond IP, but
> it is still appropriate to handle some L2 protocols (e.g.,
> LLDP and FCoE) on the backup (inactive) links.
> 
> The existing Linux bonding hook drops (almost) all traffic
> being received on the inactive slave interface except for
> ARP replies used by bonding itself to test connectivity.
> 
> Once we are past the bonding hook, netif_receive_skb()
> delivers the packet to all packet_type structures
> that match the following criteria:
>   1) The type field matches either ETH_P_ALL
>      or the received packet's type AND
>   2) the dev field is NULL or matches the (master)
>      interface dev pointer.
> 
> The packet_type.dev field is usually NULL.
> Almost all protocols in the kernel initialize the dev
> field to NULL and don't touch it.  The only exceptions
> are the bonding code itself and AF_PACKET during the bind
> operation.
> 
> My proposal is to tweak the semantics so that when the
> packet_type dev field is non-NULL and matches
> the slave device, we deliver the frame to the packet_type
> function even in the case where the bonding driver would've
> dropped it.
> This provides for any L2 protocol and user-level program
> using AF_PACKET bound to the individual slave interface.
> If AF_PACKET is not bound, it'll receive on all interfaces
> still except passive slaves, and will receive from active
> slaves as if the frame arrived on the master (the current
> behavior).
> 
> It's important to note that this is a potential change
> for user programs that bind to slave devices, since they
> wouldn't have received traffic while the slave was bound in
> the past.  I'm not sure how many such programs there might
> be so I'm unsure how concerned to be about this.
> This however is how I think it should work.
> 
> So, here is a patch that accomplishes this, just for
> discussion purposes.
> 
> The way this works is a tiny bit tricky.  Instead of testing
> ptype->dev == NULL, when checking for packet delivery,
> I use a pointer 'wild_card' which is NULL in most cases and
> test ptype->dev == wild_card.
> 
> This way the non-bonding case is unaffected.  In the bonding
> case, if skb_bond_should_drop() returns non-zero,
> the wild_card is set to dev, the original slave device,
> so only exact matches to the slave will pass.
> The packet_types with dev == NULL will not be matched.
> 
> If no packet_types match, the drop at the end of the
> routine takes care of dropping the packet.
> 
> In the non-drop case we still set dev = dev->master and
> skb->dev = dev.
> 
> I moved skb_bond() into netif_receive() because otherwise it'd
> need two return values, orig_dev and the wild_card.
> 
> The performance impact of this change should be neglegible.
> If bonding is off, the test is only slightly more expensive.
> If no protocols are registered with non-NULL devs, which is
> usually the case, the extra test in the ptype list is never
> reached.  In the bonding case, it does mean that if heavy
> traffic were arriving on the backup slave, more checks
> would be done before dropping the traffic.  That would
> be unusual.
> 
> This approach could be extended to remove some of the
> special cases in 'skb_bond_should_drop()' by having the
> bonding driver register packet_type matches for itself on
> the inactive slave.
> 
> There are related issues with MAC addresses, but I think
> there are good solutions to them, and I'd like to discuss
> that separately.
> 
> Please let me know what you think.
> 
>     Thanks,
>     Joe Eykholt
> 
> 
> [PATCH] Allow bond to receive some packets on the passive side.
> 
> commit 4e4fd41cbdeab598febe9782de76e745fbe1b7dc
> Author: Joe Eykholt <jre@nuovasystems.com>
> Date:   Wed Mar 5 18:32:37 2008 -0800
> 
>     Allow bond to receive some packets on inactive slaves.
> 
>     This is for protocols which do not wish to be bonded
>     or do not wish to have their packets dropped on the
>     passive side of an active/passive bond.
> 
>     Only a few cases currently use the packet_type dev match
>     to select one specific device.  All other packet_types
>     have dev == NULL.  There are two such cases in the
>     bonding code for ARP and LACP, and one for af_packet,
>     where it may bond to a specific interface.
> 
>     In these cases, we change the semantic to mean that packets
>     matching the type and device will still be received,
>     even on inactive slaves of bonds.
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index fcdf03c..cc70055 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1816,21 +1816,6 @@ int netif_rx_ni(struct sk_buff *skb)
> 
>  EXPORT_SYMBOL(netif_rx_ni);
> 
> -static inline struct net_device *skb_bond(struct sk_buff *skb)
> -{
> -       struct net_device *dev = skb->dev;
> -
> -       if (dev->master) {
> -               if (skb_bond_should_drop(skb)) {
> -                       kfree_skb(skb);
> -                       return NULL;
> -               }
> -               skb->dev = dev->master;
> -       }
> -
> -       return dev;
> -}
> -
> 
>  static void net_tx_action(struct softirq_action *h)
>  {
> @@ -2022,6 +2007,8 @@ out:
>  int netif_receive_skb(struct sk_buff *skb)
>  {
>         struct packet_type *ptype, *pt_prev;
> +       struct net_device *dev;
> +       struct net_device *wild_card;
>         struct net_device *orig_dev;
>         int ret = NET_RX_DROP;
>         __be16 type;
> @@ -2036,10 +2023,17 @@ int netif_receive_skb(struct sk_buff *skb)
>         if (!skb->iif)
>                 skb->iif = skb->dev->ifindex;
> 
> -       orig_dev = skb_bond(skb);
> -
> -       if (!orig_dev)
> -               return NET_RX_DROP;
> +       wild_card = NULL;
> +       dev = skb->dev;
> +       orig_dev = dev;
> +       if (dev->master) {
> +               if (skb_bond_should_drop(skb)) {
> +                       wild_card = dev;        /* deliver only exact match */
> +               } else {
> +                       dev = dev->master;
> +                       skb->dev = dev;
> +               }
> +       }
> 
>         __get_cpu_var(netdev_rx_stat).total++;
> 
> @@ -2059,7 +2053,7 @@ int netif_receive_skb(struct sk_buff *skb)
>  #endif
> 
>         list_for_each_entry_rcu(ptype, &ptype_all, list) {
> -               if (!ptype->dev || ptype->dev == skb->dev) {
> +               if (ptype->dev == wild_card || ptype->dev == skb->dev) {
>                         if (pt_prev)
>                                 ret = deliver_skb(skb, pt_prev, orig_dev);
>                         pt_prev = ptype;
> @@ -2084,7 +2078,8 @@ ncls:
>         list_for_each_entry_rcu(ptype,
>                         &ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
>                 if (ptype->type == type &&
> -                   (!ptype->dev || ptype->dev == skb->dev)) {
> +                   (ptype->dev == wild_card || ptype->dev == skb->dev ||
> +                    ptype->dev == orig_dev)) {
>                         if (pt_prev)
>                                 ret = deliver_skb(skb, pt_prev, orig_dev);
>                         pt_prev = ptype;
> 
> 
> 
> 
> --

I'm surprised there were no comments on this.  Does everyone like it?  (Seems unlikely).
Please take a look.

I'm resending this with the subject line changed to use BONDING instead of BOND.

	Thanks,
	Joe Eykholt

      reply	other threads:[~2008-03-12  1:39 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-08 22:55 [RFC PATCH] [BOND] Let non-IP L2 protocols receive on inactive slaves Joe Eykholt
2008-03-12  1:39 ` Joe Eykholt [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47D7345F.9000900@nuovasystems.com \
    --to=jre@nuovasystems.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.