Re: [PATCH net-next 2/2] bonding: Simplify the xmit function for modes that use xmit_hash

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Nikolay Aleksandrov <nikolay@redhat.com>
To: Mahesh Bandewar <maheshb@google.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>,
	Veaceslav Falico <vfalico@redhat.com>,
	Andy Gospodarek <andy@greyhouse.net>,
	David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	Maciej Zenczykowski <maze@google.com>
Subject: Re: [PATCH net-next 2/2] bonding: Simplify the xmit function for modes that use xmit_hash
Date: Fri, 05 Sep 2014 13:26:11 +0200	[thread overview]
Message-ID: <54099DD3.20109@redhat.com> (raw)
In-Reply-To: <CAF2d9ji2Dsi9dr6B0HKy4oQ841CN-00zWcOxCa+oTyPp79hOHA@mail.gmail.com>

On 05/09/14 02:10, Mahesh Bandewar wrote:
> On Thu, Sep 4, 2014 at 6:16 AM, Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>> On 03/09/14 23:47, Mahesh Bandewar wrote:
>>>
>>> Earlier change to use usable slave array for TLB mode had an additional
>>> performance advantage. So extending the same logic to all other modes
>>> that use xmit-hash for slave selection (viz 802.3AD, and XOR modes).
>>> Also consolidating this with the earlier TLB change.
>>>
>>> The main idea is to build the usable slaves array in the control path
>>> and use that array for slave selection during xmit operation.
>>>
>>> Measured performance in a setup with a bond of 4x1G NICs with 200
>>> instances of netperf for the modes involved (3ad, xor, tlb)
>>> cmd: netperf -t TCP_RR -H <TargetHost> -l 60 -s 5
>>>
>>> Mode        TPS-Before   TPS-After
>>>
>>> 802.3ad   : 468,694      493,101
>>> TLB (lb=0): 392,583      392,965
>>> XOR       : 475,696      484,517
>>>
>>> Signed-off-by: Mahesh Bandewar <maheshb@google.com>
>>> ---
>>
<<<<<snip>>>>>>
>>> -       bond_xmit_slave_id(bond, skb, bond_xmit_hash(bond, skb) %
>>> bond->slave_cnt);
>>> +       old_arr = rcu_dereference_protected(bond->slave_arr,
>>> +                                           lockdep_rtnl_is_held() ||
>>> +                                           lockdep_is_held(&bond->lock)
>>> ||
>>> +
>>> lockdep_is_held(&bond->curr_slave_lock));
>>
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> This line is the most troublesome for me, which lock is it ? Does this mean
>> that whichever I hold from the three I can update the slave array ?
>> I don't think this is worked out well, you should explicitly specify how and
>> why it is safe to update this under each of the locks and maybe you'll be
>> able to reduce the lock list :-)
>>
> This is primarily because of different code paths it's taking to reach
> here. In all these cases, one of those locks is held. Unfortunately
> there are three such locks  that I have identified (for all three
> modes involved) and hence the above line.
>

True, but I did a little grepping and here's my analysis of the call sites which 
I can't guarantee is full or complete, but it shows at least 1 problem.
bond_update_slave_arr() callers:

1. 3ad mode
1.1. bond_3ad_state_machine_handler -> ad_mux_machine -> 
ad_(en|dis)able_collecting_distributing
  - read_lock(bond->lock), rcu_read_lock, state_machine_lock
1.2. __bond_release_one -> bond_3ad_unbind_slave
  - rtnl, write_lock(bond->lock)
1.3. bond_change_active_slave -> bond_3ad_handle_link_change
  -  from 4. rtnl, new_active != NULL -> write_lock(curr_slave_lock)
1.4. bond_miimon_commit -> bond_3ad_handle_link_change
  - rtnl

2. TLB
2.1. __bond_release_one -> bond_alb_deinit_slave
  - rtnl
2.2. bond_change_active_slave -> bond_alb_handle_link_change
  - from 4. rtnl, new_active != NULL -> write_lock(curr_slave_lock)
2.3. bond_miimon_commit -> bond_alb_handle_link_change
  - rtnl

3. XOR
3.1. __bond_release_one
  - rtnl
3.2. bond_miimon_commit
  - rtnl

4. bond_change_active_slave:
1. bond_select_active_slave -> bond_change_active_slave
1.1. bond_enslave -> bond_select_active_slave
  - rtnl, write_lock(curr_slave_lock)
1.2. __bond_release_one -> bond_select_active_slave
  - rtnl, write_lock(curr_slave_lock)
1.3. bond_miimon_commit -> bond_select_active_slave
  - rtnl, write_lock(curr_slave_lock)
1.4. bond_loadbalance_arp_mon -> bond_select_active_slave
  - rtnl, write_lock(curr_slave_lock)
1.5. bond_ab_arp_commit -> bond_select_active_slave
  - rtnl, write_lock(curr_slave_lock)
1.6. bond_slave_netdev_event -> bond_select_active_slave
  - rtnl, write_lock(curr_slave_lock)
1.7. bond_options.c (all callers)
  - rtnl, write_lock(curr_slave_lock)


Almost all callers of slave_update_arr() currently have rtnl acquired, but 
there's 1 troubling caller: bond_3ad_state_machine_handler() which is called 
from a workqueue. Now if we're able to execute anything with that workqueue, we 
have a race condition, good candidates are all options which don't acquire 
write_lock(bond->lock), I think the only one that can call 
bond_slave_update_arr() of those is primary_reselect right now.
So if you come up with some way to deal with that, you probably can use only 
rtnl for syncing the array and simplify this.
Again I might be wrong since this is done only via grepping :-)

Cheers,
  Nik

next prev parent reply	other threads:[~2014-09-05 11:26 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-03 21:47 [PATCH net-next 2/2] bonding: Simplify the xmit function for modes that use xmit_hash Mahesh Bandewar
2014-09-03 22:51 ` Jay Vosburgh
2014-09-04  0:57   ` Mahesh Bandewar
2014-09-04 13:16 ` Nikolay Aleksandrov
2014-09-05  0:10   ` Mahesh Bandewar
2014-09-05 11:26     ` Nikolay Aleksandrov [this message]
2014-09-05 11:49       ` Nikolay Aleksandrov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54099DD3.20109@redhat.com \
    --to=nikolay@redhat.com \
    --cc=andy@greyhouse.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=j.vosburgh@gmail.com \
    --cc=maheshb@google.com \
    --cc=maze@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=vfalico@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).