All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jay Vosburgh <jay.vosburgh@canonical.com>
To: jin yiting <jinyiting@huawei.com>
Cc: vfalico@gmail.com, andy@greyhouse.net, davem@davemloft.net,
	kuba@kernel.org, netdev@vger.kernel.org, security@kernel.org,
	linux-kernel@vger.kernel.org, xuhanbing@huawei.com,
	wangxiaogang3@huawei.com
Subject: Re: [PATCH] bonding: 3ad: update slave arr after initialize
Date: Mon, 19 Apr 2021 22:04:00 -0700	[thread overview]
Message-ID: <492.1618895040@famine> (raw)
In-Reply-To: <1165c45f-ae7f-48c1-5c65-a879c7bf978a@huawei.com>

jin yiting <jinyiting@huawei.com> wrote:
[...]
>> 	The described issue is a race condition (in that
>> ad_agg_selection_logic clears agg->is_active under mode_lock, but
>> bond_open -> bond_update_slave_arr is inspecting agg->is_active outside
>> the lock).  I don't see how the above change will reliably manage this;
>> the real issue looks to be that bond_update_slave_arr is committing
>> changes to the array (via bond_reset_slave_arr) based on a racy
>> inspection of the active aggregator state while it is in flux.
>>
>> 	Also, the description of the issue says "The best aggregator in
>> ad_agg_selection_logic has not changed, no need to update slave arr,"
>> but the change above does the opposite, and will set update_slave_arr
>> when the aggregator has not changed (update_slave_arr remains false at
>> return of ad_agg_selection_logic).
>>
>> 	I believe I understand the described problem, but I don't see
>> how the patch fixes it.  I suspect (but haven't tested) that the proper
>> fix is to acquire mode_lock in bond_update_slave_arr while calling
>> bond_3ad_get_active_agg_info to avoid conflict with the state machine.
>>
>> 	-J
>>
>> ---
>> 	-Jay Vosburgh, jay.vosburgh@canonical.com
>> .
>>
>
>	Thank you for your reply. The last patch does have redundant
>update slave arr.Thank you for your correction.
>
>        As you said, holding mode_lock in bond_update_slave_arr while
>calling bond_3ad_get_active_agg_info can avoid conflictwith the state
>machine. I have tested this patch, with ifdown/ifup operations for bond or
>slaves.
>
>        But bond_update_slave_arr is expected to hold RTNL only and NO
>other lock. And it have WARN_ON(lockdep_is_held(&bond->mode_lock)); in
>bond_update_slave_arr. I'm not sure that holding mode_lock in
>bond_update_slave_arr while calling bond_3ad_get_active_agg_info is a
>correct action.

	That WARN_ON came up in discussion recently, and my opinion is
that it's incorrect, and is trying to insure bond_update_slave_arr is
safe for a potential sleep when allocating memory.

https://lore.kernel.org/netdev/20210322123846.3024549-1-maximmi@nvidia.com/

	The original authors haven't replied, so I would suggest you
remove the WARN_ON and the surrounding CONFIG_LOCKDEP ifdefs as part of
your patch and replace it with a call to might_sleep.

	The other callers of bond_3ad_get_active_agg_info are generally
obtaining the state in order to report it to user space, so I think it's
safe to leave those calls not holding the mode_lock.  The race is still
there, but the data returned to user space is a snapshot and so may
reflect an incomplete state during a transition.  Further, having the
inspection functions acquire the mode_lock permits user space to spam
the lock with little effort.

	-J

>diff --git a/drivers/net/bonding/bond_main.c
>b/drivers/net/bonding/bond_main.c
>index 74cbbb2..db988e5 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -4406,7 +4406,9 @@ int bond_update_slave_arr(struct bonding *bond,
>struct slave *skipslave)
>    if (BOND_MODE(bond) == BOND_MODE_8023AD) {
>        struct ad_info ad_info;
>
>+       spin_lock_bh(&bond->mode_lock);
>        if (bond_3ad_get_active_agg_info(bond, &ad_info)) {
>+           spin_unlock_bh(&bond->mode_lock);
>            pr_debug("bond_3ad_get_active_agg_info failed\n");
>            /* No active aggragator means it's not safe to use
>             * the previous array.
>@@ -4414,6 +4416,7 @@ int bond_update_slave_arr(struct bonding *bond,
>struct slave *skipslave)
>            bond_reset_slave_arr(bond);
>            goto out;
>        }
>+       spin_unlock_bh(&bond->mode_lock);
>        agg_id = ad_info.aggregator_id;
>    }
>    bond_for_each_slave(bond, slave, iter) {

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

  reply	other threads:[~2021-04-20  5:04 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-16  1:53 [PATCH] bonding: 3ad: update slave arr after initialize jinyiting
2021-04-16  4:28 ` Jay Vosburgh
2021-04-20  3:22   ` jin yiting
2021-04-20  5:04     ` Jay Vosburgh [this message]
2021-04-21  3:34       ` jin yiting

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=492.1618895040@famine \
    --to=jay.vosburgh@canonical.com \
    --cc=andy@greyhouse.net \
    --cc=davem@davemloft.net \
    --cc=jinyiting@huawei.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=security@kernel.org \
    --cc=vfalico@gmail.com \
    --cc=wangxiaogang3@huawei.com \
    --cc=xuhanbing@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.