From: Veaceslav Falico <vfalico@redhat.com>
To: Ding Tianhong <dingtianhong@huawei.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>,
Andy Gospodarek <andy@greyhouse.net>,
"David S. Miller" <davem@davemloft.net>,
Nikolay Aleksandrov <nikolay@redhat.com>,
Netdev <netdev@vger.kernel.org>
Subject: Re: [PATCH net-next v2 0/5] bonding: patchset for rcu use in bonding
Date: Mon, 28 Oct 2013 02:34:28 +0100 [thread overview]
Message-ID: <20131028013428.GC11209@redhat.com> (raw)
In-Reply-To: <526DBAC8.5000009@huawei.com>
On Mon, Oct 28, 2013 at 09:15:52AM +0800, Ding Tianhong wrote:
>On 2013/10/28 6:53, Veaceslav Falico wrote:
>> On Thu, Oct 24, 2013 at 11:08:35AM +0800, Ding Tianhong wrote:
>>> Hi:
>>>
>>> The slave list will add and del by bond_master_upper_dev_link() and bond_upper_dev_unlink(),
>>> which will call call_netdevice_notifiers(), even it is safe to call it in write bond lock now,
>>> but we can't sure that whether it is safe later, because other drivers may deal NETDEV_CHANGEUPPER
>>> in sleep way, so I didn't admit move the bond_upper_dev_unlink() in write bond lock.
>>>
>>> now the bond_for_each_slave only protect by rtnl_lock(), maybe use bond_for_each_slave_rcu is a good
>>> way to protect slave list for bond, but as a system slow path, it is no need to transform bond_for_each_slave()
>>> to bond_for_each_slave_rcu() in slow path, so in the patchset, I will remove the unused read bond lock
>>> for monitor function, maybe it is a better way, I will wait to accept any relay for it.
>>>
>>> Thanks for the Veaceslav Falico opinion.
>>>
>>> v2: add and modify commit for patchset and patch, it will be the first step for the whole patchset.
>>>
>>> Ding Tianhong (5):
>>> bonding: remove bond read lock for bond_mii_monitor()
>>> bonding: remove bond read lock for bond_alb_monitor()
>>> bonding: remove bond read lock for bond_loadbalance_arp_mon()
>>> bonding: remove bond read lock for bond_activebackup_arp_mon()
>>
>> This patch introduces a regression by boot-test with active backup mode:
>>
>> bond_activebackup_arp_mon() is already not holding the bond->lock, however
>> it might call bond_change_active_slave(), which does (in case of new_active):
>>
>> 912 write_unlock_bh(&bond->curr_slave_lock);
>> 913 read_unlock(&bond->lock);
>> 914 915 call_netdevice_notifiers(NETDEV_BONDING_FAILOVER, bond->dev);
>> 916 if (should_notify_peers)
>> 917 call_netdevice_notifiers(NETDEV_NOTIFY_PEERS,
>> 918 bond->dev);
>> 919 920 read_lock(&bond->lock);
>> 921 write_lock_bh(&bond->curr_slave_lock);
>>
>> so it drops the bond->lock (which wasn't taken previously), and then takes
>> it (without anyone dropping it afterwards).
>>
>> I don't know how to fix it - cause a lot of other callers already take it,
>> and we can't just drop them (we'd race), and we can't remove it here (cause
>> we can't call notifiers while atomic).
>>
>> Which begs the question - was this patchset tested at all?
>>
>> [ 21.796823] =====================================
>> [ 21.796823] [ BUG: bad unlock balance detected! ]
... snip ...
>
>Hi David:
>yes, exactly I miss it and make a mistake, the bond_select_active_slave is still have the protect problem and
>need to be processed, I miss it, sorry, I will send a patch to fix the bug soon.
>
>Hi Veaceslav:
>sorry about the commit, I will pay more attention to the commit and test, thanks for your advise and report the bug,
>I have to admin that I was too careless.
I'll ask you once again, even though it seems that my NACK doesn't block
the patchset - try writing commit messages that actually describe why and
how you do it.
It's, actually, not only for the reviewers - it's also really good for you
- cause while writing the commit log you also understand a lot more what
are you doing, and might spot some corner cases (like this one).
Sorry for being negative, however it costs me *much* more time to review
patches without proper commit messages. I've done it once, twice, several
times more - but that's it, I refuse to spend my time on your skipped
homework.
It might also help to split patches into really small steps - as in - do
only one thing at a time, and describe it. This will help evade bugs *a
lot*. It also helps people who'll bisect it, bugfix it and review it -
because every patch will be a small, well-documented change, instead of a
chunk of code with a description 'lets remove bond->lock'.
Thank you and hope that helps.
>
>
>
>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>> .
>>
>
>
next prev parent reply other threads:[~2013-10-28 1:38 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-10-24 3:08 [PATCH net-next v2 0/5] bonding: patchset for rcu use in bonding Ding Tianhong
2013-10-24 9:35 ` Veaceslav Falico
2013-10-27 20:37 ` David Miller
2013-10-27 21:10 ` Veaceslav Falico
2013-10-27 21:44 ` David Miller
2013-10-27 22:10 ` Veaceslav Falico
2013-10-28 4:00 ` David Miller
2013-10-27 22:53 ` Veaceslav Falico
2013-10-28 1:15 ` Ding Tianhong
2013-10-28 1:34 ` Veaceslav Falico [this message]
2013-10-28 3:02 ` Ding Tianhong
2013-10-28 4:01 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131028013428.GC11209@redhat.com \
--to=vfalico@redhat.com \
--cc=andy@greyhouse.net \
--cc=davem@davemloft.net \
--cc=dingtianhong@huawei.com \
--cc=fubar@us.ibm.com \
--cc=netdev@vger.kernel.org \
--cc=nikolay@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.