From mboxrd@z Thu Jan 1 00:00:00 1970 From: Veaceslav Falico Subject: Re: [PATCH net-next 0/5] bonding: patchset for rcu use in bonding Date: Mon, 21 Oct 2013 15:21:36 +0200 Message-ID: <20131021132136.GE692@redhat.com> References: <5264ECBC.2090208@huawei.com> <20131021091336.GB692@redhat.com> <5264F397.50608@huawei.com> <20131021093549.GC692@redhat.com> <52651ECB.1080901@huawei.com> <20131021124144.GD692@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Cc: Jay Vosburgh , Andy Gospodarek , "David S. Miller" , Nikolay Aleksandrov , Netdev To: Ding Tianhong Return-path: Received: from mx1.redhat.com ([209.132.183.28]:39136 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752626Ab3JUNZw (ORCPT ); Mon, 21 Oct 2013 09:25:52 -0400 Content-Disposition: inline In-Reply-To: <20131021124144.GD692@redhat.com> Sender: netdev-owner@vger.kernel.org List-ID: On Mon, Oct 21, 2013 at 02:41:44PM +0200, Veaceslav Falico wrote: >On Mon, Oct 21, 2013 at 08:32:11PM +0800, Ding Tianhong wrote: >>On 2013/10/21 17:35, Veaceslav Falico wrote: >>>On Mon, Oct 21, 2013 at 05:27:51PM +0800, Ding Tianhong wrote: >>>>On 2013/10/21 17:13, Veaceslav Falico wrote: >>>>>On Mon, Oct 21, 2013 at 04:58:36PM +0800, Ding Tianhong wrote: >>>>>>Hi: >>>>>> >>>>>>The Patch Set will remove the invalid lock for bond work queue and replace it >>>>>>with rtnl lock, as read lock for bond could not protect slave list any more. >>>>> >>>>>rtnl lock is a lot more expensive than bond lock, and not only for bond, >>>>>but for all the networking stack. >>>>> >>>>>Why is the bond->lock invalid? It correctly protects slaves from being >>>>>modified concurrently. >>>>> >>>>>I don't see the point in this patchset. >>>>> >>>> >>>>yes, rtnl lock is a big lock, but I think bond->lock could not protect >>>>bond_for_each_slave any more, am I miss something? >>> >>>Why can't it protect bond_for_each_slave()? >>> >> >>bond_master_upper_dev_link() and bond_upper_dev_unlink() was only in rtnl lock, >>bond_for_each_slave may changed while loop in bond read lock, but it sees that >>nothing serious will happen yet. >>Maybe I miss something. > >Even if it is unsafe to use bond_for_each_slave() while holding bond->lock >- it means that we must protect the list by locking the >bond_upper_dev_(un)link() via bond->lock, but not by removing bond->lock >from everywhere where it is now. And I'm not that sure if it's safe or not. I've quickly looked over the code - yes, theoretically we could race between bond_for_each_slave() that is not rtnl-protected and bond_upper_dev_(un)link(). Your patchset, also, doesn't 'replace' bond->lock with rtnl_lock(), cause everywhere the rtnl_lock() is already present, it just moves it around, while removing the bond->lock. The commit message is wrong and says actually nothing why is it done, how is it done, why it's safe to do so and what do we get in the end. For every patch, and the cover letter is also not an exception. I'd suggest you either: 1) add bond->lock around bond_upper_dev_(un)link() (GFP_ATOMIC might be needed). or 2) add ASSERT_RTNL() to bond_for_each_slave() macro, catch all the offenders and remove the bond->lock from them. Also, I'm not that sure that it's safe to do so - cause one of the slaves (not the slave list) might change, and we might have race conditions there. or 3) move bond_for_each_slave() to bond_for_each_slave_rcu() where appropriate. And in any case write specific commit messages - bonding's code is really old and full of locks that were placed for some reason (and the reason might have gone away long ago, too), so it's really hard to say if the change is safe or not. I'd personally go for either 3) (preferred), or 1). Sorry, I'm a bit tired of going in-depth on your patches. Start either doing patches with commit messages that *prove* me that you're right (I'll, obviously, verify it - but at least I'll know what you're doing, and won't have to figure it out from code), or I'll start explicitly NAKing them. Sorry again, but I don't really have time for that. I didn't have time to review your last patchset (RCUifying the remaining transmit path), and now I can understand nothing from their commit description, except that you've changed bond_for_each_slave() to bond_for_each_slave_rcu() and bond->lock to rcu_read_lock(). I'm not saying that they're wrong, just that they're really hard to understand. So, please, start writing commit messages. > >> >>Ding >> >> >>>> >>>>Ding >>>> >>>>>> >>>>>>Ding Tianhong (5): >>>>>> bonding: remove bond read lock for bond_mii_monitor() >>>>>> bonding: remove bond read lock for bond_alb_monitor() >>>>>> bonding: remove bond read lock for bond_loadbalance_arp_mon() >>>>>> bonding: remove bond read lock for bond_activebackup_arp_mon() >>>>>> bonding: remove bond read lock for bond_3ad_state_machine_handler() >>>>>> >>>>>>drivers/net/bonding/bond_3ad.c | 9 ++-- >>>>>>drivers/net/bonding/bond_alb.c | 20 ++------ >>>>>>drivers/net/bonding/bond_main.c | 100 +++++++++++++--------------------------- >>>>>>3 files changed, 40 insertions(+), 89 deletions(-) >>>>>> >>>>>>-- >>>>>>1.8.2.1 >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>> >>>> >>> >>>. >>> >> >> >-- >To unsubscribe from this list: send the line "unsubscribe netdev" in >the body of a message to majordomo@vger.kernel.org >More majordomo info at http://vger.kernel.org/majordomo-info.html