From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ding Tianhong Subject: Re: [PATCH net-next v2 1/6] bonding: simplify and use RCU protection for 3ad xmit path Date: Wed, 04 Sep 2013 22:53:20 +0800 Message-ID: <52274960.5000508@gmail.com> References: <522700D1.5060805@huawei.com> <20130904101823.GO1992@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Ding Tianhong , Jay Vosburgh , Andy Gospodarek , "David S. Miller" , Nikolay Aleksandrov , Netdev To: Veaceslav Falico Return-path: Received: from mail-pd0-f172.google.com ([209.85.192.172]:49226 "EHLO mail-pd0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934791Ab3IDPCi (ORCPT ); Wed, 4 Sep 2013 11:02:38 -0400 Received: by mail-pd0-f172.google.com with SMTP id z10so435720pdj.3 for ; Wed, 04 Sep 2013 08:02:38 -0700 (PDT) In-Reply-To: <20130904101823.GO1992@redhat.com> Sender: netdev-owner@vger.kernel.org List-ID: =E4=BA=8E 2013/9/4 18:18, Veaceslav Falico =E5=86=99=E9=81=93: > On Wed, Sep 04, 2013 at 05:43:45PM +0800, Ding Tianhong wrote: > ...snip... >> +/** >> + * IMPORTANT: bond_first/last_slave_rcu can return NULL in case of=20 >> an empty list >> + * Caller must hold rcu_read_lock >> + */ >> +#define bond_first_slave_rcu(bond) \ >> + list_first_or_null_rcu(&(bond)->slave_list, struct slave, list) >> +#define bond_last_slave_rcu(bond) \ >> + (list_empty(&(bond)->slave_list) ? NULL : \ >> + bond_to_slave_rcu((bond)->slave_list.prev)) > > Here, bond_last_slave_rcu() is racy. The list can be non-empty when > list_empty() is verified, however afterwards it might become empty, w= hen > you call bond_to_slave_rcu(), and thus you'll get > bond_to_slave(bond->slave_list) in the result, which is not a slave. > > Take a look at list_first_or_null_rcu() for a reference. The main ide= a is > that it first gets the ->next pointer, with RCU protection, and then > verifies if it's the list head or not, and if not - it gets the conta= iner > already. This way the ->next pointer won't get away. > > These kind of bugs are really rare, but are *EXTREMELY* hard to debug= =2E Thanks for your response and opinions, but I think your miss something, the slave_list will not changed in the rcu_read_lock, so ,the bugs will= not happen. > >> + >> #define bond_is_first_slave(bond, pos) ((pos)->list.prev =3D=3D=20 >> &(bond)->slave_list) >> #define bond_is_last_slave(bond, pos) ((pos)->list.next =3D=3D=20 >> &(bond)->slave_list) >> >> @@ -93,6 +106,15 @@ >> (bond_is_first_slave(bond, pos) ? bond_last_slave(bond) : \ >> bond_to_slave((pos)->list.prev)) >> >> +/* Since bond_first/last_slave_rcu can return NULL, these can retur= n=20 >> NULL too */ >> +#define bond_next_slave_rcu(bond, pos) \ >> + (bond_is_last_slave(bond, pos) ? bond_first_slave_rcu(bond) : \ >> + bond_to_slave_rcu((pos)->list.next)) >> + >> +#define bond_prev_slave_rcu(bond, pos) \ >> + (bond_is_first_slave(bond, pos) ? bond_last_slave_rcu(bond) : \ >> + bond_to_slave_rcu((pos)->list.prev)) >> + > > These two are also racy. bond_is_last/first_slave() is not rcu-ified,= and > thus you can't rely on it without proper locking. Same ideas apply as= per > bond_first_slave_rcu(). > --=20 refer to the above answer. > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >