From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jiri Pirko
Subject: Re: [patch net/stable v2] br: fix use of ->rx_handler_data in code executed on non-rx_handler path
Date: Mon, 9 Dec 2013 10:36:03 +0100
Message-ID: <20131209093603.GA2449@minipsycho.orion>
References: <1386257257-25258-1-git-send-email-jiri@resnulli.us>
 <20131206.154321.1673089712073824539.davem@davemloft.net>
 <20131206131028.41b5b8b1@nehalam.linuxnetplumber.net>
 <20131207085105.GA2496@minipsycho.orion>
 <52A372B5.8010602@gmail.com>
 <20131207200720.GB2516@minipsycho.orion>
 <52A525D6.7050409@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Stephen Hemminger, David Miller, netdev@vger.kernel.org,
 jtluka@redhat.com, zhiguohong@tencent.com,
 bridge@lists.linux-foundation.org, edumazet@google.com,
 laine@redhat.com, mst@redhat.com
To: Vlad Yasevich
Return-path:
Received: from mail-ee0-f41.google.com ([74.125.83.41]:44543 "EHLO
 mail-ee0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1760322Ab3LIJgI (ORCPT ); Mon, 9 Dec 2013 04:36:08 -0500
Received: by mail-ee0-f41.google.com with SMTP id t10so1434639eei.28
 for ; Mon, 09 Dec 2013 01:36:06 -0800 (PST)
Content-Disposition: inline
In-Reply-To: <52A525D6.7050409@gmail.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

Mon, Dec 09, 2013 at 03:07:18AM CET, vyasevich@gmail.com wrote:
>On 12/07/2013 03:07 PM, Jiri Pirko wrote:
>> Sat, Dec 07, 2013 at 08:10:45PM CET, vyasevich@gmail.com wrote:
>>> On 12/07/2013 03:51 AM, Jiri Pirko wrote:
>>>> Fri, Dec 06, 2013 at 10:10:28PM CET, stephen@networkplumber.org wrote:
>>>>> On Fri, 06 Dec 2013 15:43:21 -0500 (EST)
>>>>> David Miller wrote:
>>>>>
>>>>>> From: Jiri Pirko
>>>>>> Date: Thu, 5 Dec 2013 16:27:37 +0100
>>>>>>
>>>>>>> br_stp_rcv() is reached by non-rx_handler path. That means there is no
>>>>>>> guarantee that dev is bridge port and therefore simple NULL check of
>>>>>>> ->rx_handler_data is not enough.
>>>>>>> There is need to check if dev is really
>>>>>>> bridge port and since only rcu read lock is held here, do it by checking
>>>>>>> ->rx_handler pointer.
>>>>>>>
>>>>>>> Note that synchronize_net() in netdev_rx_handler_unregister() ensures
>>>>>>> this approach as valid.
>>>>>>>
>>>>>>> Introduced originally by:
>>>>>>> commit f350a0a87374418635689471606454abc7beaa3a
>>>>>>> "bridge: use rx_handler_data pointer to store net_bridge_port pointer"
>>>>>>>
>>>>>>> Fixed but not in the best way by:
>>>>>>> commit b5ed54e94d324f17c97852296d61a143f01b227a
>>>>>>> "bridge: fix RCU races with bridge port"
>>>>>>>
>>>>>>> Reintroduced by:
>>>>>>> commit 716ec052d2280d511e10e90ad54a86f5b5d4dcc2
>>>>>>> "bridge: fix NULL pointer deref of br_port_get_rcu"
>>>>>>>
>>>>>>> Please apply to stable trees as well. Thanks.
>>>>>>>
>>>>>>> RH bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=1025770
>>>>>>>
>>>>>>> Reported-by: Laine Stump
>>>>>>> Debugged-by: Michael S. Tsirkin
>>>>>>> Signed-off-by: Michael S. Tsirkin
>>>>>>> Signed-off-by: Jiri Pirko
>>>>>>> ---
>>>>>>> v1->v2: moved br_port_get_check_rcu definition below br_handle_frame definition
>>>>>>
>>>>>> Applied and queued up for -stable, thanks Jiri.
>>>>>
>>>>> How come you ignored my simpler fix, that used the existing logic.
>>>>> I don't like introducing this especially into the stable; much prefer
>>>>> to go back to testing the flag as was being done before.
>>>>
>>>> Although your patch is technically sane, it depends on rtnl indirectly.
>>>
>>> Pardon my ignorance, but I've been staring at this and I can't for
>>> the life of me see the dependency.
>>>
>>> The IFF_BRIDGE_PORT flag is set after the rx_handler is registered,
>>> so we are safe there. The rcu primitives will guarantee that the flag
>>> will be set by the time rx_handler and rx_handler_data are set.
>>>
>>> The flag is cleared before rx_handler is unregistered, so it is
>>> still valid to check for it in stp code.
>>> Once the flag is cleared
>>> we may still have a valid rx_handler during the rcu grace period, but
>>> will still avoid doing processing.
>>>
>>> So, where is the dependency on the rtnl?
>>
>>
>> Imagine br would release the netdev and some other rx_handler user would
>> enslave the same netdev. These two events would happen between
>> IFF_BRIDGE_PORT flag check and rx_handler_data get. That is what
>> rtnl_lock prevents from happening.
>
>I don't think this can happen. Inside br_stp_rcv(), we are in an rcu
>critical section. The release of the netdev (rx_handler unregister)
>forces us to wait until synchronize_net() completes. So, if under
>rcu we checked the flag and it's on, the rx_handler will not change for
>the duration of that rcu section and we can safely process the packet
>even if the port is in the middle of going away. How does the race
>you describe happen?

You are right. It cannot happen.

>
>The reason I ask, is that we check priv_flags under rcu in other
>places to make sure that the device we are passing data to can handle
>it. If it's not safe, then those other places are vulnerable too.
>It doesn't seem to me that it is unsafe.
>
>Thanks
>-vlad
>
>>
>>>
>>> Thanks
>>> -vlad
>>>
>>>> My patch depends on rcu locking and synchronize_rcu which is direct.
>>>> Therefore I think it is more appropriate.
>>>>
>>>> Jiri
>>>>
>>>>
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>
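
For reference, the three checks discussed in this thread can be compared in a small userspace model. This is only a sketch under stated assumptions: every model_* name, the MODEL_IFF_BRIDGE_PORT value and the struct layout are hypothetical stand-ins, not the kernel definitions, and the model ignores RCU and concurrency entirely. It only shows which field each check looks at, and why a bare NULL test of ->rx_handler_data accepts a device that some other rx_handler user has enslaved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the IFF_BRIDGE_PORT private flag (value assumed). */
#define MODEL_IFF_BRIDGE_PORT 0x1u

struct model_port { int port_no; };

struct model_netdev;
typedef int (*model_rx_handler_t)(struct model_netdev *dev);

/* Hypothetical stand-in for the net_device fields named in the thread. */
struct model_netdev {
    unsigned int priv_flags;
    model_rx_handler_t rx_handler;
    void *rx_handler_data;
};

/* Stand-ins for the bridge rx_handler and for some other rx_handler user. */
static int model_br_handle_frame(struct model_netdev *dev) { (void)dev; return 0; }
static int model_other_handle_frame(struct model_netdev *dev) { (void)dev; return 0; }

/* Naive check: trusts ->rx_handler_data as soon as it is non-NULL. */
static void *port_get_by_null_check(const struct model_netdev *dev)
{
    assert(dev != NULL);
    return dev->rx_handler_data;
}

/* Flag-based check: only trust the data if the bridge-port flag is set. */
static struct model_port *port_get_by_flag(const struct model_netdev *dev)
{
    assert(dev != NULL);
    return (dev->priv_flags & MODEL_IFF_BRIDGE_PORT) ? dev->rx_handler_data : NULL;
}

/* Handler-based check: only trust the data if the bridge handler is installed. */
static struct model_port *port_get_by_handler(const struct model_netdev *dev)
{
    assert(dev != NULL);
    return dev->rx_handler == model_br_handle_frame ? dev->rx_handler_data : NULL;
}

/* Returns 1 if the three checks behave as described in the thread. */
int model_demo(void)
{
    struct model_port p = { 1 };
    int other_private = 0; /* private data owned by the non-bridge handler */

    struct model_netdev bridged = {
        .priv_flags = MODEL_IFF_BRIDGE_PORT,
        .rx_handler = model_br_handle_frame,
        .rx_handler_data = &p,
    };
    struct model_netdev other = {
        .priv_flags = 0,
        .rx_handler = model_other_handle_frame,
        .rx_handler_data = &other_private,
    };
    struct model_netdev plain = { 0, NULL, NULL };

    return port_get_by_flag(&bridged) == &p &&
           port_get_by_handler(&bridged) == &p &&
           /* the naive check wrongly accepts the non-bridge device */
           port_get_by_null_check(&other) != NULL &&
           port_get_by_flag(&other) == NULL &&
           port_get_by_handler(&other) == NULL &&
           port_get_by_flag(&plain) == NULL &&
           port_get_by_handler(&plain) == NULL;
}
```

Calling model_demo() from a test harness exercises all three devices. In this static model the flag-based and handler-based checks agree on every device; the question debated above is only whether they can disagree while a port is being released, which the model does not attempt to reproduce.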