From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 13 Apr 2010 10:33:20 -0700
From: Stephen Hemminger
To: Jay Vosburgh
Cc: Cong Wang, Eric Dumazet, Neil Horman, netdev@vger.kernel.org,
 Andy Gospodarek, bridge@lists.linux-foundation.org,
 linux-kernel@vger.kernel.org, bonding-devel@lists.sourceforge.net,
 Jeff Moyer, Matt Mackall, David Miller
Subject: Re: [Bridge] [Bonding-devel] [v3 Patch 2/3] bridge: make bridge support netpoll
Message-ID: <20100413103320.11a2a4f7@nehalam>
In-Reply-To: <8304.1271177567@death.nxdomain.ibm.com>
References: <20100408062234.4499.17042.sendpatchset@localhost.localdomain>
 <20100408062246.4499.5670.sendpatchset@localhost.localdomain>
 <20100408083710.2b61ee44@nehalam> <4BC2F7E2.7020309@redhat.com>
 <1271068737.16881.18.camel@edumazet-laptop>
 <20100412083842.26d71bda@nehalam> <4BC43214.6030009@redhat.com>
 <8304.1271177567@death.nxdomain.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
List-Id: Linux Ethernet Bridging

On Tue, 13 Apr 2010 09:52:47 -0700
Jay Vosburgh wrote:

> Cong Wang wrote:
>
> >Stephen Hemminger wrote:
> >> On Mon, 12 Apr 2010 12:38:57 +0200
> >> Eric Dumazet wrote:
> >>
> >>> On Monday, 12 April 2010 at 18:37 +0800, Cong Wang wrote:
> >>>> Stephen Hemminger wrote:
> >>>>> There is no protection on dev->priv_flags for SMP access.
> >>>>> It would be better to use a bit value in dev->state if you are
> >>>>> using it as a control flag.
> >>>>>
> >>>>> Then you could use
> >>>>>     if (unlikely(test_and_clear_bit(__IN_NETPOLL, &skb->dev->state)))
> >>>>>         netpoll_send_skb(...)
> >>>>>
> >>>>>
> >>>> Hmm, I think we can't use ->state here; it is not for this kind of purpose,
> >>>> according to its comments.
> >>>>
> >>>> Also, I find other usages of IFF_XXX flags of ->priv_flags are also using
> >>>> &, | to set or clear the flags. So there must be some other things preventing
> >>>> the race...
> >>> Yes, it's RTNL that protects priv_flags changes, hopefully...
> >>>
> >>
> >> The patch was not protecting priv_flags with RTNL.
> >> For example:
> >>
> >>
> >> @@ -308,7 +312,9 @@ static void netpoll_send_skb(struct netp
> >>              tries > 0; --tries) {
> >>             if (__netif_tx_trylock(txq)) {
> >>                 if (!netif_tx_queue_stopped(txq)) {
> >> +                   dev->priv_flags |= IFF_IN_NETPOLL;
> >>                     status = ops->ndo_start_xmit(skb, dev);
> >> +                   dev->priv_flags &= ~IFF_IN_NETPOLL;
> >>                     if (status == NETDEV_TX_OK)
> >>                         txq_trans_update(txq);
> >
> >Hmm, but I checked the bonding case (IFF_BONDING), it doesn't
> >hold rtnl_lock. Strange.
>
>     I looked, and there are a couple of cases in bonding that don't
> have RTNL for adjusting priv_flags (in bond_ab_arp_probe when no slaves
> are up, and a couple of cases in 802.3ad). I think the solution there
> is to move bonding away from priv_flags for some of this (e.g., convert
> bonding to use a frame hook like bridge and macvlan, and greatly
> simplify skb_bond_should_drop), but that's a separate topic.
>
>     The majority of the cases, however, do hold RTNL. Bonding
> generally doesn't have to acquire RTNL itself, since whatever called
> into bonding is holding it already. For example, the slave add and
> remove paths (bond_enslave, bond_release) are called either via sysfs or
> ioctl, both of which acquire RTNL. All of the set and clear operations
> for IFF_BONDING fall into this category; look at bonding_store_slaves
> for an example.
>
>     Bonding does acquire RTNL itself when performing failovers,
> e.g., bond_mii_monitor holds RTNL prior to calling bond_miimon_commit,
> which will change priv_flags.
>

All of this was related to netpoll, and netpoll processing often needs to
occur in hard IRQ context. Therefore netpoll and RTNL (which is a mutex)
really don't mix well. Keep RTNL for what it was meant for: network
reconfiguration.
Don't turn it into a network special BKL.
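
For readers following the thread, here is a minimal userspace sketch of the two flag styles being compared. It assumes C11 <stdatomic.h> as a stand-in for the kernel's set_bit() and test_and_clear_bit(). The names struct demo_dev, DEMO_IN_NETPOLL and the demo_* helpers are hypothetical and exist only for illustration; this is not the patch under review and not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit number standing in for the __IN_NETPOLL bit proposed above. */
enum { DEMO_IN_NETPOLL = 0 };

/* Hypothetical stand-in for the two net_device fields discussed in the thread. */
struct demo_dev {
	atomic_ulong state;       /* only touched with atomic bit operations */
	unsigned long priv_flags; /* plain word: "|=" and "&=" are load, modify, store */
};

void demo_dev_init(struct demo_dev *dev)
{
	atomic_init(&dev->state, 0UL);
	dev->priv_flags = 0UL;
}

/* Userspace analogue of set_bit(): a single atomic OR, no lock required. */
void demo_set_bit(int nr, atomic_ulong *word)
{
	atomic_fetch_or(word, 1UL << nr);
}

/*
 * Userspace analogue of test_and_clear_bit(): atomically clears the bit
 * and reports whether it was set beforehand.
 */
bool demo_test_and_clear_bit(int nr, atomic_ulong *word)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_and(word, ~mask) & mask) != 0;
}

/* Transmit-side check: returns true once per flagged frame, then resets. */
bool demo_xmit_is_netpoll(struct demo_dev *dev)
{
	return demo_test_and_clear_bit(DEMO_IN_NETPOLL, &dev->state);
}

/* Small self-check of the behaviour described above. */
void demo_self_check(void)
{
	struct demo_dev dev;

	demo_dev_init(&dev);
	assert(!demo_xmit_is_netpoll(&dev));       /* nothing flagged yet */
	demo_set_bit(DEMO_IN_NETPOLL, &dev.state); /* netpoll path marks the device */
	assert(demo_xmit_is_netpoll(&dev));        /* transmit path sees the mark */
	assert(!demo_xmit_is_netpoll(&dev));       /* the mark was consumed */
}
```

Because the atomic operations never sleep, this style can be used from contexts where taking a mutex such as RTNL is not an option. By contrast, two CPUs doing "priv_flags |= X" and "priv_flags &= ~Y" on the plain word at the same time can each overwrite the other's update unless some lock serializes them.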