public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Wendy Cheng <s.wendy.cheng-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org"
	<roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Subject: Re: [PATCH 4/8] IPoIB: fix mcast_dev_flush/mcast_restart_task race
Date: Wed, 03 Sep 2014 13:49:11 -0400	[thread overview]
Message-ID: <1409766551.26762.2.camel@firewall.xsintricity.com> (raw)
In-Reply-To: <CABgxfbHDuUrdHuLJT2oD07Cy3Ys4_rj-bJ6eR=9+uv0CuPH7_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 6050 bytes --]

On Fri, 2014-08-29 at 14:53 -0700, Wendy Cheng wrote:
> On Tue, Aug 12, 2014 at 4:38 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > Our mcast_dev_flush routine and our mcast_restart_task can race against
> > each other.  In particular, they both hold the priv->lock while
> > manipulating the rbtree and while removing mcast entries from the
> > multicast_list and while adding entries to the remove_list, but they
> > also both drop their locks prior to doing the actual removes.  The
> > mcast_dev_flush routine is run entirely under the rtnl lock and so has
> > at least some locking.  The actual race condition is like this:
> >
> > Thread 1                                Thread 2
> > ifconfig ib0 up
> >   start multicast join for broadcast
> >   multicast join completes for broadcast
> >   start to add more multicast joins
> >     call mcast_restart_task to add new entries
> >                                         ifconfig ib0 down
> >                                           mcast_dev_flush
> >                                             mcast_leave(mcast A)
> >     mcast_leave(mcast A)
> >
> > As mcast_leave calls ib_sa_multicast_leave, and as member in
> > core/multicast.c is ref counted, we run into an unbalanced refcount
> > issue.  To avoid stomping on each others removes, take the rtnl lock
> > specifically when we are deleting the entries from the remove list.
> 
> Isn't "test_and_clear_bit()" atomic so it is unlikely that
> ib_sa_free_multicast() can run multiple times  ?
> 
>     638 static int ipoib_mcast_leave(struct net_device *dev, struct
> ipoib_mcast         *mcast)
>     639 {
>     640         struct ipoib_dev_priv *priv = netdev_priv(dev);
>     641         int ret = 0;
>     642
>     643         if (test_and_clear_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags))
>     644                 ib_sa_free_multicast(mcast->mc);
>     645
>     646         if (test_and_clear_bit(IPOIB_MCAST_FLAG_ATTACHED,
> &mcast->flags)        ) {

This is the original code, and you would be right, but it had other
problems.  Even though test_and_clear_bit is atomic, the original usage
of MCAST_FLAG_BUSY was really that the mcast join had started, there was
no guarantee it had completed (we would set the flag, call
ib_sa_multicast_join, and we could then race with our leave where it was
possible to leave prior to the join completing).

My patches change this around now, and the code only checks for
MCAST_FLAG_BUSY as a double check that we aren't leaving an in-process
join.

> 
> -- Wendy
> 
> >
> > Signed-off-by: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > ---
> >  drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 37 ++++++++++++++++++++++----
> >  1 file changed, 32 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> > index f5e8da530d9..19e3fe75ebf 100644
> > --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> > +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> > @@ -810,7 +810,10 @@ void ipoib_mcast_dev_flush(struct net_device *dev)
> >
> >         spin_unlock_irqrestore(&priv->lock, flags);
> >
> > -       /* seperate between the wait to the leave*/
> > +       /*
> > +        * make sure the in-flight joins have finished before we attempt
> > +        * to leave
> > +        */
> >         list_for_each_entry_safe(mcast, tmcast, &remove_list, list)
> >                 if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags))
> >                         wait_for_completion(&mcast->done);
> > @@ -931,14 +934,38 @@ void ipoib_mcast_restart_task(struct work_struct *work)
> >         netif_addr_unlock(dev);
> >         local_irq_restore(flags);
> >
> > -       /* We have to cancel outside of the spinlock */
> > +       /*
> > +        * make sure the in-flight joins have finished before we attempt
> > +        * to leave
> > +        */
> > +       list_for_each_entry_safe(mcast, tmcast, &remove_list, list)
> > +               if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags))
> > +                       wait_for_completion(&mcast->done);
> > +
> > +       /*
> > +        * We have to cancel outside of the spinlock, but we have to
> > +        * take the rtnl lock or else we race with the removal of
> > +        * entries from the remove list in mcast_dev_flush as part
> > +        * of ipoib_stop() which will call mcast_stop_thread with
> > +        * flush == 1 while holding the rtnl lock, and the
> > +        * flush_workqueue won't complete until this restart_mcast_task
> > +        * completes.  So do like the carrier on task and attempt to
> > +        * take the rtnl lock, but if we can't before the ADMIN_UP flag
> > +        * goes away, then just return and know that the remove list will
> > +        * get flushed later by mcast_dev_flush.
> > +        */
> > +       while (!rtnl_trylock()) {
> > +               if (!test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags))
> > +                       return;
> > +               else
> > +                       msleep(20);
> > +       }
> >         list_for_each_entry_safe(mcast, tmcast, &remove_list, list) {
> >                 ipoib_mcast_leave(mcast->dev, mcast);
> >                 ipoib_mcast_free(mcast);
> >         }
> > -
> > -       if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags))
> > -               ipoib_mcast_start_thread(dev);
> > +       ipoib_mcast_start_thread(dev);
> > +       rtnl_unlock();
> >  }
> >
> >  #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG
> > --
> > 1.9.3
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

  parent reply	other threads:[~2014-09-03 17:49 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-12 23:38 [PATCH 0/8] IPoIB: Fix multiple race conditions Doug Ledford
     [not found] ` <cover.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-12 23:38   ` [PATCH 1/8] IPoIB: Consolidate rtnl_lock tasks in workqueue Doug Ledford
     [not found]     ` <2394730ce5ae1d46522dca04066293dd842edf16.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-15 22:11       ` Wendy Cheng
2014-09-04 14:28       ` Erez Shitrit
2014-08-12 23:38   ` [PATCH 2/8] IPoIB: Make the carrier_on_task race aware Doug Ledford
     [not found]     ` <d05cdce70f1312f35f8be2d14bafd2a06809b137.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-18 23:26       ` Wendy Cheng
     [not found]         ` <CABgxfbE6edfZZ58=mTvhGqWSkCxsik0XuQPR0L-yayze=803cg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-08-19 20:32           ` Doug Ledford
     [not found]             ` <2CC1794B-10CD-49A2-8F5D-0C66A6684DBC-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-19 22:05               ` Wendy Cheng
2014-09-04 12:13       ` Erez Shitrit
     [not found]         ` <5408576C.7040609-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-09-09  7:17           ` Doug Ledford
2014-08-12 23:38   ` [PATCH 3/8] IPoIB: fix MCAST_FLAG_BUSY usage Doug Ledford
     [not found]     ` <a410e80dc5ca7cfa64229bbbf50c1337317e3bd8.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-19 18:08       ` Wendy Cheng
     [not found]         ` <CABgxfbH-Dt3CpxJKwCAZeHTUyupaA9y_WXVXuxgiPMet26PTQw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-08-19 20:28           ` Doug Ledford
     [not found]             ` <902D5BF2-159A-4B31-A87F-7491F3C8057F-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-25 19:51               ` Wendy Cheng
     [not found]                 ` <CABgxfbHOD75vLdZ0TtWZbk8ne3kHd_eWObxPHmoJ-D8DjE0bkw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-08-25 20:03                   ` Doug Ledford
     [not found]                     ` <E3EFCBAC-2D6E-49D3-A556-DBD40701CC5F-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-26 18:04                       ` Wendy Cheng
2014-08-12 23:38   ` [PATCH 4/8] IPoIB: fix mcast_dev_flush/mcast_restart_task race Doug Ledford
     [not found]     ` <e1dbcfc25d8930b281aad12699ebf8fa82485b0e.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-29 21:53       ` Wendy Cheng
     [not found]         ` <CABgxfbHDuUrdHuLJT2oD07Cy3Ys4_rj-bJ6eR=9+uv0CuPH7_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-08-30 15:39           ` Wendy Cheng
     [not found]             ` <CABgxfbGJOdmAn1sokEtisDdnA=r_4mfP=PfqZVsP0cd_oL50dA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-09-03 18:06               ` Doug Ledford
     [not found]                 ` <1409767614.26762.7.camel-v+aXH1h/sVwpzh8Nc7Vzg+562jBIR2Zt@public.gmane.org>
2014-09-03 19:45                   ` Wendy Cheng
2014-09-03 17:49           ` Doug Ledford [this message]
2014-08-12 23:38   ` [PATCH 5/8] IPoIB: change init sequence ordering Doug Ledford
     [not found]     ` <ead9800512c1cb412b86cb1de3868c40f07c72be.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-09-04 12:36       ` Erez Shitrit
2014-08-12 23:38   ` [PATCH 6/8] IPoIB: Use dedicated workqueues per interface Doug Ledford
     [not found]     ` <f7af9c251d722675a549e4a673f46c0f31dfa266.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-20 15:01       ` Estrin, Alex
2014-09-04  6:49       ` Erez Shitrit
     [not found]         ` <54080B6B.8050707-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-09-09  7:09           ` Doug Ledford
     [not found]             ` <540EA7B2.2050805-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-09-10 17:27               ` Erez Shitrit
2014-08-12 23:38   ` [PATCH 7/8] IPoIB: Make ipoib_mcast_stop_thread flush the workqueue Doug Ledford
     [not found]     ` <ae3912431eeacd81d920a405a6bdeb3853791b1a.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-20 15:03       ` Estrin, Alex
2014-08-12 23:38   ` [PATCH 8/8] IPoIB: No longer use flush as a parameter Doug Ledford
     [not found]     ` <ad7bb2b8da52f187cf2978e6a1c77ead32b60de3.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-20 15:04       ` Estrin, Alex
2014-08-15 22:08   ` [PATCH 0/8] IPoIB: Fix multiple race conditions Wendy Cheng
     [not found]     ` <CABgxfbEGfiNGUKT4NJi1GoDRouFznxpgogDt5yr47TLfwDB7hA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-09-03 16:26       ` Wendy Cheng
2014-09-03 13:52   ` Or Gerlitz
     [not found]     ` <54071D14.9040404-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-09-03 18:12       ` Doug Ledford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1409766551.26762.2.camel@firewall.xsintricity.com \
    --to=dledford-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=s.wendy.cheng-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox