From: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Erez Shitrit <erezsh-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
Or Gerlitz <gerlitz.or-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Subject: Re: [PATCH 7/9] IB/ipoib: fix MCAST_FLAG_BUSY usage
Date: Mon, 02 Mar 2015 10:27:16 -0500 [thread overview]
Message-ID: <1425310036.2354.24.camel@redhat.com> (raw)
In-Reply-To: <54F2DC81.304-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
[-- Attachment #1: Type: text/plain, Size: 7404 bytes --]
On Sun, 2015-03-01 at 11:31 +0200, Erez Shitrit wrote:
> On 2/22/2015 2:27 AM, Doug Ledford wrote:
> > Commit a9c8ba5884 ("IPoIB: Fix usage of uninitialized multicast
> > objects") added a new flag MCAST_JOIN_STARTED, but was not very strict
> > in how it was used. We didn't always initialize the completion struct
> > before we set the flag, and we didn't always call complete on the
> > completion struct from all paths that complete it. And when we did
> > complete it, sometimes we continued to touch the mcast entry after
> > the completion, opening us up to possible use after free issues.
> >
> > This made it less than totally effective, and certainly made its use
> > confusing. And in the flush function we would use the presence of this
> > flag to signal that we should wait on the completion struct, but we never
> > cleared this flag, ever.
> >
> > In order to make things clearer and aid in resolving the rtnl deadlock
> > bug I've been chasing, I cleaned this up a bit.
> >
> > 1) Remove the MCAST_JOIN_STARTED flag entirely
> > 2) Change MCAST_FLAG_BUSY so it now only means a join is in-flight
> > 3) Test mcast->mc directly to see if we have completed
> > ib_sa_join_multicast (using IS_ERR_OR_NULL)
> > 4) Make sure that before setting MCAST_FLAG_BUSY we always initialize
> > the mcast->done completion struct
> > 5) Make sure that before calling complete(&mcast->done), we always clear
> > the MCAST_FLAG_BUSY bit
> > 6) Take the mcast_mutex before we call ib_sa_multicast_join and also
> > take the mutex in our join callback. This forces
> > ib_sa_multicast_join to return and set mcast->mc before we process
> > the callback. This way, our callback can safely clear mcast->mc
> > if there is an error on the join and we will do the right thing as
> > a result in mcast_dev_flush.
> > 7) Because we need the mutex to synchronize mcast->mc, we can no
> > longer call mcast_sendonly_join directly from mcast_send and
> > instead must add sendonly join processing to the mcast_join_task
> > 8) Make MCAST_RUN mean that we have a working mcast subsystem, not that
> > we have a running task. We know when we need to reschedule our
> > join task thread and don't need a flag to tell us.
> > 9) Add a helper for rescheduling the join task thread
> >
> > A number of different races are resolved with these changes. These
> > races existed with the old MCAST_FLAG_BUSY usage, the
> > MCAST_JOIN_STARTED flag was an attempt to address them, and while it
> > helped, a determined effort could still trip things up.
> >
> > One race looks something like this:
> >
> > Thread 1 Thread 2
> > ib_sa_join_multicast (as part of running restart mcast task)
> > alloc member
> > call callback
> > ifconfig ib0 down
> > wait_for_completion
> > callback call completes
> > wait_for_completion in
> > mcast_dev_flush completes
> > mcast->mc is PTR_ERR_OR_NULL
> > so we skip ib_sa_leave_multicast
> > return from callback
> > return from ib_sa_join_multicast
> > set mcast->mc = return from ib_sa_multicast
> >
> > We now have a permanently unbalanced join/leave issue that trips up the
> > refcounting in core/multicast.c
> >
> > Another like this:
> >
> > Thread 1 Thread 2 Thread 3
> > ib_sa_multicast_join
> > ifconfig ib0 down
> > priv->broadcast = NULL
> > join_complete
> > wait_for_completion
> > mcast->mc is not yet set, so don't clear
> > return from ib_sa_join_multicast and set mcast->mc
> > complete
> > return -EAGAIN (making mcast->mc invalid)
> > call ib_sa_multicast_leave
> > on invalid mcast->mc, hang
> > forever
> >
> > By holding the mutex around ib_sa_multicast_join and taking the mutex
> > early in the callback, we force mcast->mc to be valid at the time we
> > run the callback. This allows us to clear mcast->mc if there is an
> > error and the join is going to fail. We do this before we complete
> > the mcast. In this way, mcast_dev_flush always sees consistent state
> > in regards to mcast->mc membership at the time that the
> > wait_for_completion() returns.
> >
> > Signed-off-by: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > ---
> > drivers/infiniband/ulp/ipoib/ipoib.h | 11 +-
> > drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 355 ++++++++++++++++---------
> > 2 files changed, 238 insertions(+), 128 deletions(-)
> >
> > diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
> > index 9ef432ae72e..c79dcd5ee8a 100644
> > --- a/drivers/infiniband/ulp/ipoib/ipoib.h
> > +++ b/drivers/infiniband/ulp/ipoib/ipoib.h
> > @@ -98,9 +98,15 @@ enum {
> >
> > IPOIB_MCAST_FLAG_FOUND = 0, /* used in set_multicast_list */
> > IPOIB_MCAST_FLAG_SENDONLY = 1,
> > - IPOIB_MCAST_FLAG_BUSY = 2, /* joining or already joined */
> > + /*
> > + * For IPOIB_MCAST_FLAG_BUSY
> > + * When set, in flight join and mcast->mc is unreliable
> > + * When clear and mcast->mc IS_ERR_OR_NULL, need to restart or
> > + * haven't started yet
> > + * When clear and mcast->mc is valid pointer, join was successful
> > + */
> > + IPOIB_MCAST_FLAG_BUSY = 2,
> > IPOIB_MCAST_FLAG_ATTACHED = 3,
> > - IPOIB_MCAST_JOIN_STARTED = 4,
> >
> > MAX_SEND_CQE = 16,
> > IPOIB_CM_COPYBREAK = 256,
> > @@ -148,6 +154,7 @@ struct ipoib_mcast {
> >
> > unsigned long created;
> > unsigned long backoff;
> > + unsigned long delay_until;
> >
> > unsigned long flags;
> > unsigned char logcount;
> > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> > index bb1b69904f9..277e7ac7c4d 100644
> > --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> > +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> > @@ -66,6 +66,48 @@ struct ipoib_mcast_iter {
> > unsigned int send_only;
> > };
> >
> > +/*
> > + * This should be called with the mcast_mutex held
> > + */
> > +static void __ipoib_mcast_schedule_join_thread(struct ipoib_dev_priv *priv,
> > + struct ipoib_mcast *mcast,
> > + bool delay)
> > +{
> > + if (!test_bit(IPOIB_MCAST_RUN, &priv->flags))
>
> You don't need the flag IPOIB_MCAST_RUN, it is duplicated of
> IPOIB_FLAG_OPER_UP
> probably, need to be taken from all places (including ipoib.h file).
This is probably true, however I skipped it for this series of patches.
It wasn't a requirement of proper operation, and depending on where in
the patch series you tried to inject this change, it had unintended
negative consequences. In particular, up until patch 7/9,
mcast_restart_task used to do a hand rolled stop thread and a matching
start_thread and so couldn't use FLAG_OPER_UP because we needed
FLAG_OPER_UP to tell us whether or not to restart the thread.
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
next prev parent reply other threads:[~2015-03-02 15:27 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-02-22 0:26 [PATCH 0/9] IB/ipoib: fixup multicast locking issues Doug Ledford
[not found] ` <cover.1424562072.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-02-22 0:26 ` [PATCH 1/9] IB/ipoib: factor out ah flushing Doug Ledford
[not found] ` <b06eb720c2f654f5ecdb72c66f4e89149d1c24ec.1424562072.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-02-26 13:28 ` Erez Shitrit
[not found] ` <54EF1F67.4000001-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-02-26 16:27 ` Doug Ledford
[not found] ` <1424968046.2543.18.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-03-01 6:47 ` Erez Shitrit
[not found] ` <54F2B61C.9080308-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-03-02 15:09 ` Doug Ledford
[not found] ` <1425308967.2354.19.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-03-03 9:59 ` Erez Shitrit
[not found] ` <54F585E9.7070704-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-03-13 8:39 ` Or Gerlitz
[not found] ` <CAJ3xEMgxxHu5BQdADaRe-Grtf4rm1LMfsCRiDyF6ToPdV_62OA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-03-15 18:42 ` Doug Ledford
[not found] ` <3A0A417D-BFE4-475C-BAB3-C3FB1D313022-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-03-16 15:24 ` Erez Shitrit
[not found] ` <5506F5B2.1080900-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-03-16 16:06 ` Doug Ledford
[not found] ` <ADC46FD9-3179-4182-949D-1884C9D31757-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-03-16 16:51 ` Erez Shitrit
2015-03-16 18:00 ` Doug Ledford
2015-02-22 0:27 ` [PATCH 2/9] IB/ipoib: change init sequence ordering Doug Ledford
2015-02-22 0:27 ` [PATCH 3/9] IB/ipoib: Consolidate rtnl_lock tasks in workqueue Doug Ledford
2015-02-22 0:27 ` [PATCH 4/9] IB/ipoib: Make the carrier_on_task race aware Doug Ledford
2015-02-22 0:27 ` [PATCH 5/9] IB/ipoib: Use dedicated workqueues per interface Doug Ledford
[not found] ` <1cfdf15058cea312f07c2907490a1d7300603c40.1424562072.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-02-23 16:48 ` Or Gerlitz
2015-02-22 0:27 ` [PATCH 6/9] IB/ipoib: No longer use flush as a parameter Doug Ledford
2015-02-22 0:27 ` [PATCH 7/9] IB/ipoib: fix MCAST_FLAG_BUSY usage Doug Ledford
[not found] ` <9d657f64ee961ee3b3233520d8b499b234a42bcd.1424562072.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-03-01 9:31 ` Erez Shitrit
[not found] ` <54F2DC81.304-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-03-02 15:27 ` Doug Ledford [this message]
[not found] ` <1425310036.2354.24.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-03-03 9:53 ` Erez Shitrit
2015-02-22 0:27 ` [PATCH 8/9] IB/ipoib: deserialize multicast joins Doug Ledford
[not found] ` <a24ade295dfdd1369aac47a978003569ec190952.1424562072.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-03-01 13:58 ` Erez Shitrit
[not found] ` <54F31AEC.3010001-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-03-02 15:29 ` Doug Ledford
[not found] ` <1425310145.2354.26.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-03-03 9:54 ` Erez Shitrit
2015-02-22 0:27 ` [PATCH 9/9] IB/ipoib: drop mcast_mutex usage Doug Ledford
[not found] ` <767f4c41779db63ce8c6dbba04b21959aba70ef9.1424562072.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-02-23 16:56 ` Or Gerlitz
[not found] ` <CAJ3xEMgLPF9pCwQDy9QyL9fAERJXJRXN2gBj3nhuXUCcbfCMPg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-02-23 17:41 ` Doug Ledford
2015-02-22 21:34 ` [PATCH 0/9] IB/ipoib: fixup multicast locking issues Or Gerlitz
[not found] ` <CAJ3xEMgj=ATKLt0MA67c3WefCrG1hZ59eSrhpD-u_dxLJe2kfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-02-22 21:56 ` Doug Ledford
[not found] ` <1424642176.4847.2.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-02-22 21:57 ` Doug Ledford
2015-03-13 8:41 ` Or Gerlitz
[not found] ` <CAJ3xEMjHrTH_F=zPDsH9A9qRWo=AYN4sgbsdDKV62nzBkB5kXA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-03-15 18:52 ` Doug Ledford
[not found] ` <F42024C5-60A5-4B92-B4AC-4D225E2C0FC3-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-03-31 17:04 ` ira.weiny
[not found] ` <20150331170452.GA6261-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2015-03-31 20:42 ` Or Gerlitz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1425310036.2354.24.camel@redhat.com \
--to=dledford-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
--cc=erezsh-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org \
--cc=erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
--cc=gerlitz.or-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
--cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.