From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
To: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Yishai Hadas
<yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: ipoib race in multicast flow (was: IPoIB oops)
Date: Wed, 25 Jul 2012 15:19:09 +0300 [thread overview]
Message-ID: <500FE43D.1070107@mellanox.com> (raw)
In-Reply-To: <500EBBF0.3020407-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
On 24/07/2012 18:14, Yishai Hadas wrote:
> Just encountered a kernel oops in IPoIB on upstream kernel 3.5 [...]
> oops happened in ipoib_mcast_join_task.
Roland,
I made a review now on the issue Yishai raised, and took a look on few
related commits to that area, as you wrote in a77a57a1a "IPoIB: Fix
deadlock on RTNL in ipoib_stop()" - commit c8c2afe3 "IPoIB: Use rtnl
lock/unlock when changing device flags" added a call to rtnl_lock() in
ipoib_mcast_join_task(), which is run from the ipoib_workqueue, and
hence we can't flush
the workqueue from the context ipoib_stop is called.
HOWEVER, that very same ipoib_stop() context, which doesn't flush the
workqueue, calls ipoib_mcast_dev_flush which goes and deletes all the
multicast entries, and this flow place
now without any synchronization with possible running instances of
ipoib_mcast_join_task which relate to the SAME ipoib device. Yishai's
test stepped on the broadcast point being null, but
this race can hold for any group which this device is joined to.
What would you suggest here, change the ipoib_stop flow to apply
flushing, and doing
rtnl_trylock() instead of rtnl_lock() in ipoib_mcast_join_task() doesn't
seems to be applicable, hence we don't know who took the lock, arbitrary
context that wants now to apply changes on the device or the ipoib_stop
one.
I see that this code is executed unconditionally whenever the mcast task is
> priv->mcast_mtu =
> IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu));
>
> if (!ipoib_cm_admin_enabled(dev)) {
> rtnl_lock();
> dev_set_mtu(dev, min(priv->mcast_mtu, priv->admin_mtu));
> rtnl_unlock();
> }
maybe if we go wiser and run it only after actually joining the
broadcast group
and not each time could help with solving the race? and/or move the code
that does
the dev_set_mtu call to be executed under another context?
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
prev parent reply other threads:[~2012-07-25 12:19 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-07-24 15:14 IPoIB oops Yishai Hadas
[not found] ` <500EBBF0.3020407-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2012-07-25 12:19 ` Or Gerlitz [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=500FE43D.1070107@mellanox.com \
--to=ogerlitz-vpraknaxozvwk0htik3j/w@public.gmane.org \
--cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
--cc=yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.