From: Steve Wise <swise@opengridcomputing.com>
To: roland@purestorage.com
Cc: Bart Van Assche <bvanassche@acm.org>,
Shawn Bohrer <sbohrer@rgmadvisors.com>,
Or Gerlitz <ogerlitz@mellanox.com>,
Shawn Bohrer <shawn.bohrer@gmail.com>,
Cong Wang <xiyou.wangcong@gmail.com>,
netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
swise@chelsio.com
Subject: Re: rtnl_lock deadlock on 3.10
Date: Thu, 05 Sep 2013 10:14:51 -0500 [thread overview]
Message-ID: <52289FEB.7060606@opengridcomputing.com> (raw)
In-Reply-To: <522856A4.8040800@acm.org>
On 9/5/2013 5:02 AM, Bart Van Assche wrote:
> On 07/30/13 14:54, Steve Wise wrote:
>> On 7/29/2013 6:02 PM, Shawn Bohrer wrote:
>>> On Mon, Jul 15, 2013 at 09:38:19AM -0500, Shawn Bohrer wrote:
>>>> On Wed, Jul 03, 2013 at 08:26:11PM +0300, Or Gerlitz wrote:
>>>>> On 03/07/2013 20:22, Shawn Bohrer wrote:
>>>>>> On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes Frederic Sowa
>>>>>> wrote:
>>>>>>> On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa
>>>>>>> wrote:
>>>>>>>> On Tue, Jul 02, 2013 at 01:38:26PM +0000, Cong Wang wrote:
>>>>>>>>> On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa
>>>>>>>>> <hannes@stressinduktion.org> wrote:
>>>>>>>>>> On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote:
>>>>>>>>>>> I've managed to hit a deadlock at boot a couple times while
>>>>>>>>>>> testing
>>>>>>>>>>> the 3.10 rc kernels. It seems to always happen when my network
>>>>>>>>>>> devices are initializing. This morning I updated to v3.10 and
>>>>>>>>>>> made a
>>>>>>>>>>> few config tweaks and so far I've hit it 4 out of 5 reboots.
>>>>>>>>>>> It looks
>>>>>>>>>>> like most processes are getting stuck on rtnl_lock. Below is
>>>>>>>>>>> a boot
>>>>>>>>>>> log with the soft lockup prints. Please let know if there
>>>>>>>>>>> is any
>>>>>>>>>>> other information I can provide:
>>>>>>>>>> Could you try a build with CONFIG_LOCKDEP enabled?
>>>>>>>>>>
>>>>>>>>> The problem is clear: ib_register_device() is called with
>>>>>>>>> rtnl_lock,
>>>>>>>>> but itself needs device_mutex, however, ib_register_client()
>>>>>>>>> first
>>>>>>>>> acquires device_mutex, then indirectly calls register_netdev()
>>>>>>>>> which
>>>>>>>>> takes rtnl_lock. Deadlock!
>>>>>>>>>
>>>>>>>>> One possible fix is always taking rtnl_lock before taking
>>>>>>>>> device_mutex, something like below:
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/infiniband/core/device.c
>>>>>>>>> b/drivers/infiniband/core/device.c
>>>>>>>>> index 18c1ece..890870b 100644
>>>>>>>>> --- a/drivers/infiniband/core/device.c
>>>>>>>>> +++ b/drivers/infiniband/core/device.c
>>>>>>>>> @@ -381,6 +381,7 @@ int ib_register_client(struct ib_client
>>>>>>>>> *client)
>>>>>>>>> {
>>>>>>>>> struct ib_device *device;
>>>>>>>>> + rtnl_lock();
>>>>>>>>> mutex_lock(&device_mutex);
>>>>>>>>> list_add_tail(&client->list, &client_list);
>>>>>>>>> @@ -389,6 +390,7 @@ int ib_register_client(struct ib_client
>>>>>>>>> *client)
>>>>>>>>> client->add(device);
>>>>>>>>> mutex_unlock(&device_mutex);
>>>>>>>>> + rtnl_unlock();
>>>>>>>>> return 0;
>>>>>>>>> }
>>>>>>>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>>>>>>>> b/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>>>>>>>> index b6e049a..5a7a048 100644
>>>>>>>>> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>>>>>>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>>>>>>>> @@ -1609,7 +1609,7 @@ static struct net_device
>>>>>>>>> *ipoib_add_port(const char *format,
>>>>>>>>> goto event_failed;
>>>>>>>>> }
>>>>>>>>> - result = register_netdev(priv->dev);
>>>>>>>>> + result = register_netdevice(priv->dev);
>>>>>>>>> if (result) {
>>>>>>>>> printk(KERN_WARNING "%s: couldn't register ipoib port
>>>>>>>>> %d; error %d\n",
>>>>>>>>> hca->name, port, result);
>>>>>>>> Looks good to me. Shawn, could you test this patch?
>>>>>>> ib_unregister_device/ib_unregister_client would need the same
>>>>>>> change,
>>>>>>> too. I have not checked the other ->add() and ->remove()
>>>>>>> functions. Also
>>>>>>> cc'ed linux-rdma@vger.kernel.org, Roland Dreier.
>>>>>> Cong's patch is missing the #include <linux/rtnetlink.h> but
>>>>>> otherwise
>>>>>> I've had 34 successful reboots with no deadlocks which is a good
>>>>>> sign.
>>>>>> It sounds like there are more paths that need to be audited and a
>>>>>> proper patch submitted. I can do more testing later if needed.
>>>>>>
>>>>>> Thanks,
>>>>>> Shawn
>>>>>>
>>>>> Guys, I was a bit busy today looking into that, but I don't think we
>>>>> want the IB core layer (core/device.c) to
>>>>> use rtnl locking which is something that belongs to the network
>>>>> stack.
>>>> Has anymore thought been put into a proper fix for this issue?
>>> I'm no expert in this area but I'm having a hard time seeing a
>>> different solution than the one Cong suggested. Just to be clear the
>>> deadlock I hit was between cxgb3 and the ipoib module, so I've Cc'd
>>> Steve Wise in case he has a better solution from the Chelsio side.
>>
>> I don't know of another way to resolve this. The rtnl lock is used in
>> ipoib and mlx4 already. I think we should go forward with the proposed
>> patch.
>
> (replying to an e-mail of one month ago)
>
> Hello,
>
> It would be appreciated if anyone could report what the current status
> of this issue is. I think a deadlock I ran into with kernels 3.10 and
> 3.11 and PCI pass-through is related to this issue. See also
> http://bugzilla.kernel.org/show_bug.cgi?id=60856 for the lockdep report.
>
> Thanks,
>
> Bart.
Roland, what do you think?
As I've said, I think we should go ahead with using the rtnl lock in the
core. Is there a complete patch available for review? looks like the
original was a partial fix.
Steve.
next prev parent reply other threads:[~2013-09-05 15:14 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-07-01 14:54 rtnl_lock deadlock on 3.10 Shawn Bohrer
2013-07-02 8:28 ` Hannes Frederic Sowa
2013-07-02 13:38 ` Cong Wang
2013-07-03 5:11 ` Hannes Frederic Sowa
2013-07-03 5:33 ` Hannes Frederic Sowa
[not found] ` <20130703053307.GB12615-5j1vdhnGyZutBveJljeh2VPnkB77EeZ12LY78lusg7I@public.gmane.org>
2013-07-03 17:22 ` Shawn Bohrer
[not found] ` <20130703172239.GA3439-/vebjAlq/uFE7V8Yqttd03bhEEblAqRIDbRjUBewulXQT0dZR+AlfA@public.gmane.org>
2013-07-03 17:26 ` Or Gerlitz
2013-07-03 17:26 ` Or Gerlitz
2013-07-15 14:38 ` Shawn Bohrer
2013-07-29 23:02 ` Shawn Bohrer
[not found] ` <20130729230216.GB4396-/vebjAlq/uFE7V8Yqttd03bhEEblAqRIDbRjUBewulXQT0dZR+AlfA@public.gmane.org>
2013-07-30 12:54 ` Steve Wise
[not found] ` <51F7B792.7030803-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2013-09-05 10:02 ` Bart Van Assche
2013-09-05 15:14 ` Steve Wise [this message]
2013-09-05 15:34 ` Shawn Bohrer
[not found] ` <52289FEB.7060606-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2013-09-06 23:19 ` Shawn Bohrer
[not found] ` <20130906231901.GB10419-/vebjAlq/uFE7V8Yqttd03bhEEblAqRIDbRjUBewulXQT0dZR+AlfA@public.gmane.org>
2013-09-09 16:48 ` Steve Wise
[not found] ` <522856A4.8040800-HInyCGIudOg@public.gmane.org>
2013-09-06 22:55 ` Shawn Bohrer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52289FEB.7060606@opengridcomputing.com \
--to=swise@opengridcomputing.com \
--cc=bvanassche@acm.org \
--cc=linux-rdma@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=ogerlitz@mellanox.com \
--cc=roland@purestorage.com \
--cc=sbohrer@rgmadvisors.com \
--cc=shawn.bohrer@gmail.com \
--cc=swise@chelsio.com \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.