From: Jay Vosburgh <fubar@us.ibm.com>
To: Cong Wang <cwang@twopensource.com>
Cc: "Thomas Glanzmann" <thomas@glanzmann.de>,
"Eric Dumazet" <eric.dumazet@gmail.com>,
netdev <netdev@vger.kernel.org>,
"Veaceslav Falico" <vfalico@redhat.com>,
andy@greyhouse.net, "Jiří Pírko" <jiri@resnulli.us>
Subject: Re: RTNL: assertion failed at net/core/dev.c (4494) and RTNL: assertion failed at net/core/rtnetlink.c (940)
Date: Thu, 06 Feb 2014 14:07:42 -0800
Message-ID: <31272.1391724462@death.nxdomain>
In-Reply-To: <30988.1391723318@death.nxdomain>
Jay Vosburgh <fubar@us.ibm.com> wrote:
>Cong Wang <cwang@twopensource.com> wrote:
>
>>On Thu, Feb 6, 2014 at 12:51 PM, Thomas Glanzmann <thomas@glanzmann.de> wrote:
>>> Hello,
>>> this morning I checked out Linus' tip and compiled it; after booting, my
>>> dmesg is full of:
>>>
>>> [ 8.944991] RTNL: assertion failed at net/core/dev.c (4494)
>>> [ 8.950640] CPU: 3 PID: 388 Comm: kworker/u24:4 Not tainted 3.14.0-rc1+ #3
>>> [ 8.950642] Hardware name: Supermicro X9SRD-F/X9SRD-F, BIOS 1.0a 10/15/2012
>>> [ 8.950654] Workqueue: bond0 bond_3ad_state_machine_handler [bonding]
>>> [ 8.950658] 0000000000000000 ffff881020c88000 ffffffff8138e219 ffff881020c88000
>>> [ 8.950664] ffffffff812d3091 ffff881023961040 ffffffff812e3132 0000000000000246
>>> [ 8.950670] 0000000000000020 ffff881020ab1be8 0000000020ab1ba8 0000000000000000
>>> [ 8.950675] Call Trace:
>>> [ 8.950686] [<ffffffff8138e219>] ? dump_stack+0x41/0x51
>>> [ 8.950694] [<ffffffff812d3091>] ? netdev_master_upper_dev_get+0x2a/0x4d
>>> [ 8.950699] [<ffffffff812e3132>] ? rtnl_fill_ifinfo+0x2c/0xac4
>>> [ 8.950707] [<ffffffff81072211>] ? print_time.part.5+0x50/0x54
>>> [ 8.950715] [<ffffffff812caf94>] ? __kmalloc_reserve.isra.42+0x2a/0x6d
>>> [ 8.950721] [<ffffffff81102040>] ? ksize+0x12/0x1e
>>> [ 8.950726] [<ffffffff812cb2b7>] ? __alloc_skb+0xb5/0x1a9
>>> [ 8.950731] [<ffffffff812e4626>] ? rtmsg_ifinfo+0x6c/0xd6
>>> [ 8.950739] [<ffffffffa035f4f9>] ? __enable_port.isra.17+0x51/0x5a [bonding]
>>> [ 8.950747] [<ffffffffa0360463>] ? ad_agg_selection_logic+0x3d3/0x3ed [bonding]
>>> [ 8.950754] [<ffffffffa0360d40>] ? bond_3ad_state_machine_handler+0x555/0x918 [bonding]
>>> [ 8.950761] [<ffffffff8104db2d>] ? process_one_work+0x191/0x293
>>> [ 8.950766] [<ffffffff8104dfde>] ? worker_thread+0x121/0x1e7
>>> [ 8.950770] [<ffffffff8104debd>] ? rescuer_thread+0x269/0x269
>>> [ 8.950777] [<ffffffff810527b6>] ? kthread+0x99/0xa1
>>> [ 8.950782] [<ffffffff8105271d>] ? __kthread_parkme+0x59/0x59
>>> [ 8.950789] [<ffffffff8139733c>] ? ret_from_fork+0x7c/0xb0
>>> [ 8.950794] [<ffffffff8105271d>] ? __kthread_parkme+0x59/0x59
>>
>>
>>Hmm, rtmsg_ifinfo() should be called with rtnl lock, but __enable_port()
>>is called with rcu_read_lock(), which means we can't block inside it;
>>therefore we probably should take rtnl lock outside:
>>
>>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>>index cce1f1b..3c09ffa 100644
>>--- a/drivers/net/bonding/bond_3ad.c
>>+++ b/drivers/net/bonding/bond_3ad.c
>>@@ -2065,6 +2065,7 @@ void bond_3ad_state_machine_handler(struct work_struct *work)
>>        struct slave *slave;
>>        struct port *port;
>>
>>+       rtnl_lock();
>>        read_lock(&bond->lock);
>>        rcu_read_lock();
>>
>>@@ -2123,6 +2124,7 @@ void bond_3ad_state_machine_handler(struct work_struct *work)
>> re_arm:
>>        rcu_read_unlock();
>>        read_unlock(&bond->lock);
>>+       rtnl_unlock();
>>        queue_delayed_work(bond->wq, &bond->ad_work, ad_delta_in_ticks);
>> }
>
> That would eliminate the warning, but is suboptimal. Acquiring
>RTNL is not necessary on the vast majority of state machine runs
>(because no state changes take place, i.e., no ports are disabled or
>enabled). The above change would add 10 round trips per second to RTNL,
>which seems excessive.
>
> Also, we cannot unconditionally acquire RTNL in this function,
>as it would race with the call to cancel_delayed_work_sync from
>bond_close (via bond_work_cancel_all).
Thought of one more problem: we can't hold a regular (non-sleeping)
lock, such as bond->lock or the RCU read lock, while calling rtmsg_ifinfo,
as it may sleep in alloc_skb. The rtmsg_ifinfo call has to be made
holding RTNL and nothing else.
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
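
The constraints raised in this message and in the quoted reply (take RTNL
only when a port actually changes state, never block on RTNL from the work
item, and send the notification holding RTNL and nothing else) can be
sketched in userspace as below. This is an illustration under stated
assumptions, not the bonding driver's code or the fix that was eventually
merged: every name in it (rtnl_trylock_model, notify_model,
state_machine_handler_model, pending_notify) is hypothetical, and an atomic
flag stands in for the real RTNL mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-ins only; none of these names are kernel APIs. */
static atomic_bool rtnl_locked;  /* models RTNL being held by someone */
static bool pending_notify;      /* a port changed state; notification owed */
static int rtnl_acquisitions;    /* how often the handler took "RTNL" */
static int notifications_sent;   /* how many notifications went out */

/* Try to take the modelled RTNL without blocking. */
static bool rtnl_trylock_model(void)
{
    bool expected = false;
    return atomic_compare_exchange_strong(&rtnl_locked, &expected, true);
}

static void rtnl_unlock_model(void)
{
    atomic_store(&rtnl_locked, false);
}

/* Stand-in for the notification; it insists that "RTNL" is held. */
static void notify_model(void)
{
    assert(atomic_load(&rtnl_locked));
    notifications_sent++;
}

/*
 * One run of the modelled periodic handler.
 * Returns true when nothing is left to do, false when a notification is
 * still owed and the caller should simply run the handler again later.
 */
static bool state_machine_handler_model(bool port_changed)
{
    /*
     * In the kernel this part would run under the non-sleeping locks,
     * so it only records that a notification is owed.
     */
    if (port_changed)
        pending_notify = true;
    /* Non-sleeping locks are assumed to be dropped at this point. */

    if (!pending_notify)
        return true;   /* common case: no RTNL round trip at all */

    if (!rtnl_trylock_model())
        return false;  /* "RTNL" busy: do not block, retry on the next run */

    rtnl_acquisitions++;
    notify_model();    /* called holding "RTNL" and nothing else */
    pending_notify = false;
    rtnl_unlock_model();
    return true;
}
```

The try-lock is the design choice worth noting: a work item that blocks on
RTNL can deadlock against a path that holds RTNL while waiting for that
work item to finish, which is the cancel_delayed_work_sync concern raised
in the quoted text. Deferring the notification until the non-sleeping locks
are dropped addresses the sleeping-allocation concern.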