From: Ying Xue <ying.xue@windriver.com>
To: Ding Tianhong <dingtianhong@huawei.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>,
Allan Stephens <allan.stephens@windriver.com>,
"David S. Miller" <davem@davemloft.net>,
Netdev <netdev@vger.kernel.org>,
<tipc-discussion@lists.sourceforge.net>,
<wangweidong1@huawei.com>
Subject: Re: [PATCH RESEND] tipc: avoid possible deadlock while enable and disable bearer
Date: Fri, 9 Aug 2013 14:39:59 +0800 [thread overview]
Message-ID: <52048EBF.7070709@windriver.com> (raw)
In-Reply-To: <52048CB9.1030203@huawei.com>
On 08/09/2013 02:31 PM, Ding Tianhong wrote:
> We met lockdep warning when enable and disable the bearer for commands such as:
>
> tipc-config -netid=1234 -addr=1.1.3 -be=eth:eth0
> tipc-config -netid=1234 -addr=1.1.3 -bd=eth:eth0
>
> ---------------------------------------------------
>
> [ 327.693595] ======================================================
> [ 327.693994] [ INFO: possible circular locking dependency detected ]
> [ 327.694519] 3.11.0-rc3-wwd-default #4 Tainted: G O
> [ 327.694882] -------------------------------------------------------
> [ 327.695385] tipc-config/5825 is trying to acquire lock:
> [ 327.695754] (((timer))#2){+.-...}, at: [<ffffffff8105be80>] del_timer_sync+0x0/0xd0
> [ 327.696018]
> [ 327.696018] but task is already holding lock:
> [ 327.696018] (&(&b_ptr->lock)->rlock){+.-...}, at: [<ffffffffa02be58d>] bearer_disable+ 0xdd/0x120 [tipc]
> [ 327.696018]
> [ 327.696018] which lock already depends on the new lock.
> [ 327.696018]
> [ 327.696018]
> [ 327.696018] the existing dependency chain (in reverse order) is:
> [ 327.696018]
> [ 327.696018] -> #1 (&(&b_ptr->lock)->rlock){+.-...}:
> [ 327.696018] [<ffffffff810b3b4d>] validate_chain+0x6dd/0x870
> [ 327.696018] [<ffffffff810b40bb>] __lock_acquire+0x3db/0x670
> [ 327.696018] [<ffffffff810b4453>] lock_acquire+0x103/0x130
> [ 327.696018] [<ffffffff814d65b1>] _raw_spin_lock_bh+0x41/0x80
> [ 327.696018] [<ffffffffa02c5d48>] disc_timeout+0x18/0xd0 [tipc]
> [ 327.696018] [<ffffffff8105b92a>] call_timer_fn+0xda/0x1e0
> [ 327.696018] [<ffffffff8105bcd7>] run_timer_softirq+0x2a7/0x2d0
> [ 327.696018] [<ffffffff8105379a>] __do_softirq+0x16a/0x2e0
> [ 327.696018] [<ffffffff81053a35>] irq_exit+0xd5/0xe0
> [ 327.696018] [<ffffffff81033005>] smp_apic_timer_interrupt+0x45/0x60
> [ 327.696018] [<ffffffff814df4af>] apic_timer_interrupt+0x6f/0x80
> [ 327.696018] [<ffffffff8100b70e>] arch_cpu_idle+0x1e/0x30
> [ 327.696018] [<ffffffff810a039d>] cpu_idle_loop+0x1fd/0x280
> [ 327.696018] [<ffffffff810a043e>] cpu_startup_entry+0x1e/0x20
> [ 327.696018] [<ffffffff81031589>] start_secondary+0x89/0x90
> [ 327.696018]
> [ 327.696018] -> #0 (((timer))#2){+.-...}:
> [ 327.696018] [<ffffffff810b33fe>] check_prev_add+0x43e/0x4b0
> [ 327.696018] [<ffffffff810b3b4d>] validate_chain+0x6dd/0x870
> [ 327.696018] [<ffffffff810b40bb>] __lock_acquire+0x3db/0x670
> [ 327.696018] [<ffffffff810b4453>] lock_acquire+0x103/0x130
> [ 327.696018] [<ffffffff8105bebd>] del_timer_sync+0x3d/0xd0
> [ 327.696018] [<ffffffffa02c5855>] tipc_disc_delete+0x15/0x30 [tipc]
> [ 327.696018] [<ffffffffa02be59f>] bearer_disable+0xef/0x120 [tipc]
> [ 327.696018] [<ffffffffa02be74f>] tipc_disable_bearer+0x2f/0x60 [tipc]
> [ 327.696018] [<ffffffffa02bfb32>] tipc_cfg_do_cmd+0x2e2/0x550 [tipc]
> [ 327.696018] [<ffffffffa02c8c79>] handle_cmd+0x49/0xe0 [tipc]
> [ 327.696018] [<ffffffff8143e898>] genl_family_rcv_msg+0x268/0x340
> [ 327.696018] [<ffffffff8143ed30>] genl_rcv_msg+0x70/0xd0
> [ 327.696018] [<ffffffff8143d4c9>] netlink_rcv_skb+0x89/0xb0
> [ 327.696018] [<ffffffff8143e617>] genl_rcv+0x27/0x40
> [ 327.696018] [<ffffffff8143d21e>] netlink_unicast+0x15e/0x1b0
> [ 327.696018] [<ffffffff8143ddcf>] netlink_sendmsg+0x22f/0x400
> [ 327.696018] [<ffffffff813f7836>] __sock_sendmsg+0x66/0x80
> [ 327.696018] [<ffffffff813f7957>] sock_aio_write+0x107/0x120
> [ 327.696018] [<ffffffff8117f76d>] do_sync_write+0x7d/0xc0
> [ 327.696018] [<ffffffff8117fc56>] vfs_write+0x186/0x190
> [ 327.696018] [<ffffffff811803e0>] SyS_write+0x60/0xb0
> [ 327.696018] [<ffffffff814de852>] system_call_fastpath+0x16/0x1b
> [ 327.696018]
> [ 327.696018] other info that might help us debug this:
> [ 327.696018]
> [ 327.696018] Possible unsafe locking scenario:
> [ 327.696018]
> [ 327.696018] CPU0 CPU1
> [ 327.696018] ---- ----
> [ 327.696018] lock(&(&b_ptr->lock)->rlock);
> [ 327.696018] lock(((timer))#2);
> [ 327.696018] lock(&(&b_ptr->lock)->rlock);
> [ 327.696018] lock(((timer))#2);
> [ 327.696018]
> [ 327.696018] *** DEADLOCK ***
> [ 327.696018]
> [ 327.696018] 5 locks held by tipc-config/5825:
> [ 327.696018] #0: (cb_lock){++++++}, at: [<ffffffff8143e608>] genl_rcv+0x18/0x40
> [ 327.696018] #1: (genl_mutex){+.+.+.}, at: [<ffffffff8143ed66>] genl_rcv_msg+0xa6/0xd0
> [ 327.696018] #2: (config_mutex){+.+.+.}, at: [<ffffffffa02bf889>] tipc_cfg_do_cmd+0x39/ 0x550 [tipc]
> [ 327.696018] #3: (tipc_net_lock){++.-..}, at: [<ffffffffa02be738>] tipc_disable_bearer+ 0x18/0x60 [tipc]
> [ 327.696018] #4: (&(&b_ptr->lock)->rlock){+.-...}, at: [<ffffffffa02be58d>] bearer_disable+0xdd/0x120 [tipc]
> [ 327.696018]
> [ 327.696018] stack backtrace:
> [ 327.696018] CPU: 2 PID: 5825 Comm: tipc-config Tainted: G O 3.11.0-rc3-wwd- default #4
> [ 327.696018] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [ 327.696018] 00000000ffffffff ffff880037fa77a8 ffffffff814d03dd 0000000000000000
> [ 327.696018] ffff880037fa7808 ffff880037fa77e8 ffffffff810b1c4f 0000000037fa77e8
> [ 327.696018] ffff880037fa7808 ffff880037e4db40 0000000000000000 ffff880037e4e318
> [ 327.696018] Call Trace:
> [ 327.696018] [<ffffffff814d03dd>] dump_stack+0x4d/0xa0
> [ 327.696018] [<ffffffff810b1c4f>] print_circular_bug+0x10f/0x120
> [ 327.696018] [<ffffffff810b33fe>] check_prev_add+0x43e/0x4b0
> [ 327.696018] [<ffffffff810b3b4d>] validate_chain+0x6dd/0x870
> [ 327.696018] [<ffffffff81087a28>] ? sched_clock_cpu+0xd8/0x110
> [ 327.696018] [<ffffffff810b40bb>] __lock_acquire+0x3db/0x670
> [ 327.696018] [<ffffffff810b4453>] lock_acquire+0x103/0x130
> [ 327.696018] [<ffffffff8105be80>] ? try_to_del_timer_sync+0x70/0x70
> [ 327.696018] [<ffffffff8105bebd>] del_timer_sync+0x3d/0xd0
> [ 327.696018] [<ffffffff8105be80>] ? try_to_del_timer_sync+0x70/0x70
> [ 327.696018] [<ffffffffa02c5855>] tipc_disc_delete+0x15/0x30 [tipc]
> [ 327.696018] [<ffffffffa02be59f>] bearer_disable+0xef/0x120 [tipc]
> [ 327.696018] [<ffffffffa02be74f>] tipc_disable_bearer+0x2f/0x60 [tipc]
> [ 327.696018] [<ffffffffa02bfb32>] tipc_cfg_do_cmd+0x2e2/0x550 [tipc]
> [ 327.696018] [<ffffffff81218783>] ? security_capable+0x13/0x20
> [ 327.696018] [<ffffffffa02c8c79>] handle_cmd+0x49/0xe0 [tipc]
> [ 327.696018] [<ffffffff8143e898>] genl_family_rcv_msg+0x268/0x340
> [ 327.696018] [<ffffffff8143ed30>] genl_rcv_msg+0x70/0xd0
> [ 327.696018] [<ffffffff8143ecc0>] ? genl_lock+0x20/0x20
> [ 327.696018] [<ffffffff8143d4c9>] netlink_rcv_skb+0x89/0xb0
> [ 327.696018] [<ffffffff8143e608>] ? genl_rcv+0x18/0x40
> [ 327.696018] [<ffffffff8143e617>] genl_rcv+0x27/0x40
> [ 327.696018] [<ffffffff8143d21e>] netlink_unicast+0x15e/0x1b0
> [ 327.696018] [<ffffffff81289d7c>] ? memcpy_fromiovec+0x6c/0x90
> [ 327.696018] [<ffffffff8143ddcf>] netlink_sendmsg+0x22f/0x400
> [ 327.696018] [<ffffffff813f7836>] __sock_sendmsg+0x66/0x80
> [ 327.696018] [<ffffffff813f7957>] sock_aio_write+0x107/0x120
> [ 327.696018] [<ffffffff813fe29c>] ? release_sock+0x8c/0xa0
> [ 327.696018] [<ffffffff8117f76d>] do_sync_write+0x7d/0xc0
> [ 327.696018] [<ffffffff8117fa24>] ? rw_verify_area+0x54/0x100
> [ 327.696018] [<ffffffff8117fc56>] vfs_write+0x186/0x190
> [ 327.696018] [<ffffffff811803e0>] SyS_write+0x60/0xb0
> [ 327.696018] [<ffffffff814de852>] system_call_fastpath+0x16/0x1b
>
> -----------------------------------------------------------------------
>
> The problem is that the tipc_link_delete() will cancel the timer disc_timeout() when
> the b_ptr->lock is hold, but the disc_timeout() still call b_ptr->lock to finish the
> work, so the dead lock occurs.
>
> We should unlock the b_ptr->lock when del the disc_timeout().
>
> Remove link_timeout() still met the same problem, the patch:
>
> http://article.gmane.org/gmane.network.tipc.general/4380
>
> fix the problem, so no need to send patch for fix link_timeout() deadlock warming.
>
> Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
> ---
> net/tipc/bearer.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c
> index cb29ef7..41cad5f 100644
> --- a/net/tipc/bearer.c
> +++ b/net/tipc/bearer.c
> @@ -460,6 +460,7 @@ static void bearer_disable(struct tipc_bearer *b_ptr)
> {
> struct tipc_link *l_ptr;
> struct tipc_link *temp_l_ptr;
> + struct tipc_link_req *temp_req;
>
> pr_info("Disabling bearer <%s>\n", b_ptr->name);
> spin_lock_bh(&b_ptr->lock);
> @@ -468,9 +469,12 @@ static void bearer_disable(struct tipc_bearer *b_ptr)
> list_for_each_entry_safe(l_ptr, temp_l_ptr, &b_ptr->links, link_list) {
> tipc_link_delete(l_ptr);
> }
> - if (b_ptr->link_req)
> - tipc_disc_delete(b_ptr->link_req);
> + temp_req = b_ptr->link_req;
No, here is unsafe for us. Below line should be added:
b_ptr->link_req = NULL;
Otherwise, another oops will be involved!
Regards,
Ying
> spin_unlock_bh(&b_ptr->lock);
> +
> + if (temp_req)
> + tipc_disc_delete(temp_req);
> +
> memset(b_ptr, 0, sizeof(struct tipc_bearer));
> }
>
>
next prev parent reply other threads:[~2013-08-09 6:40 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-09 6:31 [PATCH RESEND] tipc: avoid possible deadlock while enable and disable bearer Ding Tianhong
2013-08-09 6:39 ` Ying Xue [this message]
2013-08-09 9:03 ` Ding Tianhong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52048EBF.7070709@windriver.com \
--to=ying.xue@windriver.com \
--cc=allan.stephens@windriver.com \
--cc=davem@davemloft.net \
--cc=dingtianhong@huawei.com \
--cc=jon.maloy@ericsson.com \
--cc=netdev@vger.kernel.org \
--cc=tipc-discussion@lists.sourceforge.net \
--cc=wangweidong1@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.