* rtnl_mutex deadlock?
@ 2015-08-04 15:48 Linus Torvalds
2015-08-05 5:31 ` Cong Wang
0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2015-08-04 15:48 UTC (permalink / raw)
To: David Miller, Nicolas Dichtel, Thomas Graf, Jiri Pirko,
Scott Feldman, Daniel Borkmann
Cc: Network Development
[-- Attachment #1: Type: text/plain, Size: 1195 bytes --]
Sorry for the spamming of random rtnetlink people, but I just resumed
my laptop at PDX, and networking was dead.
It looks like a deadlock on rtnl_mutex, possibly due to some error
path not releasing the lock. No network op was making any progress,
and as you can see from the attached sysrq-w, it all seems to be hung
in rtnl_lock().
The call trace from NetworkManager looks different from the others,
and looks to me like it might actually be a recursive invocation of
rtnetlink_rcv(), but since I have a fairly light configuration on this
laptop and don't have frame pointers enabled, I'm not sure how
reliable that stack trace is. It might be just stale entries. But if
they aren't stale, then that would certainly explain the deadlock.
I had to reboot the laptop to get it to be usable, and the problem
didn't happen again, so I have no other real debugging info to help.
The only thing special here was that I suspended the machine in one
wireless network, and resumed it in another one. That doesn't sound
very special to me, but it's the only thing remotely different from my
normal suspend/resume cycles that have worked fine during this release
so far.
Linus
[-- Attachment #2: hung-wireless-after-resume --]
[-- Type: application/octet-stream, Size: 26458 bytes --]
[12073.466963] wlp1s0: authenticate with d0:72:dc:1e:1d:3f
[12073.469531] wlp1s0: send auth to d0:72:dc:1e:1d:3f (try 1/3)
[12073.470011] wlp1s0: authenticated
[12073.471114] wlp1s0: associate with d0:72:dc:1e:1d:3f (try 1/3)
[12073.485741] wlp1s0: RX AssocResp from d0:72:dc:1e:1d:3f (capab=0x1 status=0 aid=52)
[12073.486967] wlp1s0: associated
[12073.486997] IPv6: ADDRCONF(NETDEV_CHANGE): wlp1s0: link becomes ready
[12073.492467] cfg80211: Regulatory domain changed to country: US
[12073.492470] cfg80211: DFS Master region: FCC
[12073.492471] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp), (dfs_cac_time)
[12073.492473] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (N/A, 3000 mBm), (N/A)
[12073.492475] cfg80211: (5170000 KHz - 5250000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 1700 mBm), (N/A)
[12073.492476] cfg80211: (5250000 KHz - 5330000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2300 mBm), (0 s)
[12073.492478] cfg80211: (5490000 KHz - 5730000 KHz @ 160000 KHz), (N/A, 2300 mBm), (0 s)
[12073.492479] cfg80211: (5735000 KHz - 5835000 KHz @ 80000 KHz), (N/A, 3000 mBm), (N/A)
[12073.492480] cfg80211: (57240000 KHz - 63720000 KHz @ 2160000 KHz), (N/A, 4000 mBm), (N/A)
[12073.555762] wlp1s0: Limiting TX power to 23 dBm as advertised by d0:72:dc:1e:1d:3f
[12215.003492] wlp1s0: deauthenticated from d0:72:dc:1e:1d:3f (Reason: 2=PREV_AUTH_NOT_VALID)
[12236.694167] sysrq: SysRq : Show Blocked State
[12236.694180] task PC stack pid father
[12236.694209] NetworkManager D 0000000000013b80 0 1047 1 0x00000000
[12236.694218] ffff88003f902640 0000000000000000 ffffffff815d15a9 0000000000000018
[12236.694224] ffff880119538000 ffff88003f902640 ffffffff81a8ff84 00000000ffffffff
[12236.694230] ffffffff81a8ff88 ffff880119c47f00 ffffffff815d133a ffffffff81a8ff80
[12236.694235] Call Trace:
[12236.694250] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694257] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.694263] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694271] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.694280] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.694291] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.694299] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694309] [<ffffffff814f5ad3>] ? rtnl_getlink+0x113/0x190
[12236.694319] [<ffffffff814f202a>] ? rtnetlink_rcv_msg+0x7a/0x210
[12236.694331] [<ffffffff8124565c>] ? sock_has_perm+0x5c/0x70
[12236.694339] [<ffffffff814f1fb0>] ? rtnetlink_rcv+0x30/0x30
[12236.694346] [<ffffffff8150d62c>] ? netlink_rcv_skb+0x9c/0xc0
[12236.694354] [<ffffffff814f1f9f>] ? rtnetlink_rcv+0x1f/0x30
[12236.694360] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694367] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.694376] [<ffffffff810a236f>] ? __wake_up+0x2f/0x50
[12236.694387] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.694396] [<ffffffff814cb05e>] ? ___sys_sendmsg+0x22e/0x240
[12236.694405] [<ffffffff814cab75>] ? ___sys_recvmsg+0x135/0x1a0
[12236.694415] [<ffffffff811a9d12>] ? eventfd_write+0x82/0x210
[12236.694423] [<ffffffff811a0f9e>] ? fsnotify+0x32e/0x4c0
[12236.694429] [<ffffffff8108cb70>] ? wake_up_q+0x60/0x60
[12236.694434] [<ffffffff814cba09>] ? __sys_sendmsg+0x39/0x70
[12236.694440] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.694449] wpa_supplicant D 0000000000013b80 0 1209 1 0x00000000
[12236.694455] ffff88011959e600 fffffff900000000 0000000000000001 0000000000000001
[12236.694460] ffff8800c6ad0000 ffff88011959e600 ffffffff81a8ff84 00000000ffffffff
[12236.694465] ffffffff81a8ff88 0000000000000028 ffffffff815d133a ffffffff81a8ff80
[12236.694471] Call Trace:
[12236.694477] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.694484] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694490] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.694497] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.694503] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.694507] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694511] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.694518] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.694524] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.694529] [<ffffffff814cba09>] ? __sys_sendmsg+0x39/0x70
[12236.694534] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.694550] goa-daemon D 0000000000013b80 0 1496 1 0x00000000
[12236.694555] ffff8800c6950000 fffffff900000000 0000000000000001 0000000000000000
[12236.694560] ffff8800b49dc000 ffff8800c6950000 ffffffff81a8ff84 00000000ffffffff
[12236.694565] ffffffff81a8ff88 0000000000000014 ffffffff815d133a ffffffff81a8ff80
[12236.694570] Call Trace:
[12236.694576] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.694582] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694588] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.694595] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.694600] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.694605] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694609] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.694615] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.694621] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.694629] [<ffffffff8116a459>] ? rw_verify_area+0x49/0xe0
[12236.694635] [<ffffffff8116a569>] ? vfs_read+0x79/0x120
[12236.694640] [<ffffffff8116b1b2>] ? SyS_read+0x62/0x90
[12236.694645] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.694649] mission-control D 0000000000013b80 0 1502 1 0x00000000
[12236.694654] ffff8800c6a94c80 fffffff900000000 0000000000000001 0000000000000000
[12236.694659] ffff8800b4ac8000 ffff8800c6a94c80 ffffffff81a8ff84 00000000ffffffff
[12236.694664] ffffffff81a8ff88 0000000000000014 ffffffff815d133a ffffffff81a8ff80
[12236.694669] Call Trace:
[12236.694675] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.694681] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694687] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.694693] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.694699] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.694703] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694707] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.694713] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.694719] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.694725] [<ffffffff8116a459>] ? rw_verify_area+0x49/0xe0
[12236.694731] [<ffffffff8116a569>] ? vfs_read+0x79/0x120
[12236.694736] [<ffffffff8116b1b2>] ? SyS_read+0x62/0x90
[12236.694741] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.694751] geoclue D 0000000000000000 0 2054 1 0x00000000
[12236.694756] ffff8800c6952640 fffffff900000000 0000000000000001 0000000000000000
[12236.694760] ffff880119724000 ffff8800c6952640 ffffffff81a8ff84 00000000ffffffff
[12236.694766] ffffffff81a8ff88 0000000000000014 ffffffff815d133a ffffffff81a8ff80
[12236.694770] Call Trace:
[12236.694776] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.694782] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694788] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.694794] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.694800] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.694804] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694808] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.694815] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.694821] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.694827] [<ffffffff8116a459>] ? rw_verify_area+0x49/0xe0
[12236.694832] [<ffffffff8116a569>] ? vfs_read+0x79/0x120
[12236.694838] [<ffffffff8116b1b2>] ? SyS_read+0x62/0x90
[12236.694843] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.694850] goa-daemon D 0000000000013b80 0 2145 1 0x00000000
[12236.694854] ffff88011497a640 fffffff900000000 0000000000000001 0000000000000000
[12236.694859] ffff8800aaf64000 ffff88011497a640 ffffffff81a8ff84 00000000ffffffff
[12236.694864] ffffffff81a8ff88 0000000000000014 ffffffff815d133a ffffffff81a8ff80
[12236.694869] Call Trace:
[12236.694875] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.694881] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694887] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.694893] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.694899] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.694903] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694907] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.694914] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.694920] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.694926] [<ffffffff810d36da>] ? SyS_futex+0x6a/0x140
[12236.694931] [<ffffffff810c6afe>] ? ktime_get_ts64+0x3e/0xf0
[12236.694936] [<ffffffff8117d776>] ? SyS_poll+0x56/0x100
[12236.694941] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.694944] mission-control D 0000000000013b80 0 2156 1 0x00000000
[12236.694970] ffff8800b4a9e600 fffffff900000000 0000000000000001 0000000000000000
[12236.694975] ffff8800a1840000 ffff8800b4a9e600 ffffffff81a8ff84 00000000ffffffff
[12236.694980] ffffffff81a8ff88 0000000000000014 ffffffff815d133a ffffffff81a8ff80
[12236.694985] Call Trace:
[12236.694991] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.694997] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.695004] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.695010] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.695016] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.695025] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.695029] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.695035] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.695041] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.695047] [<ffffffff8116a459>] ? rw_verify_area+0x49/0xe0
[12236.695053] [<ffffffff8116a569>] ? vfs_read+0x79/0x120
[12236.695058] [<ffffffff8116b1b2>] ? SyS_read+0x62/0x90
[12236.695063] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.695068] abrt-applet D 0000000000013b80 0 2311 1904 0x00000000
[12236.695073] ffff8800a1ba8cc0 fffffff900000000 0000000000000001 0000000000000000
[12236.695078] ffff88009f440000 ffff8800a1ba8cc0 ffffffff81a8ff84 00000000ffffffff
[12236.695083] ffffffff81a8ff88 0000000000000014 ffffffff815d133a ffffffff81a8ff80
[12236.695088] Call Trace:
[12236.695094] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.695100] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.695106] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.695112] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.695118] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.695122] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.695126] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.695132] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.695138] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.695144] [<ffffffff8116a459>] ? rw_verify_area+0x49/0xe0
[12236.695150] [<ffffffff8116a569>] ? vfs_read+0x79/0x120
[12236.695155] [<ffffffff8116b1b2>] ? SyS_read+0x62/0x90
[12236.695159] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.695163] gnome-software D 0000000000013b80 0 2314 1904 0x00000000
[12236.695167] ffff8800a1bab300 fffffff900000000 0000000000000001 0000000000000000
[12236.695172] ffff88009f448000 ffff8800a1bab300 ffffffff81a8ff84 00000000ffffffff
[12236.695177] ffffffff81a8ff88 0000000000000014 ffffffff815d133a ffffffff81a8ff80
[12236.695182] Call Trace:
[12236.695188] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.695194] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.695203] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.695210] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.695215] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.695219] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.695223] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.695230] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.695236] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.695242] [<ffffffff8116a459>] ? rw_verify_area+0x49/0xe0
[12236.695247] [<ffffffff8116a569>] ? vfs_read+0x79/0x120
[12236.695253] [<ffffffff8116b1b2>] ? SyS_read+0x62/0x90
[12236.695257] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.695262] deja-dup-monito D 0000000000013b80 0 2319 1904 0x00000000
[12236.695266] ffff8800a1b62640 fffffff900000000 0000000000000001 0000000000000000
[12236.695271] ffff88009f528000 ffff8800a1b62640 ffffffff81a8ff84 00000000ffffffff
[12236.695276] ffffffff81a8ff88 0000000000000014 ffffffff815d133a ffffffff81a8ff80
[12236.695281] Call Trace:
[12236.695287] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.695293] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.695299] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.695305] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.695311] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.695315] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.695319] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.695325] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.695332] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.695341] [<ffffffff8116a459>] ? rw_verify_area+0x49/0xe0
[12236.695347] [<ffffffff8116a569>] ? vfs_read+0x79/0x120
[12236.695352] [<ffffffff8116b1b2>] ? SyS_read+0x62/0x90
[12236.695357] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.695366] evolution-calen D 0000000000000000 0 2435 2375 0x00000000
[12236.695371] ffff880114bf8cc0 fffffff900000000 0000000000000001 0000000000000000
[12236.695376] ffff880097c4c000 ffff880114bf8cc0 ffffffff81a8ff84 00000000ffffffff
[12236.695381] ffffffff81a8ff88 0000000000000014 ffffffff815d133a ffffffff81a8ff80
[12236.695386] Call Trace:
[12236.695392] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.695397] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.695403] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.695409] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.695415] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.695419] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.695423] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.695430] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.695436] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.695442] [<ffffffff8116a459>] ? rw_verify_area+0x49/0xe0
[12236.695447] [<ffffffff8116a569>] ? vfs_read+0x79/0x120
[12236.695452] [<ffffffff8116b1b2>] ? SyS_read+0x62/0x90
[12236.695457] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.695462] evolution-calen D 0000000000000000 0 2468 2375 0x00000000
[12236.695466] ffff880097c7f2c0 fffffff900000000 0000000000000001 0000000000000000
[12236.695471] ffff880097d00000 ffff880097c7f2c0 ffffffff81a8ff84 00000000ffffffff
[12236.695476] ffffffff81a8ff88 0000000000000014 ffffffff815d133a ffffffff81a8ff80
[12236.695481] Call Trace:
[12236.695486] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.695492] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.695498] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.695504] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.695510] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.695514] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.695518] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.695524] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.695530] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.695536] [<ffffffff8116a459>] ? rw_verify_area+0x49/0xe0
[12236.695542] [<ffffffff8116a569>] ? vfs_read+0x79/0x120
[12236.695547] [<ffffffff8116b1b2>] ? SyS_read+0x62/0x90
[12236.695552] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.695556] evolution-addre D 0000000000000000 0 2489 2469 0x00000000
[12236.695560] ffff880114ad0cc0 fffffff900000000 0000000000000001 0000000000000000
[12236.695565] ffff880097cb0000 ffff880114ad0cc0 ffffffff81a8ff84 00000000ffffffff
[12236.695570] ffffffff81a8ff88 0000000000000014 ffffffff815d133a ffffffff81a8ff80
[12236.695575] Call Trace:
[12236.695581] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.695587] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.695593] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.695599] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.695605] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.695608] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.695612] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.695619] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.695625] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.695631] [<ffffffff8116a459>] ? rw_verify_area+0x49/0xe0
[12236.695636] [<ffffffff8116a569>] ? vfs_read+0x79/0x120
[12236.695642] [<ffffffff8116b1b2>] ? SyS_read+0x62/0x90
[12236.695646] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.695651] kworker/0:1 D 0000000000013b80 0 28149 2 0x00000000
[12236.695663] Workqueue: ipv6_addrconf addrconf_dad_work
[12236.695666] ffff8800b4ab8cc0 ffff88011fa13b80 0000000000000000 0000000200000000
[12236.695671] ffff880013ec8000 ffff8800b4ab8cc0 ffffffff81a8ff84 00000000ffffffff
[12236.695676] ffffffff81a8ff88 0000000000000000 ffffffff815d133a ffffffff81a8ff80
[12236.695681] Call Trace:
[12236.695687] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.695692] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.695698] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.695704] [<ffffffff8109c6cc>] ? pick_next_task_fair+0x16c/0x900
[12236.695710] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.695716] [<ffffffff8158e6e5>] ? addrconf_dad_work+0x25/0x2f0
[12236.695724] [<ffffffff8107fb77>] ? process_one_work+0x127/0x3f0
[12236.695729] [<ffffffff8107fe82>] ? worker_thread+0x42/0x490
[12236.695735] [<ffffffff8107fe40>] ? process_one_work+0x3f0/0x3f0
[12236.695740] [<ffffffff8108501c>] ? kthread+0xbc/0xe0
[12236.695745] [<ffffffff81084f60>] ? kthread_worker_fn+0x160/0x160
[12236.695750] [<ffffffff815d4b5f>] ? ret_from_fork+0x3f/0x70
[12236.695754] [<ffffffff81084f60>] ? kthread_worker_fn+0x160/0x160
[12236.695760] kworker/3:2 D 0000000000013b80 0 29594 2 0x00000000
[12236.695786] Workqueue: events disconnect_work [cfg80211]
[12236.695788] ffff88001bf5e600 ffffffff812b5ae0 0000000000000003 ffff88003fa8d060
[12236.695793] ffff88010a1fc000 ffff88001bf5e600 ffffffff81a8ff84 00000000ffffffff
[12236.695798] ffffffff81a8ff88 0000000000000000 ffffffff815d133a ffffffff81a8ff80
[12236.695803] Call Trace:
[12236.695809] [<ffffffff812b5ae0>] ? sg_free_table+0x70/0x70
[12236.695815] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.695821] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.695827] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.695834] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.695849] [<ffffffffc02ae903>] ? disconnect_work+0x13/0xc0 [cfg80211]
[12236.695856] [<ffffffff8107fb77>] ? process_one_work+0x127/0x3f0
[12236.695861] [<ffffffff8107fe82>] ? worker_thread+0x42/0x490
[12236.695867] [<ffffffff815d0d20>] ? __schedule+0x250/0x840
[12236.695873] [<ffffffff8107fe40>] ? process_one_work+0x3f0/0x3f0
[12236.695877] [<ffffffff8108501c>] ? kthread+0xbc/0xe0
[12236.695882] [<ffffffff81086360>] ? override_creds+0x20/0x20
[12236.695887] [<ffffffff81084f60>] ? kthread_worker_fn+0x160/0x160
[12236.695891] [<ffffffff815d4b5f>] ? ret_from_fork+0x3f/0x70
[12236.695895] [<ffffffff81084f60>] ? kthread_worker_fn+0x160/0x160
[12236.695913] kworker/3:1 D 0000000000013b80 0 5175 2 0x00000000
[12236.695920] Workqueue: events linkwatch_event
[12236.695923] ffff88011ab8cc80 ffff880013f63d38 ffff880013f63d38 0000000000000000
[12236.695928] ffff880013f64000 ffff88011ab8cc80 ffffffff81a8ff84 00000000ffffffff
[12236.695933] ffffffff81a8ff88 0000000000000000 ffffffff815d133a ffffffff81a8ff80
[12236.695938] Call Trace:
[12236.695944] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.695956] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.695962] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.695968] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.695972] [<ffffffff814f6f39>] ? linkwatch_event+0x9/0x30
[12236.695978] [<ffffffff8107fb77>] ? process_one_work+0x127/0x3f0
[12236.695984] [<ffffffff8107fe82>] ? worker_thread+0x42/0x490
[12236.695989] [<ffffffff815d0d20>] ? __schedule+0x250/0x840
[12236.695995] [<ffffffff8107fe40>] ? process_one_work+0x3f0/0x3f0
[12236.695999] [<ffffffff8108501c>] ? kthread+0xbc/0xe0
[12236.696003] [<ffffffff81084f60>] ? kthread_worker_fn+0x160/0x160
[12236.696008] [<ffffffff815d4b5f>] ? ret_from_fork+0x3f/0x70
[12236.696012] [<ffffffff81084f60>] ? kthread_worker_fn+0x160/0x160
[12236.696015] dhclient D 0000000000013b80 0 5205 1047 0x00000000
[12236.696019] ffff8800a1a66600 fffffff900000000 0000000000000001 ffff8800a1a66600
[12236.696025] ffff880092dc0000 ffff8800a1a66600 ffffffff81a8ff84 00000000ffffffff
[12236.696029] ffffffff81a8ff88 0000000000000014 ffffffff815d133a ffffffff81a8ff80
[12236.696034] Call Trace:
[12236.696040] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.696046] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.696052] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.696058] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.696064] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.696068] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.696072] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.696079] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.696086] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.696092] [<ffffffff814ca40e>] ? SYSC_getsockname+0x8e/0xa0
[12236.696099] [<ffffffff814c9e88>] ? sock_map_fd+0x38/0x60
[12236.696102] [<ffffffff814cb4b9>] ? SyS_socket+0x59/0x80
[12236.696107] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.696111] kworker/2:1 D 0000000000013b80 0 5269 2 0x00000000
[12236.696126] Workqueue: events_power_efficient reg_check_chans_work [cfg80211]
[12236.696128] ffff880097ded940 ffff88011fb93b80 0000000200000003 ffff88011fb13b80
[12236.696134] ffff88010a054000 ffff880097ded940 ffffffff81a8ff84 00000000ffffffff
[12236.696139] ffffffff81a8ff88 0000000000000000 ffffffff815d133a ffffffff81a8ff80
[12236.696143] Call Trace:
[12236.696150] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.696155] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.696161] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.696167] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.696178] [<ffffffffc0292623>] ? reg_check_chans_work+0x13/0x240 [cfg80211]
[12236.696184] [<ffffffff8107fb77>] ? process_one_work+0x127/0x3f0
[12236.696190] [<ffffffff8107fe82>] ? worker_thread+0x42/0x490
[12236.696196] [<ffffffff815d0d20>] ? __schedule+0x250/0x840
[12236.696201] [<ffffffff8107fe40>] ? process_one_work+0x3f0/0x3f0
[12236.696205] [<ffffffff8108501c>] ? kthread+0xbc/0xe0
[12236.696210] [<ffffffff81084f60>] ? kthread_worker_fn+0x160/0x160
[12236.696214] [<ffffffff815d4b5f>] ? ret_from_fork+0x3f/0x70
[12236.696218] [<ffffffff81084f60>] ? kthread_worker_fn+0x160/0x160
[12236.696221] DNS Resolver #1 D 0000000000013b80 0 5349 5223 0x00000004
[12236.696226] ffff880054729980 fffffff900000000 0000000000000001 ffff880054729980
[12236.696231] ffff8800a2394000 ffff880054729980 ffffffff81a8ff84 00000000ffffffff
[12236.696236] ffffffff81a8ff88 0000000000000014 ffffffff815d133a ffffffff81a8ff80
[12236.696241] Call Trace:
[12236.696247] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.696252] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.696258] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.696264] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.696270] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.696274] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.696278] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.696285] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.696291] [<ffffffff814cb3d0>] ? SYSC_sendto+0xe0/0x130
[12236.696298] [<ffffffff814ca3da>] ? SYSC_getsockname+0x5a/0xa0
[12236.696303] [<ffffffff8103bff9>] ? __do_page_fault+0x169/0x380
[12236.696308] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[12236.696311] ifconfig D 0000000000013b80 0 5423 5364 0x00000000
[12236.696315] ffff880009393fc0 ffff8800d962fe40 0000000000000246 ffffffff81a4ee40
[12236.696321] ffff88010a040000 ffff880009393fc0 ffffffff81a8ff84 00000000ffffffff
[12236.696326] ffffffff81a8ff88 0000000000000000 ffffffff815d133a ffffffff81a8ff80
[12236.696330] Call Trace:
[12236.696336] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.696342] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.696348] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.696354] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.696359] [<ffffffff814f9704>] ? dev_ioctl+0x2d4/0x560
[12236.696367] [<ffffffff81178af4>] ? do_filp_open+0x84/0xd0
[12236.696373] [<ffffffff814c8796>] ? sock_do_ioctl+0x36/0x40
[12236.696379] [<ffffffff814c8bfc>] ? sock_ioctl+0x1ac/0x250
[12236.696383] [<ffffffff8117b70a>] ? do_vfs_ioctl+0x28a/0x470
[12236.696389] [<ffffffff8124220d>] ? security_file_ioctl+0x3d/0x60
[12236.696393] [<ffffffff8117b95f>] ? SyS_ioctl+0x6f/0x80
[12236.696398] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rtnl_mutex deadlock?
2015-08-04 15:48 rtnl_mutex deadlock? Linus Torvalds
@ 2015-08-05 5:31 ` Cong Wang
2015-08-05 7:43 ` Jiri Pirko
0 siblings, 1 reply; 14+ messages in thread
From: Cong Wang @ 2015-08-05 5:31 UTC (permalink / raw)
To: Linus Torvalds
Cc: David Miller, Nicolas Dichtel, Thomas Graf, Jiri Pirko,
Scott Feldman, Daniel Borkmann, Network Development
On Tue, Aug 4, 2015 at 8:48 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Sorry for the spamming of random rtnetlink people, but I just resumed
> my laptop at PDX, and networking was dead.
>
> It looks like a deadlock on rtnl_mutex, possibly due to some error
> path not releasing the lock. No network op was making any progress,
> and as you can see from the attached sysrq-w, it all seems to be hung
> in rtnl_lock().
>
> The call trace from NetworkManager looks different from the others,
> and looks to me like it might actually be a recursive invocation of
> rtnetlink_rcv(), but since I have a fairly light configuration on this
> laptop and don't have frame pointers enabled, I'm not sure how
> reliable that stack trace is. It might be just stale entries. But if
> they aren't stale, then that would certainly explain the deadlock.
>
You are right. It looks like the kernel received a GETLINK netlink
message from NetworkManager and replied, but accidentally sent the
reply back to itself; something seems wrong with NETLINK_CB(skb).portid.
* Re: rtnl_mutex deadlock?
2015-08-05 5:31 ` Cong Wang
@ 2015-08-05 7:43 ` Jiri Pirko
2015-08-05 8:44 ` Linus Torvalds
0 siblings, 1 reply; 14+ messages in thread
From: Jiri Pirko @ 2015-08-05 7:43 UTC (permalink / raw)
To: Cong Wang
Cc: Linus Torvalds, David Miller, Nicolas Dichtel, Thomas Graf,
Scott Feldman, Daniel Borkmann, Network Development
Wed, Aug 05, 2015 at 07:31:30AM CEST, cwang@twopensource.com wrote:
>On Tue, Aug 4, 2015 at 8:48 AM, Linus Torvalds
><torvalds@linux-foundation.org> wrote:
>> Sorry for the spamming of random rtnetlink people, but I just resumed
>> my laptop at PDX, and networking was dead.
>>
>> It looks like a deadlock on rtnl_mutex, possibly due to some error
>> path not releasing the lock. No network op was making any progress,
>> and as you can see from the attached sysrq-w, it all seems to be hung
>> in rtnl_lock().
>>
>> The call trace from NetworkManager looks different from the others,
>> and looks to me like it might actually be a recursive invocation of
>> rtnetlink_rcv(), but since I have a fairly light configuration on this
>> laptop and don't have frame pointers enabled, I'm not sure how
>> reliable that stack trace is. It might be just stale entries. But if
>> they aren't stale, then that would certainly explain the deadlock.
>>
>
>You are right. It looks like the kernel received a GETLINK netlink
>message from NetworkManager and replied, but accidentally sent the
>reply back to itself; something seems wrong with NETLINK_CB(skb).portid.
Indeed. Most probably, NETLINK_CB(skb).portid got zeroed.
Linus, are you able to reproduce this or is it a one-time issue?
Thanks!
* Re: rtnl_mutex deadlock?
2015-08-05 7:43 ` Jiri Pirko
@ 2015-08-05 8:44 ` Linus Torvalds
2015-08-05 18:59 ` Daniel Borkmann
0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2015-08-05 8:44 UTC (permalink / raw)
To: Jiri Pirko
Cc: Cong Wang, David Miller, Nicolas Dichtel, Thomas Graf,
Scott Feldman, Daniel Borkmann, Network Development
On Wed, Aug 5, 2015 at 9:43 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>
> Indeed. Most probably, NETLINK_CB(skb).portid got zeroed.
>
> Linus, are you able to reproduce this or is it a one-time issue?
I don't think I'm able to reproduce this; it's happened only once so far.
Linus
* Re: rtnl_mutex deadlock?
2015-08-05 8:44 ` Linus Torvalds
@ 2015-08-05 18:59 ` Daniel Borkmann
2015-08-06 0:30 ` Herbert Xu
2015-08-06 5:19 ` Herbert Xu
0 siblings, 2 replies; 14+ messages in thread
From: Daniel Borkmann @ 2015-08-05 18:59 UTC (permalink / raw)
To: Linus Torvalds, Jiri Pirko
Cc: Cong Wang, David Miller, Nicolas Dichtel, Thomas Graf,
Scott Feldman, Network Development, herbert
On 08/05/2015 10:44 AM, Linus Torvalds wrote:
> On Wed, Aug 5, 2015 at 9:43 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Indeed. Most probably, NETLINK_CB(skb).portid got zeroed.
>>
>> Linus, are you able to reproduce this or is it a one-time issue?
>
> I don't think I'm able to reproduce this, it's happened only once so far.
Here's a theory and patch below. Herbert, Thomas, does this make any
sense to you, or at least sound plausible? ;)
I'm not quite sure what's best to return from here, i.e. whether we
propagate -ENOMEM or instead retry over and over again, hoping that the
rehashing has completed (and no new rehashing started in the meantime) ...
The rehashing could take quite some time on large hashtables, and given
that we can also fail with -ENOMEM from rhashtable_insert_rehash() when
we cannot allocate a bucket table, it's probably okay to go with -ENOMEM?
[PATCH net] netlink, rhashtable: fix deadlock when grabbing rtnl_mutex
Linus reports the following deadlock on rtnl_mutex; triggered only
once so far:
[12236.694209] NetworkManager D 0000000000013b80 0 1047 1 0x00000000
[12236.694218] ffff88003f902640 0000000000000000 ffffffff815d15a9 0000000000000018
[12236.694224] ffff880119538000 ffff88003f902640 ffffffff81a8ff84 00000000ffffffff
[12236.694230] ffffffff81a8ff88 ffff880119c47f00 ffffffff815d133a ffffffff81a8ff80
[12236.694235] Call Trace:
[12236.694250] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694257] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.694263] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694271] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.694280] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.694291] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.694299] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694309] [<ffffffff814f5ad3>] ? rtnl_getlink+0x113/0x190
[12236.694319] [<ffffffff814f202a>] ? rtnetlink_rcv_msg+0x7a/0x210
[12236.694331] [<ffffffff8124565c>] ? sock_has_perm+0x5c/0x70
[12236.694339] [<ffffffff814f1fb0>] ? rtnetlink_rcv+0x30/0x30
[12236.694346] [<ffffffff8150d62c>] ? netlink_rcv_skb+0x9c/0xc0
[12236.694354] [<ffffffff814f1f9f>] ? rtnetlink_rcv+0x1f/0x30
[12236.694360] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694367] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.694376] [<ffffffff810a236f>] ? __wake_up+0x2f/0x50
[12236.694387] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.694396] [<ffffffff814cb05e>] ? ___sys_sendmsg+0x22e/0x240
[12236.694405] [<ffffffff814cab75>] ? ___sys_recvmsg+0x135/0x1a0
[12236.694415] [<ffffffff811a9d12>] ? eventfd_write+0x82/0x210
[12236.694423] [<ffffffff811a0f9e>] ? fsnotify+0x32e/0x4c0
[12236.694429] [<ffffffff8108cb70>] ? wake_up_q+0x60/0x60
[12236.694434] [<ffffffff814cba09>] ? __sys_sendmsg+0x39/0x70
[12236.694440] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
It seems plausible so far: the recursive call into rtnetlink_rcv()
looks suspicious. One way this could trigger is if the sender's
NETLINK_CB(skb).portid was wrongly 0 (the portid of the kernel's
rtnetlink socket), so the answer to the rtnl_getlink() request would be
sent to the kernel instead of to the actual user process, thus grabbing
rtnl_mutex twice.
One theory for how we could end up with a NETLINK_CB(skb).portid of 0
on a user space process: we start out from netlink_sendmsg() with an
unbound portid, so we need to do netlink_autobind().
Here, an error of 0 would need to be returned, so that we continue
sending the frame and eventually set NETLINK_CB(skb).portid to 0. That
is, netlink_insert() needs to fail with -EBUSY, which netlink_autobind()
then overwrites with 0.
In order to get to this point, the inner __netlink_insert() must return
-EBUSY, so that we reset the socket's portid to 0 and violate the 2nd
rule documented in d470e3b483dc ("[NETLINK]: Fix two socket hashing bugs."),
where a very similar issue seems to have been fixed.
There's one possibility where the rhashtable backend could in fact return
with -EBUSY. The insert is done via rhashtable_lookup_insert_key(), which
invokes __rhashtable_insert_fast(). From here, we need to trigger the
slow path with rhashtable_insert_rehash(), which can return -EBUSY in
case a rehash of the hashtable is currently already in progress.
This error propagates back to __netlink_insert() and provides us the
needed precondition. Looks like the -EBUSY was first introduced in
ccd57b1bd324 ("rhashtable: Add immediate rehash during insertion"). So,
as -EBUSY must not escape from there, we would need to remap it to a
different error code for user space. As the current rhashtable cannot
take any inserts in that case, it could be mapped to -ENOMEM.
Fixes: ccd57b1bd324 ("rhashtable: Add immediate rehash during insertion")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
net/netlink/af_netlink.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index d8e2e39..1cfd4af 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1096,6 +1096,11 @@ static int netlink_insert(struct sock *sk, u32 portid)
err = __netlink_insert(table, sk);
if (err) {
+ /* Currently, a rehashing of rhashtable might be in progress,
+ * we however must not allow -EBUSY to escape from here.
+ */
+ if (err == -EBUSY)
+ err = -ENOMEM;
if (err == -EEXIST)
err = -EADDRINUSE;
nlk_sk(sk)->portid = 0;
--
1.9.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
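The error-path interaction described in the commit message above can be condensed into a toy Python model. This is purely illustrative: the function names mirror the kernel's, but the bodies are paraphrased from the description, negative errnos stand in for kernel error returns, and the portid value 4711 is an arbitrary illustration, not the kernel's allocation scheme.

```python
EBUSY = -16  # kernel errno convention: negative on failure

def __netlink_insert(rehash_in_progress):
    # Stand-in for the rhashtable insert: rhashtable_lookup_insert_key()
    # can bubble up -EBUSY from rhashtable_insert_rehash() while a
    # rehash is already in flight.
    return EBUSY if rehash_in_progress else 0

def netlink_insert(sk, portid, rehash_in_progress):
    sk["portid"] = portid
    err = __netlink_insert(rehash_in_progress)
    if err:
        sk["portid"] = 0      # error path resets the socket's portid
    return err

def netlink_autobind(sk, rehash_in_progress):
    err = netlink_insert(sk, 4711, rehash_in_progress)
    if err == EBUSY:
        err = 0               # treated as "another thread bound us
                              # concurrently" -- wrong in the rehash case
    return err

sk = {"portid": None}
assert netlink_autobind(sk, rehash_in_progress=True) == 0  # looks successful
assert sk["portid"] == 0  # but the socket now carries the kernel's portid 0
```

With portid 0, any reply to this socket's requests is delivered to the kernel's own rtnetlink socket, which is the recursive rtnetlink_rcv() entry seen in the trace.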
* Re: rtnl_mutex deadlock?
2015-08-05 18:59 ` Daniel Borkmann
@ 2015-08-06 0:30 ` Herbert Xu
2015-08-06 14:50 ` Daniel Borkmann
2015-08-06 5:19 ` Herbert Xu
1 sibling, 1 reply; 14+ messages in thread
From: Herbert Xu @ 2015-08-06 0:30 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Linus Torvalds, Jiri Pirko, Cong Wang, David Miller,
Nicolas Dichtel, Thomas Graf, Scott Feldman, Network Development
On Wed, Aug 05, 2015 at 08:59:07PM +0200, Daniel Borkmann wrote:
>
> Here's a theory and patch below. Herbert, Thomas, does this make any
> sense to you, or at least sound plausible? ;)
It's certainly possible. Whether it's plausible I'm not so sure.
The netlink hashtable is unlimited in size. So it should always
be expanding, not rehashing. The bug you found should only affect
rehashing.
> I'm not quite sure what's best to return from here, i.e. whether we
> propagate -ENOMEM or instead retry over and over again hoping that the
> rehashing completed (and no new rehashing started in the meantime) ...
Please use something other than ENOMEM as it is already heavily
used in this context. Perhaps EOVERFLOW?
We should probably add a WARN_ON_ONCE in rhashtable_insert_rehash,
since two concurrent rehashings indicate something is going
seriously wrong.
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rtnl_mutex deadlock?
2015-08-05 18:59 ` Daniel Borkmann
2015-08-06 0:30 ` Herbert Xu
@ 2015-08-06 5:19 ` Herbert Xu
1 sibling, 0 replies; 14+ messages in thread
From: Herbert Xu @ 2015-08-06 5:19 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Linus Torvalds, Jiri Pirko, Cong Wang, David Miller,
Nicolas Dichtel, Thomas Graf, Scott Feldman, Network Development
On Wed, Aug 05, 2015 at 08:59:07PM +0200, Daniel Borkmann wrote:
>
> Here's a theory and patch below. Herbert, Thomas, does this make any
> sense to you, or at least sound plausible? ;)
Another possibility is the following bug:
https://patchwork.ozlabs.org/patch/503374/
It can cause a use-after-free which may lead to corruption of skb
state, including the cb buffer. Of course it's a long shot.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rtnl_mutex deadlock?
2015-08-06 0:30 ` Herbert Xu
@ 2015-08-06 14:50 ` Daniel Borkmann
2015-08-06 22:39 ` Daniel Borkmann
2015-08-06 23:41 ` Herbert Xu
0 siblings, 2 replies; 14+ messages in thread
From: Daniel Borkmann @ 2015-08-06 14:50 UTC (permalink / raw)
To: Herbert Xu
Cc: Linus Torvalds, Jiri Pirko, Cong Wang, David Miller,
Nicolas Dichtel, Thomas Graf, Scott Feldman, Network Development
On 08/06/2015 02:30 AM, Herbert Xu wrote:
> On Wed, Aug 05, 2015 at 08:59:07PM +0200, Daniel Borkmann wrote:
>>
>> Here's a theory and patch below. Herbert, Thomas, does this make any
>> sense to you, or at least sound plausible? ;)
>
> It's certainly possible. Whether it's plausible I'm not so sure.
> The netlink hashtable is unlimited in size. So it should always
> be expanding, not rehashing. The bug you found should only affect
> rehashing.
>
>> I'm not quite sure what's best to return from here, i.e. whether we
>> propagate -ENOMEM or instead retry over and over again hoping that the
>> rehashing completed (and no new rehashing started in the meantime) ...
>
> Please use something other than ENOMEM as it is already heavily
> used in this context. Perhaps EOVERFLOW?
Okay, I'll do that.
> We should probably add a WARN_ON_ONCE in rhashtable_insert_rehash
>> since two concurrent rehashings indicate something is going
> seriously wrong.
So, if I didn't miss anything, it looks like the following could have
happened: the worker thread, that is rht_deferred_worker(), could itself
trigger the first rehashing, e.g. after shrinking or expanding (or even
when neither happens).
Then, in __rhashtable_insert_fast(), I could trigger an -EBUSY when I'm
really unlucky and exceed the ht->elasticity limit of 16. I would then
end up in rhashtable_insert_rehash(), find out there's already one
ongoing, and thus get -EBUSY back via __netlink_insert().
Perhaps that is what happened? Seems rare, but then it has also only
been seen rarely so far ...
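The two-step failure outlined above can be sketched as a toy model. Only the -EBUSY errno and the elasticity limit of 16 come from the discussion; the function bodies are simplified stand-ins, not kernel code.

```python
EBUSY = -16
ELASTICITY = 16  # max chain length tolerated before forcing a rehash

def rhashtable_insert_rehash(rehash_already_running):
    # Only one rehash may be scheduled at a time; a second attempt
    # while one is in flight fails with -EBUSY.
    return EBUSY if rehash_already_running else 0

def rhashtable_insert_fast(chain_len, rehash_already_running):
    if chain_len <= ELASTICITY:
        return 0  # fast path: chain still within the elasticity limit
    # slow path: chain too long, try to force an immediate rehash
    return rhashtable_insert_rehash(rehash_already_running)

# The unlucky combination: the deferred worker is already rehashing
# *and* an insert lands on a chain longer than 16.
assert rhashtable_insert_fast(17, rehash_already_running=True) == EBUSY
assert rhashtable_insert_fast(17, rehash_already_running=False) == 0
assert rhashtable_insert_fast(3, rehash_already_running=True) == 0
```

Both conditions must coincide for -EBUSY to reach __netlink_insert(), which is why the window is so narrow.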
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rtnl_mutex deadlock?
2015-08-06 14:50 ` Daniel Borkmann
@ 2015-08-06 22:39 ` Daniel Borkmann
2015-08-06 23:42 ` Herbert Xu
2015-08-06 23:41 ` Herbert Xu
1 sibling, 1 reply; 14+ messages in thread
From: Daniel Borkmann @ 2015-08-06 22:39 UTC (permalink / raw)
To: Herbert Xu
Cc: Linus Torvalds, Jiri Pirko, Cong Wang, David Miller,
Nicolas Dichtel, Thomas Graf, Scott Feldman, Network Development
On 08/06/2015 04:50 PM, Daniel Borkmann wrote:
> On 08/06/2015 02:30 AM, Herbert Xu wrote:
>> On Wed, Aug 05, 2015 at 08:59:07PM +0200, Daniel Borkmann wrote:
>>>
>>> Here's a theory and patch below. Herbert, Thomas, does this make any
>>> sense to you, or at least sound plausible? ;)
>>
>> It's certainly possible. Whether it's plausible I'm not so sure.
>> The netlink hashtable is unlimited in size. So it should always
>> be expanding, not rehashing. The bug you found should only affect
>> rehashing.
>>
>>> I'm not quite sure what's best to return from here, i.e. whether we
>>> propagate -ENOMEM or instead retry over and over again hoping that the
>>> rehashing completed (and no new rehashing started in the meantime) ...
>>
>> Please use something other than ENOMEM as it is already heavily
>> used in this context. Perhaps EOVERFLOW?
>
> Okay, I'll do that.
>
>> We should probably add a WARN_ON_ONCE in rhashtable_insert_rehash
>> since two concurrent rehashings indicate something is going
>> seriously wrong.
>
> So, if I didn't miss anything, it looks like the following could have
> happened: the worker thread, that is rht_deferred_worker(), itself could
> trigger the first rehashing, e.g. after shrinking or expanding (or also
> in case none of both happen).
>
> Then, in __rhashtable_insert_fast(), I could trigger an -EBUSY when I'm
> really unlucky and exceed the ht->elasticity limit of 16. I would then
> end up in rhashtable_insert_rehash() to find out there's already one
> ongoing and thus, I'm getting -EBUSY via __netlink_insert().
>
> Perhaps that is what could have happened? Seems rare though, but it was
> also only seen rarely so far ...
Experimenting a bit more: so far, I only managed to let __netlink_insert()
return -EBUSY when either artificially reducing the ht->elasticity limit
a bit or biasing the hash function. That means it would require specific
knowledge of which slot we end up in to overcome the elasticity limit and
thus trigger rehashing. Pretty unlikely, if you ask me. The other thing
I observed: when I used the bind stress test from Thomas' repo and
reduced the number of bind()s, so that the hashtable size frequently
fluctuated in the range of 4 to 256, we would from time to time enter
rhashtable_insert_rehash() on insertions, but the window was probably
too small to trigger an error. In any case, I think the remapping
seems okay.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rtnl_mutex deadlock?
2015-08-06 14:50 ` Daniel Borkmann
2015-08-06 22:39 ` Daniel Borkmann
@ 2015-08-06 23:41 ` Herbert Xu
2015-08-06 23:58 ` Daniel Borkmann
1 sibling, 1 reply; 14+ messages in thread
From: Herbert Xu @ 2015-08-06 23:41 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Linus Torvalds, Jiri Pirko, Cong Wang, David Miller,
Nicolas Dichtel, Thomas Graf, Scott Feldman, Network Development
On Thu, Aug 06, 2015 at 04:50:39PM +0200, Daniel Borkmann wrote:
>
> Then, in __rhashtable_insert_fast(), I could trigger an -EBUSY when I'm
> really unlucky and exceed the ht->elasticity limit of 16. I would then
> end up in rhashtable_insert_rehash() to find out there's already one
> ongoing and thus, I'm getting -EBUSY via __netlink_insert().
Right, so the only way you can trigger this is if you hit a chain
longer than 16 while the number of entries in the table is less than
75% of the size of the table, and there is also an existing resize
or rehash operation in progress.
This should be pretty much impossible.
But if we had a WARN_ON_ONCE there then we'll know for sure.
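For a rough sense of "pretty much impossible": assuming a uniform hash, the occupancy of a single bucket at a 75% load factor is approximately Poisson(0.75), so the tail beyond the elasticity limit of 16 can be estimated. This is an illustrative back-of-the-envelope only, ignoring hash-function quality and adversarial inputs.

```python
import math

# Summing the tail directly (rather than 1 - CDF) avoids catastrophic
# cancellation, since the result is near double-precision epsilon.
lam = 0.75  # expected entries per bucket at a 75% load factor
tail = sum(math.exp(-lam) * lam**k / math.factorial(k)
           for k in range(16, 64))
print(f"P(single chain >= 16) ~= {tail:.2e}")  # on the order of 1e-16
```

Even multiplied by, say, 64K buckets, the chance of any chain reaching 16 stays on the order of 1e-11, consistent with the assessment above.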
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rtnl_mutex deadlock?
2015-08-06 22:39 ` Daniel Borkmann
@ 2015-08-06 23:42 ` Herbert Xu
0 siblings, 0 replies; 14+ messages in thread
From: Herbert Xu @ 2015-08-06 23:42 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Linus Torvalds, Jiri Pirko, Cong Wang, David Miller,
Nicolas Dichtel, Thomas Graf, Scott Feldman, Network Development
On Fri, Aug 07, 2015 at 12:39:47AM +0200, Daniel Borkmann wrote:
>
> window was too small to trigger an error. I think in any case, remapping
> seems okay.
Oh there is no doubt that we need your EBUSY remapping patch.
It's just that it's very unlikely for this to be responsible
for the deadlock that Linus saw.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rtnl_mutex deadlock?
2015-08-06 23:41 ` Herbert Xu
@ 2015-08-06 23:58 ` Daniel Borkmann
2015-08-07 0:00 ` Herbert Xu
0 siblings, 1 reply; 14+ messages in thread
From: Daniel Borkmann @ 2015-08-06 23:58 UTC (permalink / raw)
To: Herbert Xu
Cc: Linus Torvalds, Jiri Pirko, Cong Wang, David Miller,
Nicolas Dichtel, Thomas Graf, Scott Feldman, Network Development
On 08/07/2015 01:41 AM, Herbert Xu wrote:
> On Thu, Aug 06, 2015 at 04:50:39PM +0200, Daniel Borkmann wrote:
>>
>> Then, in __rhashtable_insert_fast(), I could trigger an -EBUSY when I'm
>> really unlucky and exceed the ht->elasticity limit of 16. I would then
>> end up in rhashtable_insert_rehash() to find out there's already one
>> ongoing and thus, I'm getting -EBUSY via __netlink_insert().
>
> Right, so the only way you can trigger this is if you hit a chain
> longer than 16 and the number of entries in the table is less than
> 75% the size of the table, as well as there being an existing resize
> or rehash operation.
>
> This should be pretty much impossible.
>
> But if we had a WARN_ON_ONCE there then we'll know for sure.
Looks like we had a WARN_ON() in rhashtable_insert_rehash() before, but
it was removed in a87b9ebf1709 ("rhashtable: Do not schedule more than
one rehash if we can't grow further"). Do you want to re-add a WARN_ON_ONCE()?
Thanks,
Daniel
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rtnl_mutex deadlock?
2015-08-06 23:58 ` Daniel Borkmann
@ 2015-08-07 0:00 ` Herbert Xu
2015-08-08 17:22 ` Thomas Graf
0 siblings, 1 reply; 14+ messages in thread
From: Herbert Xu @ 2015-08-07 0:00 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Linus Torvalds, Jiri Pirko, Cong Wang, David Miller,
Nicolas Dichtel, Thomas Graf, Scott Feldman, Network Development
On Fri, Aug 07, 2015 at 01:58:15AM +0200, Daniel Borkmann wrote:
>
> Looks like we had a WARN_ON() in rhashtable_insert_rehash() before, but
> was removed in a87b9ebf1709 ("rhashtable: Do not schedule more than one
> rehash if we can't grow further"). Do you want to re-add a WARN_ON_ONCE()?
I think so. Thomas?
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rtnl_mutex deadlock?
2015-08-07 0:00 ` Herbert Xu
@ 2015-08-08 17:22 ` Thomas Graf
0 siblings, 0 replies; 14+ messages in thread
From: Thomas Graf @ 2015-08-08 17:22 UTC (permalink / raw)
To: Herbert Xu
Cc: Daniel Borkmann, Linus Torvalds, Jiri Pirko, Cong Wang,
David Miller, Nicolas Dichtel, Scott Feldman, Network Development
On 08/07/15 at 08:00am, Herbert Xu wrote:
> On Fri, Aug 07, 2015 at 01:58:15AM +0200, Daniel Borkmann wrote:
> >
> > Looks like we had a WARN_ON() in rhashtable_insert_rehash() before, but
> > was removed in a87b9ebf1709 ("rhashtable: Do not schedule more than one
> > rehash if we can't grow further"). Do you want to re-add a WARN_ON_ONCE()?
>
> I think so. Thomas?
Makes sense. I removed it because I thought it was not possible to
reach.
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2015-08-08 17:22 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-04 15:48 rtnl_mutex deadlock? Linus Torvalds
2015-08-05 5:31 ` Cong Wang
2015-08-05 7:43 ` Jiri Pirko
2015-08-05 8:44 ` Linus Torvalds
2015-08-05 18:59 ` Daniel Borkmann
2015-08-06 0:30 ` Herbert Xu
2015-08-06 14:50 ` Daniel Borkmann
2015-08-06 22:39 ` Daniel Borkmann
2015-08-06 23:42 ` Herbert Xu
2015-08-06 23:41 ` Herbert Xu
2015-08-06 23:58 ` Daniel Borkmann
2015-08-07 0:00 ` Herbert Xu
2015-08-08 17:22 ` Thomas Graf
2015-08-06 5:19 ` Herbert Xu