* Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86
@ 2022-10-12 21:08 sdf
2022-10-13 16:45 ` Martin KaFai Lau
2022-10-13 20:39 ` Jakub Sitnicki
0 siblings, 2 replies; 5+ messages in thread
From: sdf @ 2022-10-12 21:08 UTC (permalink / raw)
To: john.fastabend, jakub; +Cc: netdev, bpf
Hi John & Jakub,
Upstream commit c0feea594e05 ("workqueue: don't skip lockdep work
dependency in cancel_work_sync()") seems to trigger the following
lockdep warning during test_progs' sockmap_listen:
[ +0.003631] WARNING: possible circular locking dependency detected
[ +0.003647] 6.0.0-dbx-DEV #10 Not tainted
[ +0.002402] ------------------------------------------------------
[ +0.003685] kworker/1:0/23 is trying to acquire lock:
[ +0.003012] ffff888100b1e3f0 (sk_lock-AF_INET){+.+.}-{0:0}, at:
tcp_sendpage+0x28/0x80
[ +0.004655]
but task is already holding lock:
[ +0.003434] ffff88810642c360 (&psock->work_mutex){+.+.}-{3:3}, at:
sk_psock_backlog+0x2e/0x370
[ +0.005043]
which lock already depends on the new lock.
[ +0.004792]
the existing dependency chain (in reverse order) is:
[ +0.004397]
-> #2 (&psock->work_mutex){+.+.}-{3:3}:
[ +0.003732] __mutex_lock_common+0xdf/0xe70
[ +0.002958] mutex_lock_nested+0x20/0x30
[ +0.002685] sk_psock_backlog+0x2e/0x370
[ +0.002689] process_one_work+0x22c/0x3b0
[ +0.002815] worker_thread+0x21b/0x400
[ +0.002652] kthread+0xf7/0x110
[ +0.002406] ret_from_fork+0x1f/0x30
[ +0.002512]
-> #1 ((work_completion)(&psock->work)){+.+.}-{0:0}:
[ +0.004457] __flush_work+0x6b/0xd0
[ +0.002638] __cancel_work_timer+0x11a/0x1a0
[ +0.002973] cancel_work_sync+0x10/0x20
[ +0.002724] sk_psock_stop+0x298/0x2b0
[ +0.002969] sock_map_close+0xd8/0x140
[ +0.002739] inet_release+0x57/0x80
[ +0.002475] sock_close+0x4b/0xe0
[ +0.002380] __fput+0x101/0x230
[ +0.002347] ____fput+0xe/0x10
[ +0.002259] task_work_run+0x5d/0xb0
[ +0.002535] exit_to_user_mode_loop+0xd6/0xf0
[ +0.003019] exit_to_user_mode_prepare+0xa6/0x100
[ +0.003201] syscall_exit_to_user_mode+0x5b/0x160
[ +0.003145] do_syscall_64+0x49/0x80
[ +0.002549] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ +0.003410]
-> #0 (sk_lock-AF_INET){+.+.}-{0:0}:
[ +0.003906] __lock_acquire+0x16f4/0x30c0
[ +0.002837] lock_acquire+0xc5/0x1c0
[ +0.002599] lock_sock_nested+0x32/0x80
[ +0.002690] tcp_sendpage+0x28/0x80
[ +0.002435] inet_sendpage+0x7b/0xe0
[ +0.002534] kernel_sendpage+0x5d/0xa0
[ +0.002709] skb_send_sock+0x24b/0x2d0
[ +0.002662] sk_psock_backlog+0x106/0x370
[ +0.002908] process_one_work+0x22c/0x3b0
[ +0.002736] worker_thread+0x21b/0x400
[ +0.002552] kthread+0xf7/0x110
[ +0.002252] ret_from_fork+0x1f/0x30
[ +0.002480]
other info that might help us debug this:
[ +0.004778] Chain exists of:
sk_lock-AF_INET --> (work_completion)(&psock->work) -->
&psock->work_mutex
[ +0.007265] Possible unsafe locking scenario:
[ +0.003496]        CPU0                    CPU1
[ +0.002717]        ----                    ----
[ +0.002809]   lock(&psock->work_mutex);
[ +0.002335]                               lock((work_completion)(&psock->work));
[ +0.004496]                               lock(&psock->work_mutex);
[ +0.003766]   lock(sk_lock-AF_INET);
[ +0.002185]
*** DEADLOCK ***
[ +0.003600] 3 locks held by kworker/1:0/23:
[ +0.002698] #0: ffff888100055138 ((wq_completion)events){+.+.}-{0:0},
at: process_one_work+0x1d6/0x3b0
[ +0.005552] #1: ffffc900001e7e58
((work_completion)(&psock->work)){+.+.}-{0:0}, at:
process_one_work+0x1fc/0x3b0
[ +0.006085] #2: ffff88810642c360 (&psock->work_mutex){+.+.}-{3:3}, at:
sk_psock_backlog+0x2e/0x370
[ +0.005424]
stack backtrace:
[ +0.002689] CPU: 1 PID: 23 Comm: kworker/1:0 Not tainted 6.0.0-dbx-DEV #10
[ +0.004086] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ +0.006806] Workqueue: events sk_psock_backlog
[ +0.002699] Call Trace:
[ +0.001577] <TASK>
[ +0.001350] dump_stack_lvl+0x69/0xaa
[ +0.002225] dump_stack+0x10/0x12
[ +0.002051] print_circular_bug+0x289/0x290
[ +0.002531] check_noncircular+0x12c/0x140
[ +0.002578] __lock_acquire+0x16f4/0x30c0
[ +0.002483] ? ret_from_fork+0x1f/0x30
[ +0.002297] ? __this_cpu_preempt_check+0x13/0x20
[ +0.002869] ? lock_is_held_type+0xf8/0x160
[ +0.002511] lock_acquire+0xc5/0x1c0
[ +0.002165] ? tcp_sendpage+0x28/0x80
[ +0.002367] lock_sock_nested+0x32/0x80
[ +0.002401] ? tcp_sendpage+0x28/0x80
[ +0.002262] tcp_sendpage+0x28/0x80
[ +0.002148] inet_sendpage+0x7b/0xe0
[ 12.231432] sysrq: Power Off
[ 12.234545] kvm: exiting hardware virtualization
[ +0.002202] kernel_sendpage+0x5d/0xa0
[ +0.002277] skb_send_sock+0x24b/0x2d0
[ +0.002278] ? _raw_spin_unlock_irqrestore+0x35/0x60
[ +0.003030] ? __this_cpu_preempt_check+0x13/0x20
[ +0.002861] ? lockdep_hardirqs_on+0x97/0x140
[ +0.002685] ? trace_hardirqs_on+0x47/0x50
[ +0.002576] ? _raw_spin_unlock_irqrestore+0x40/0x60
[ +0.003207] sk_psock_backlog+0x106/0x370
[ +0.002476] process_one_work+0x22c/0x3b0
[ +0.002473] worker_thread+0x21b/0x400
[ +0.002335] kthread+0xf7/0x110
[ +0.001954] ? rcu_lock_release+0x20/0x20
[ +0.002444] ? kthread_blkcg+0x30/0x30
[ +0.002325] ret_from_fork+0x1f/0x30
[ +0.002221] </TASK>
This is on bpf-next:
commit d31ada3b511141f4b78cae5a05cc2dad887c40b7 (HEAD -> bpf-next,
bpf-next/master)
Author: David Vernet <void@manifault.com>
Date: Tue Oct 11 11:52:55 2022 -0500
selftests/bpf: Alphabetize DENYLISTs
Are you aware? Any idea what's wrong?
Is there some stable fix I'm missing in bpf-next?
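For anyone who doesn't read lockdep splats for a living: the report boils down
to a three-edge cycle, sk_lock-AF_INET -> (work_completion)(&psock->work) ->
&psock->work_mutex -> back to sk_lock-AF_INET. The stand-alone pthreads sketch
below models that same shape in user space; every name in it is made up for
illustration and it is not kernel code -- joining the worker merely stands in
for cancel_work_sync() waiting on the work:

/*
 * Minimal user-space model of the reported cycle -- illustration only.
 * Build: cc -pthread psock_cycle.c -o psock_cycle
 *
 * worker() mimics sk_psock_backlog(): take work_mutex, then the socket
 * lock (the tcp_sendpage() step). main() mimics sock_map_close(): take
 * the socket lock, then wait for the work to finish. With the usleep()
 * widening the window this hangs on essentially every run, which is the
 * deadlock lockdep is warning about.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t work_mutex = PTHREAD_MUTEX_INITIALIZER; /* &psock->work_mutex */
static pthread_mutex_t sk_lock    = PTHREAD_MUTEX_INITIALIZER; /* sk_lock-AF_INET    */

static void *worker(void *arg)                  /* sk_psock_backlog() */
{
	(void)arg;
	pthread_mutex_lock(&work_mutex);
	usleep(10000);                          /* let main() grab sk_lock first */
	pthread_mutex_lock(&sk_lock);           /* tcp_sendpage() -> lock_sock() */
	pthread_mutex_unlock(&sk_lock);
	pthread_mutex_unlock(&work_mutex);
	return NULL;
}

int main(void)
{
	pthread_t w;

	pthread_create(&w, NULL, worker, NULL);

	pthread_mutex_lock(&sk_lock);           /* sock_map_close(): lock_sock(sk) */
	pthread_join(w, NULL);                  /* cancel_work_sync(): wait for the
						 * work while still holding sk_lock */
	pthread_mutex_unlock(&sk_lock);

	puts("finished without deadlocking (timing dependent)");
	return 0;
}

AFAICT the triggering commit only stops skipping lockdep's work-completion
annotation in cancel_work_sync(), so the ordering itself predates it; the
warning is just newly visible.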
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86 2022-10-12 21:08 Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86 sdf @ 2022-10-13 16:45 ` Martin KaFai Lau 2022-10-13 20:39 ` Jakub Sitnicki 1 sibling, 0 replies; 5+ messages in thread From: Martin KaFai Lau @ 2022-10-13 16:45 UTC (permalink / raw) To: john.fastabend, jakub; +Cc: netdev, bpf, sdf On 10/12/22 2:08 PM, sdf@google.com wrote: > Hi John & Jakub, > > Upstream commit c0feea594e05 ("workqueue: don't skip lockdep work > dependency in cancel_work_sync()") seems to trigger the following > lockdep warning during test_prog's sockmap_listen: > > [ +0.003631] WARNING: possible circular locking dependency detected > [ +0.003647] 6.0.0-dbx-DEV #10 Not tainted > [ +0.002402] ------------------------------------------------------ > [ +0.003685] kworker/1:0/23 is trying to acquire lock: > [ +0.003012] ffff888100b1e3f0 (sk_lock-AF_INET){+.+.}-{0:0}, at: > tcp_sendpage+0x28/0x80 > [ +0.004655] > but task is already holding lock: > [ +0.003434] ffff88810642c360 (&psock->work_mutex){+.+.}-{3:3}, at: > sk_psock_backlog+0x2e/0x370 > [ +0.005043] > which lock already depends on the new lock. > > [ +0.004792] > the existing dependency chain (in reverse order) is: > [ +0.004397] > -> #2 (&psock->work_mutex){+.+.}-{3:3}: > [ +0.003732] __mutex_lock_common+0xdf/0xe70 > [ +0.002958] mutex_lock_nested+0x20/0x30 > [ +0.002685] sk_psock_backlog+0x2e/0x370 > [ +0.002689] process_one_work+0x22c/0x3b0 > [ +0.002815] worker_thread+0x21b/0x400 > [ +0.002652] kthread+0xf7/0x110 > [ +0.002406] ret_from_fork+0x1f/0x30 > [ +0.002512] > -> #1 ((work_completion)(&psock->work)){+.+.}-{0:0}: > [ +0.004457] __flush_work+0x6b/0xd0 > [ +0.002638] __cancel_work_timer+0x11a/0x1a0 > [ +0.002973] cancel_work_sync+0x10/0x20 > [ +0.002724] sk_psock_stop+0x298/0x2b0 > [ +0.002969] sock_map_close+0xd8/0x140 > [ +0.002739] inet_release+0x57/0x80 > [ +0.002475] sock_close+0x4b/0xe0 > [ +0.002380] __fput+0x101/0x230 > [ +0.002347] ____fput+0xe/0x10 > [ +0.002259] task_work_run+0x5d/0xb0 > [ +0.002535] exit_to_user_mode_loop+0xd6/0xf0 > [ +0.003019] exit_to_user_mode_prepare+0xa6/0x100 > [ +0.003201] syscall_exit_to_user_mode+0x5b/0x160 > [ +0.003145] do_syscall_64+0x49/0x80 > [ +0.002549] entry_SYSCALL_64_after_hwframe+0x63/0xcd > [ +0.003410] > -> #0 (sk_lock-AF_INET){+.+.}-{0:0}: > [ +0.003906] __lock_acquire+0x16f4/0x30c0 > [ +0.002837] lock_acquire+0xc5/0x1c0 > [ +0.002599] lock_sock_nested+0x32/0x80 > [ +0.002690] tcp_sendpage+0x28/0x80 > [ +0.002435] inet_sendpage+0x7b/0xe0 > [ +0.002534] kernel_sendpage+0x5d/0xa0 > [ +0.002709] skb_send_sock+0x24b/0x2d0 > [ +0.002662] sk_psock_backlog+0x106/0x370 > [ +0.002908] process_one_work+0x22c/0x3b0 > [ +0.002736] worker_thread+0x21b/0x400 > [ +0.002552] kthread+0xf7/0x110 > [ +0.002252] ret_from_fork+0x1f/0x30 > [ +0.002480] > other info that might help us debug this: > > [ +0.004778] Chain exists of: > sk_lock-AF_INET --> (work_completion)(&psock->work) --> > &psock->work_mutex > > [ +0.007265] Possible unsafe locking scenario: > > [ +0.003496] CPU0 CPU1 > [ +0.002717] ---- ---- > [ +0.002809] lock(&psock->work_mutex); > [ +0.002335] lock((work_completion)(&psock->work)); > [ +0.004496] lock(&psock->work_mutex); > [ +0.003766] lock(sk_lock-AF_INET); > [ +0.002185] > *** DEADLOCK *** > > [ +0.003600] 3 locks held by kworker/1:0/23: > [ +0.002698] #0: ffff888100055138 ((wq_completion)events){+.+.}-{0:0}, at: > process_one_work+0x1d6/0x3b0 > [ 
+0.005552] #1: ffffc900001e7e58 > ((work_completion)(&psock->work)){+.+.}-{0:0}, at: process_one_work+0x1fc/0x3b0 > [ +0.006085] #2: ffff88810642c360 (&psock->work_mutex){+.+.}-{3:3}, at: > sk_psock_backlog+0x2e/0x370 > [ +0.005424] > stack backtrace: > [ +0.002689] CPU: 1 PID: 23 Comm: kworker/1:0 Not tainted 6.0.0-dbx-DEV #10 > [ +0.004086] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 > [ +0.006806] Workqueue: events sk_psock_backlog > [ +0.002699] Call Trace: > [ +0.001577] <TASK> > [ +0.001350] dump_stack_lvl+0x69/0xaa > [ +0.002225] dump_stack+0x10/0x12 > [ +0.002051] print_circular_bug+0x289/0x290 > [ +0.002531] check_noncircular+0x12c/0x140 > [ +0.002578] __lock_acquire+0x16f4/0x30c0 > [ +0.002483] ? ret_from_fork+0x1f/0x30 > [ +0.002297] ? __this_cpu_preempt_check+0x13/0x20 > [ +0.002869] ? lock_is_held_type+0xf8/0x160 > [ +0.002511] lock_acquire+0xc5/0x1c0 > [ +0.002165] ? tcp_sendpage+0x28/0x80 > [ +0.002367] lock_sock_nested+0x32/0x80 > [ +0.002401] ? tcp_sendpage+0x28/0x80 > [ +0.002262] tcp_sendpage+0x28/0x80 > [ +0.002148] inet_sendpage+0x7b/0x[ 12.231432] sysrq: Power Off > e0 > [ +0.002202[ 12.234545] kvm: exiting hardware virtualization > ] kernel_sendpage+0x5d/0xa0 > [ +0.002277] skb_send_sock+0x24b/0x2d0 > [ +0.002278] ? _raw_spin_unlock_irqrestore+0x35/0x60 > [ +0.003030] ? __this_cpu_preempt_check+0x13/0x20 > [ +0.002861] ? lockdep_hardirqs_on+0x97/0x140 > [ +0.002685] ? trace_hardirqs_on+0x47/0x50 > [ +0.002576] ? _raw_spin_unlock_irqrestore+0x40/0x60 > [ +0.003207] sk_psock_backlog+0x106/0x370 > [ +0.002476] process_one_work+0x22c/0x3b0 > [ +0.002473] worker_thread+0x21b/0x400 > [ +0.002335] kthread+0xf7/0x110 > [ +0.001954] ? rcu_lock_release+0x20/0x20 > [ +0.002444] ? kthread_blkcg+0x30/0x30 > [ +0.002325] ret_from_fork+0x1f/0x30 > [ +0.002221] </TASK> > > This is on bpf-next: > > commit d31ada3b511141f4b78cae5a05cc2dad887c40b7 (HEAD -> bpf-next, > bpf-next/master) > Author: David Vernet <void@manifault.com> > Date: Tue Oct 11 11:52:55 2022 -0500 > > selftests/bpf: Alphabetize DENYLISTs > > Are you ware? Any idea what's wrong? > Is there some stable fix I'm missing in bpf-next? fwiw, CI has been hitting it pretty often also: https://github.com/kernel-patches/bpf/actions/runs/3238290042/jobs/5306551130#step:6:5522 Unless bpf-next is missing some fixes, this needs to be addressed. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86 2022-10-12 21:08 Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86 sdf 2022-10-13 16:45 ` Martin KaFai Lau @ 2022-10-13 20:39 ` Jakub Sitnicki 2022-10-16 18:11 ` Cong Wang 1 sibling, 1 reply; 5+ messages in thread From: Jakub Sitnicki @ 2022-10-13 20:39 UTC (permalink / raw) To: sdf; +Cc: john.fastabend, netdev, bpf, Cong Wang, Martin KaFai Lau Hi Stan, On Wed, Oct 12, 2022 at 02:08 PM -07, sdf@google.com wrote: > Hi John & Jakub, > > Upstream commit c0feea594e05 ("workqueue: don't skip lockdep work > dependency in cancel_work_sync()") seems to trigger the following > lockdep warning during test_prog's sockmap_listen: > > [ +0.003631] WARNING: possible circular locking dependency detected [...] > Are you ware? Any idea what's wrong? > Is there some stable fix I'm missing in bpf-next? Thanks for bringing it up. I didn't know. The mentioned commit doesn't look that fresh commit c0feea594e058223973db94c1c32a830c9807c86 Author: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Date: Fri Jul 29 13:30:23 2022 +0900 workqueue: don't skip lockdep work dependency in cancel_work_sync() ... but then it just landed not so long ago, which explains things: $ git describe --contains c0feea594e058223973db94c1c32a830c9807c86 --match 'v*' v6.0-rc7~10^2 I've untangled the call chains leading to the potential dead-lock a bit. There does seem to be a window of opportunity there. psock->work.func = sk_psock_backlog() ACQUIRE psock->work_mutex sk_psock_handle_skb() skb_send_sock() __skb_send_sock() sendpage_unlocked() kernel_sendpage() sock->ops->sendpage = inet_sendpage() sk->sk_prot->sendpage = tcp_sendpage() ACQUIRE sk->sk_lock tcp_sendpage_locked() RELEASE sk->sk_lock RELEASE psock->work_mutex sock_map_close() ACQUIRE sk->sk_lock sk_psock_stop() sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED) cancel_work_sync() __cancel_work_timer() __flush_work() // wait for psock->work to finish RELEASE sk->sk_lock There is no fix I know of. Need to think. Ideas welcome. CC Cong, just FYI, because we did rearrange the locking scheme in [1]. However it looks to me like the dead-lock was already there before that. [1] https://lore.kernel.org/bpf/20210331023237.41094-5-xiyou.wangcong@gmail.com/ ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86 2022-10-13 20:39 ` Jakub Sitnicki @ 2022-10-16 18:11 ` Cong Wang 2022-10-24 9:36 ` Jakub Sitnicki 0 siblings, 1 reply; 5+ messages in thread From: Cong Wang @ 2022-10-16 18:11 UTC (permalink / raw) To: Jakub Sitnicki Cc: sdf, john.fastabend, netdev, bpf, Cong Wang, Martin KaFai Lau On Thu, Oct 13, 2022 at 10:39:08PM +0200, Jakub Sitnicki wrote: > Hi Stan, > > On Wed, Oct 12, 2022 at 02:08 PM -07, sdf@google.com wrote: > > Hi John & Jakub, > > > > Upstream commit c0feea594e05 ("workqueue: don't skip lockdep work > > dependency in cancel_work_sync()") seems to trigger the following > > lockdep warning during test_prog's sockmap_listen: > > > > [ +0.003631] WARNING: possible circular locking dependency detected > > [...] > > > Are you ware? Any idea what's wrong? > > Is there some stable fix I'm missing in bpf-next? > > Thanks for bringing it up. I didn't know. > > The mentioned commit doesn't look that fresh > > commit c0feea594e058223973db94c1c32a830c9807c86 > Author: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > Date: Fri Jul 29 13:30:23 2022 +0900 > > workqueue: don't skip lockdep work dependency in cancel_work_sync() > > ... but then it just landed not so long ago, which explains things: > > $ git describe --contains c0feea594e058223973db94c1c32a830c9807c86 --match 'v*' > v6.0-rc7~10^2 > > I've untangled the call chains leading to the potential dead-lock a > bit. There does seem to be a window of opportunity there. > > psock->work.func = sk_psock_backlog() > ACQUIRE psock->work_mutex > sk_psock_handle_skb() > skb_send_sock() > __skb_send_sock() > sendpage_unlocked() > kernel_sendpage() > sock->ops->sendpage = inet_sendpage() > sk->sk_prot->sendpage = tcp_sendpage() > ACQUIRE sk->sk_lock > tcp_sendpage_locked() > RELEASE sk->sk_lock > RELEASE psock->work_mutex > > sock_map_close() > ACQUIRE sk->sk_lock > sk_psock_stop() > sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED) > cancel_work_sync() > __cancel_work_timer() > __flush_work() > // wait for psock->work to finish > RELEASE sk->sk_lock > > There is no fix I know of. Need to think. Ideas welcome. > Thanks for the analysis. I wonder if we can simply move this cancel_work_sync() out of sock lock... 
Something like this: diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index 48f4b645193b..70d6cb94e580 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -376,7 +376,7 @@ static inline void sk_psock_report_error(struct sk_psock *psock, int err) } struct sk_psock *sk_psock_init(struct sock *sk, int node); -void sk_psock_stop(struct sk_psock *psock, bool wait); +void sk_psock_stop(struct sk_psock *psock); #if IS_ENABLED(CONFIG_BPF_STREAM_PARSER) int sk_psock_init_strp(struct sock *sk, struct sk_psock *psock); diff --git a/net/core/skmsg.c b/net/core/skmsg.c index ca70525621c7..ddc56660ce97 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -803,16 +803,13 @@ static void sk_psock_link_destroy(struct sk_psock *psock) } } -void sk_psock_stop(struct sk_psock *psock, bool wait) +void sk_psock_stop(struct sk_psock *psock) { spin_lock_bh(&psock->ingress_lock); sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED); sk_psock_cork_free(psock); __sk_psock_zap_ingress(psock); spin_unlock_bh(&psock->ingress_lock); - - if (wait) - cancel_work_sync(&psock->work); } static void sk_psock_done_strp(struct sk_psock *psock); @@ -850,7 +847,7 @@ void sk_psock_drop(struct sock *sk, struct sk_psock *psock) sk_psock_stop_verdict(sk, psock); write_unlock_bh(&sk->sk_callback_lock); - sk_psock_stop(psock, false); + sk_psock_stop(psock); INIT_RCU_WORK(&psock->rwork, sk_psock_destroy); queue_rcu_work(system_wq, &psock->rwork); diff --git a/net/core/sock_map.c b/net/core/sock_map.c index a660baedd9e7..81beb16ab1eb 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -1596,7 +1596,7 @@ void sock_map_destroy(struct sock *sk) saved_destroy = psock->saved_destroy; sock_map_remove_links(sk, psock); rcu_read_unlock(); - sk_psock_stop(psock, false); + sk_psock_stop(psock); sk_psock_put(sk, psock); saved_destroy(sk); } @@ -1619,9 +1619,10 @@ void sock_map_close(struct sock *sk, long timeout) saved_close = psock->saved_close; sock_map_remove_links(sk, psock); rcu_read_unlock(); - sk_psock_stop(psock, true); - sk_psock_put(sk, psock); + sk_psock_stop(psock); release_sock(sk); + cancel_work_sync(&psock->work); + sk_psock_put(sk, psock); saved_close(sk, timeout); } EXPORT_SYMBOL_GPL(sock_map_close); ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86 2022-10-16 18:11 ` Cong Wang @ 2022-10-24 9:36 ` Jakub Sitnicki 0 siblings, 0 replies; 5+ messages in thread From: Jakub Sitnicki @ 2022-10-24 9:36 UTC (permalink / raw) To: Cong Wang; +Cc: sdf, john.fastabend, netdev, bpf, Cong Wang, Martin KaFai Lau On Sun, Oct 16, 2022 at 11:11 AM -07, Cong Wang wrote: > On Thu, Oct 13, 2022 at 10:39:08PM +0200, Jakub Sitnicki wrote: >> Hi Stan, >> >> On Wed, Oct 12, 2022 at 02:08 PM -07, sdf@google.com wrote: >> > Hi John & Jakub, >> > >> > Upstream commit c0feea594e05 ("workqueue: don't skip lockdep work >> > dependency in cancel_work_sync()") seems to trigger the following >> > lockdep warning during test_prog's sockmap_listen: >> > >> > [ +0.003631] WARNING: possible circular locking dependency detected >> >> [...] >> >> > Are you ware? Any idea what's wrong? >> > Is there some stable fix I'm missing in bpf-next? >> >> Thanks for bringing it up. I didn't know. >> >> The mentioned commit doesn't look that fresh >> >> commit c0feea594e058223973db94c1c32a830c9807c86 >> Author: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> >> Date: Fri Jul 29 13:30:23 2022 +0900 >> >> workqueue: don't skip lockdep work dependency in cancel_work_sync() >> >> ... but then it just landed not so long ago, which explains things: >> >> $ git describe --contains c0feea594e058223973db94c1c32a830c9807c86 --match 'v*' >> v6.0-rc7~10^2 >> >> I've untangled the call chains leading to the potential dead-lock a >> bit. There does seem to be a window of opportunity there. >> >> psock->work.func = sk_psock_backlog() >> ACQUIRE psock->work_mutex >> sk_psock_handle_skb() >> skb_send_sock() >> __skb_send_sock() >> sendpage_unlocked() >> kernel_sendpage() >> sock->ops->sendpage = inet_sendpage() >> sk->sk_prot->sendpage = tcp_sendpage() >> ACQUIRE sk->sk_lock >> tcp_sendpage_locked() >> RELEASE sk->sk_lock >> RELEASE psock->work_mutex >> >> sock_map_close() >> ACQUIRE sk->sk_lock >> sk_psock_stop() >> sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED) >> cancel_work_sync() >> __cancel_work_timer() >> __flush_work() >> // wait for psock->work to finish >> RELEASE sk->sk_lock >> >> There is no fix I know of. Need to think. Ideas welcome. >> > > Thanks for the analysis. > > I wonder if we can simply move this cancel_work_sync() out of sock > lock... Something like this: [...] > diff --git a/net/core/sock_map.c b/net/core/sock_map.c > index a660baedd9e7..81beb16ab1eb 100644 > --- a/net/core/sock_map.c > +++ b/net/core/sock_map.c > @@ -1596,7 +1596,7 @@ void sock_map_destroy(struct sock *sk) > saved_destroy = psock->saved_destroy; > sock_map_remove_links(sk, psock); > rcu_read_unlock(); > - sk_psock_stop(psock, false); > + sk_psock_stop(psock); > sk_psock_put(sk, psock); > saved_destroy(sk); > } > @@ -1619,9 +1619,10 @@ void sock_map_close(struct sock *sk, long timeout) > saved_close = psock->saved_close; > sock_map_remove_links(sk, psock); > rcu_read_unlock(); > - sk_psock_stop(psock, true); > - sk_psock_put(sk, psock); > + sk_psock_stop(psock); > release_sock(sk); > + cancel_work_sync(&psock->work); > + sk_psock_put(sk, psock); > saved_close(sk, timeout); > } > EXPORT_SYMBOL_GPL(sock_map_close); Sorry for the delay. I've been out. Great idea. I don't see why not. ^ permalink raw reply [flat|nested] 5+ messages in thread
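Mapping the proposed patch back onto the toy pthreads model from the first
message: the only change on the close path is dropping the socket lock before
waiting for the work, which removes the sk_lock -> (work_completion) edge and
with it the cycle. Sketch only, reusing worker(), work_mutex and sk_lock from
that earlier snippet; the names are illustrative, not kernel code:

/* sock_map_close() after the proposed patch, in the same toy model:
 * release the socket lock first, then wait for the work. The worker
 * can now always take sk_lock and run to completion, so the wait
 * cannot deadlock.
 */
static void closer_after_patch(pthread_t w)
{
	pthread_mutex_lock(&sk_lock);   /* lock_sock(sk)                        */
	/* sk_psock_stop(), sock_map_remove_links() happen here in the patch    */
	pthread_mutex_unlock(&sk_lock); /* release_sock(sk)                     */

	pthread_join(w, NULL);          /* cancel_work_sync() with no lock held
					 * -- the cycle is broken               */
	/* sk_psock_put() and saved_close() follow in the real patch            */
}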