netdev.vger.kernel.org archive mirror
* Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86
@ 2022-10-12 21:08 sdf
  2022-10-13 16:45 ` Martin KaFai Lau
  2022-10-13 20:39 ` Jakub Sitnicki
  0 siblings, 2 replies; 5+ messages in thread
From: sdf @ 2022-10-12 21:08 UTC (permalink / raw)
  To: john.fastabend, jakub; +Cc: netdev, bpf

Hi John & Jakub,

Upstream commit c0feea594e05 ("workqueue: don't skip lockdep work
dependency in cancel_work_sync()") seems to trigger the following
lockdep warning during test_prog's sockmap_listen:

[  +0.003631] WARNING: possible circular locking dependency detected
[  +0.003647] 6.0.0-dbx-DEV #10 Not tainted
[  +0.002402] ------------------------------------------------------
[  +0.003685] kworker/1:0/23 is trying to acquire lock:
[  +0.003012] ffff888100b1e3f0 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendpage+0x28/0x80
[  +0.004655]
               but task is already holding lock:
[  +0.003434] ffff88810642c360 (&psock->work_mutex){+.+.}-{3:3}, at: sk_psock_backlog+0x2e/0x370
[  +0.005043]
               which lock already depends on the new lock.

[  +0.004792]
               the existing dependency chain (in reverse order) is:
[  +0.004397]
               -> #2 (&psock->work_mutex){+.+.}-{3:3}:
[  +0.003732]        __mutex_lock_common+0xdf/0xe70
[  +0.002958]        mutex_lock_nested+0x20/0x30
[  +0.002685]        sk_psock_backlog+0x2e/0x370
[  +0.002689]        process_one_work+0x22c/0x3b0
[  +0.002815]        worker_thread+0x21b/0x400
[  +0.002652]        kthread+0xf7/0x110
[  +0.002406]        ret_from_fork+0x1f/0x30
[  +0.002512]
               -> #1 ((work_completion)(&psock->work)){+.+.}-{0:0}:
[  +0.004457]        __flush_work+0x6b/0xd0
[  +0.002638]        __cancel_work_timer+0x11a/0x1a0
[  +0.002973]        cancel_work_sync+0x10/0x20
[  +0.002724]        sk_psock_stop+0x298/0x2b0
[  +0.002969]        sock_map_close+0xd8/0x140
[  +0.002739]        inet_release+0x57/0x80
[  +0.002475]        sock_close+0x4b/0xe0
[  +0.002380]        __fput+0x101/0x230
[  +0.002347]        ____fput+0xe/0x10
[  +0.002259]        task_work_run+0x5d/0xb0
[  +0.002535]        exit_to_user_mode_loop+0xd6/0xf0
[  +0.003019]        exit_to_user_mode_prepare+0xa6/0x100
[  +0.003201]        syscall_exit_to_user_mode+0x5b/0x160
[  +0.003145]        do_syscall_64+0x49/0x80
[  +0.002549]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  +0.003410]
               -> #0 (sk_lock-AF_INET){+.+.}-{0:0}:
[  +0.003906]        __lock_acquire+0x16f4/0x30c0
[  +0.002837]        lock_acquire+0xc5/0x1c0
[  +0.002599]        lock_sock_nested+0x32/0x80
[  +0.002690]        tcp_sendpage+0x28/0x80
[  +0.002435]        inet_sendpage+0x7b/0xe0
[  +0.002534]        kernel_sendpage+0x5d/0xa0
[  +0.002709]        skb_send_sock+0x24b/0x2d0
[  +0.002662]        sk_psock_backlog+0x106/0x370
[  +0.002908]        process_one_work+0x22c/0x3b0
[  +0.002736]        worker_thread+0x21b/0x400
[  +0.002552]        kthread+0xf7/0x110
[  +0.002252]        ret_from_fork+0x1f/0x30
[  +0.002480]
               other info that might help us debug this:

[  +0.004778] Chain exists of:
                 sk_lock-AF_INET --> (work_completion)(&psock->work) --> &psock->work_mutex

[  +0.007265]  Possible unsafe locking scenario:

[  +0.003496]        CPU0                    CPU1
[  +0.002717]        ----                    ----
[  +0.002809]   lock(&psock->work_mutex);
[  +0.002335]                                lock((work_completion)(&psock->work));
[  +0.004496]                                lock(&psock->work_mutex);
[  +0.003766]   lock(sk_lock-AF_INET);
[  +0.002185]
                *** DEADLOCK ***

[  +0.003600] 3 locks held by kworker/1:0/23:
[  +0.002698]  #0: ffff888100055138 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1d6/0x3b0
[  +0.005552]  #1: ffffc900001e7e58 ((work_completion)(&psock->work)){+.+.}-{0:0}, at: process_one_work+0x1fc/0x3b0
[  +0.006085]  #2: ffff88810642c360 (&psock->work_mutex){+.+.}-{3:3}, at: sk_psock_backlog+0x2e/0x370
[  +0.005424]
               stack backtrace:
[  +0.002689] CPU: 1 PID: 23 Comm: kworker/1:0 Not tainted 6.0.0-dbx-DEV #10
[  +0.004086] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[  +0.006806] Workqueue: events sk_psock_backlog
[  +0.002699] Call Trace:
[  +0.001577]  <TASK>
[  +0.001350]  dump_stack_lvl+0x69/0xaa
[  +0.002225]  dump_stack+0x10/0x12
[  +0.002051]  print_circular_bug+0x289/0x290
[  +0.002531]  check_noncircular+0x12c/0x140
[  +0.002578]  __lock_acquire+0x16f4/0x30c0
[  +0.002483]  ? ret_from_fork+0x1f/0x30
[  +0.002297]  ? __this_cpu_preempt_check+0x13/0x20
[  +0.002869]  ? lock_is_held_type+0xf8/0x160
[  +0.002511]  lock_acquire+0xc5/0x1c0
[  +0.002165]  ? tcp_sendpage+0x28/0x80
[  +0.002367]  lock_sock_nested+0x32/0x80
[  +0.002401]  ? tcp_sendpage+0x28/0x80
[  +0.002262]  tcp_sendpage+0x28/0x80
[  +0.002148]  inet_sendpage+0x7b/0xe0
[  +0.002202]  kernel_sendpage+0x5d/0xa0
[  +0.002277]  skb_send_sock+0x24b/0x2d0
[  +0.002278]  ? _raw_spin_unlock_irqrestore+0x35/0x60
[  +0.003030]  ? __this_cpu_preempt_check+0x13/0x20
[  +0.002861]  ? lockdep_hardirqs_on+0x97/0x140
[  +0.002685]  ? trace_hardirqs_on+0x47/0x50
[  +0.002576]  ? _raw_spin_unlock_irqrestore+0x40/0x60
[  +0.003207]  sk_psock_backlog+0x106/0x370
[  +0.002476]  process_one_work+0x22c/0x3b0
[  +0.002473]  worker_thread+0x21b/0x400
[  +0.002335]  kthread+0xf7/0x110
[  +0.001954]  ? rcu_lock_release+0x20/0x20
[  +0.002444]  ? kthread_blkcg+0x30/0x30
[  +0.002325]  ret_from_fork+0x1f/0x30
[  +0.002221]  </TASK>

This is on bpf-next:

commit d31ada3b511141f4b78cae5a05cc2dad887c40b7 (HEAD -> bpf-next, bpf-next/master)
Author: David Vernet <void@manifault.com>
Date:   Tue Oct 11 11:52:55 2022 -0500

     selftests/bpf: Alphabetize DENYLISTs

Are you aware? Any idea what's wrong?
Is there some stable fix I'm missing in bpf-next?
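(For reference, that's just the selftest under a lockdep-enabled
config, something like: cd tools/testing/selftests/bpf &&
./test_progs -t sockmap_listen.)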


* Re: Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86
  2022-10-12 21:08 Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86 sdf
@ 2022-10-13 16:45 ` Martin KaFai Lau
  2022-10-13 20:39 ` Jakub Sitnicki
  1 sibling, 0 replies; 5+ messages in thread
From: Martin KaFai Lau @ 2022-10-13 16:45 UTC (permalink / raw)
  To: john.fastabend, jakub; +Cc: netdev, bpf, sdf

On 10/12/22 2:08 PM, sdf@google.com wrote:
> Hi John & Jakub,
> 
> Upstream commit c0feea594e05 ("workqueue: don't skip lockdep work
> dependency in cancel_work_sync()") seems to trigger the following
> lockdep warning during test_prog's sockmap_listen:
> 
> [  +0.003631] WARNING: possible circular locking dependency detected

[...]

> Are you aware? Any idea what's wrong?
> Is there some stable fix I'm missing in bpf-next?

fwiw, CI has been hitting it pretty often also: 
https://github.com/kernel-patches/bpf/actions/runs/3238290042/jobs/5306551130#step:6:5522

Unless bpf-next is missing some fixes, this needs to be addressed.


* Re: Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86
  2022-10-12 21:08 Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86 sdf
  2022-10-13 16:45 ` Martin KaFai Lau
@ 2022-10-13 20:39 ` Jakub Sitnicki
  2022-10-16 18:11   ` Cong Wang
  1 sibling, 1 reply; 5+ messages in thread
From: Jakub Sitnicki @ 2022-10-13 20:39 UTC (permalink / raw)
  To: sdf; +Cc: john.fastabend, netdev, bpf, Cong Wang, Martin KaFai Lau

Hi Stan,

On Wed, Oct 12, 2022 at 02:08 PM -07, sdf@google.com wrote:
> Hi John & Jakub,
>
> Upstream commit c0feea594e05 ("workqueue: don't skip lockdep work
> dependency in cancel_work_sync()") seems to trigger the following
> lockdep warning during test_prog's sockmap_listen:
>
> [  +0.003631] WARNING: possible circular locking dependency detected

[...]

> Are you aware? Any idea what's wrong?
> Is there some stable fix I'm missing in bpf-next?

Thanks for bringing it up. I didn't know.

The mentioned commit doesn't look that fresh ...

commit c0feea594e058223973db94c1c32a830c9807c86
Author: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date:   Fri Jul 29 13:30:23 2022 +0900

    workqueue: don't skip lockdep work dependency in cancel_work_sync()

... but then it just landed not so long ago, which explains things:

$ git describe --contains c0feea594e058223973db94c1c32a830c9807c86 --match 'v*'
v6.0-rc7~10^2

I've untangled the call chains leading to the potential deadlock a
bit. There does seem to be a window of opportunity there.

psock->work.func = sk_psock_backlog()
  ACQUIRE psock->work_mutex
    sk_psock_handle_skb()
      skb_send_sock()
        __skb_send_sock()
          sendpage_unlocked()
            kernel_sendpage()
              sock->ops->sendpage = inet_sendpage()
                sk->sk_prot->sendpage = tcp_sendpage()
                  ACQUIRE sk->sk_lock
                    tcp_sendpage_locked()
                  RELEASE sk->sk_lock
  RELEASE psock->work_mutex

sock_map_close()
  ACQUIRE sk->sk_lock
  sk_psock_stop()
    sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED)
    cancel_work_sync()
      __cancel_work_timer()
        __flush_work()
          // wait for psock->work to finish
  RELEASE sk->sk_lock
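
If it helps to see the shape of the cycle in isolation, here is a tiny
userspace sketch of the same pattern (pthreads standing in for the
workqueue, the work_mutex link collapsed; purely illustrative, names
made up). It hangs the same way the chains above can: the close side
holds the lock while waiting for the work, and the work needs that
lock to finish.

/* closer_vs_worker.c -- build with: gcc -pthread closer_vs_worker.c */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t sk_lock = PTHREAD_MUTEX_INITIALIZER;

/* stands in for sk_psock_backlog(): the deferred work needs sk_lock */
static void *backlog_work(void *arg)
{
	(void)arg;
	sleep(1);                       /* let the closer win the race */
	pthread_mutex_lock(&sk_lock);   /* blocks: the closer holds it */
	puts("worker: sending");
	pthread_mutex_unlock(&sk_lock);
	return NULL;
}

int main(void)
{
	pthread_t worker;

	pthread_create(&worker, NULL, backlog_work, NULL);

	/* stands in for sock_map_close(): hold the lock, then wait for
	 * the pending work to finish -- pthread_join() plays the role
	 * of cancel_work_sync(). Neither side can make progress. */
	pthread_mutex_lock(&sk_lock);
	pthread_join(worker, NULL);
	pthread_mutex_unlock(&sk_lock);
	return 0;
}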

There is no fix I know of. Need to think. Ideas welcome.

CC Cong, just FYI, because we did rearrange the locking scheme in [1].

However, it looks to me like the deadlock was already there before that.

[1] https://lore.kernel.org/bpf/20210331023237.41094-5-xiyou.wangcong@gmail.com/


* Re: Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86
  2022-10-13 20:39 ` Jakub Sitnicki
@ 2022-10-16 18:11   ` Cong Wang
  2022-10-24  9:36     ` Jakub Sitnicki
  0 siblings, 1 reply; 5+ messages in thread
From: Cong Wang @ 2022-10-16 18:11 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: sdf, john.fastabend, netdev, bpf, Cong Wang, Martin KaFai Lau

On Thu, Oct 13, 2022 at 10:39:08PM +0200, Jakub Sitnicki wrote:
> Hi Stan,
> 
> On Wed, Oct 12, 2022 at 02:08 PM -07, sdf@google.com wrote:
> > Hi John & Jakub,
> >
> > Upstream commit c0feea594e05 ("workqueue: don't skip lockdep work
> > dependency in cancel_work_sync()") seems to trigger the following
> > lockdep warning during test_prog's sockmap_listen:
> >
> > [  +0.003631] WARNING: possible circular locking dependency detected
> 
> [...]
> 
> > Are you aware? Any idea what's wrong?
> > Is there some stable fix I'm missing in bpf-next?
> 
> Thanks for bringing it up. I didn't know.
> 
> The mentioned commit doesn't look that fresh
> 
> commit c0feea594e058223973db94c1c32a830c9807c86
> Author: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date:   Fri Jul 29 13:30:23 2022 +0900
> 
>     workqueue: don't skip lockdep work dependency in cancel_work_sync()
> 
> ... but then it just landed not so long ago, which explains things:
> 
> $ git describe --contains c0feea594e058223973db94c1c32a830c9807c86 --match 'v*'
> v6.0-rc7~10^2
> 
> I've untangled the call chains leading to the potential dead-lock a
> bit. There does seem to be a window of opportunity there.
> 
> psock->work.func = sk_psock_backlog()
>   ACQUIRE psock->work_mutex
>     sk_psock_handle_skb()
>       skb_send_sock()
>         __skb_send_sock()
>           sendpage_unlocked()
>             kernel_sendpage()
>               sock->ops->sendpage = inet_sendpage()
>                 sk->sk_prot->sendpage = tcp_sendpage()
>                   ACQUIRE sk->sk_lock
>                     tcp_sendpage_locked()
>                   RELEASE sk->sk_lock
>   RELEASE psock->work_mutex
> 
> sock_map_close()
>   ACQUIRE sk->sk_lock
>   sk_psock_stop()
>     sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED)
>     cancel_work_sync()
>       __cancel_work_timer()
>         __flush_work()
>           // wait for psock->work to finish
>   RELEASE sk->sk_lock
> 
> There is no fix I know of. Need to think. Ideas welcome.
> 

Thanks for the analysis.

I wonder if we can simply move this cancel_work_sync() out of sock
lock... Something like this:

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index 48f4b645193b..70d6cb94e580 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -376,7 +376,7 @@ static inline void sk_psock_report_error(struct sk_psock *psock, int err)
 }
 
 struct sk_psock *sk_psock_init(struct sock *sk, int node);
-void sk_psock_stop(struct sk_psock *psock, bool wait);
+void sk_psock_stop(struct sk_psock *psock);
 
 #if IS_ENABLED(CONFIG_BPF_STREAM_PARSER)
 int sk_psock_init_strp(struct sock *sk, struct sk_psock *psock);
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index ca70525621c7..ddc56660ce97 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -803,16 +803,13 @@ static void sk_psock_link_destroy(struct sk_psock *psock)
 	}
 }
 
-void sk_psock_stop(struct sk_psock *psock, bool wait)
+void sk_psock_stop(struct sk_psock *psock)
 {
 	spin_lock_bh(&psock->ingress_lock);
 	sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED);
 	sk_psock_cork_free(psock);
 	__sk_psock_zap_ingress(psock);
 	spin_unlock_bh(&psock->ingress_lock);
-
-	if (wait)
-		cancel_work_sync(&psock->work);
 }
 
 static void sk_psock_done_strp(struct sk_psock *psock);
@@ -850,7 +847,7 @@ void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
 		sk_psock_stop_verdict(sk, psock);
 	write_unlock_bh(&sk->sk_callback_lock);
 
-	sk_psock_stop(psock, false);
+	sk_psock_stop(psock);
 
 	INIT_RCU_WORK(&psock->rwork, sk_psock_destroy);
 	queue_rcu_work(system_wq, &psock->rwork);
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index a660baedd9e7..81beb16ab1eb 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1596,7 +1596,7 @@ void sock_map_destroy(struct sock *sk)
 	saved_destroy = psock->saved_destroy;
 	sock_map_remove_links(sk, psock);
 	rcu_read_unlock();
-	sk_psock_stop(psock, false);
+	sk_psock_stop(psock);
 	sk_psock_put(sk, psock);
 	saved_destroy(sk);
 }
@@ -1619,9 +1619,10 @@ void sock_map_close(struct sock *sk, long timeout)
 	saved_close = psock->saved_close;
 	sock_map_remove_links(sk, psock);
 	rcu_read_unlock();
-	sk_psock_stop(psock, true);
-	sk_psock_put(sk, psock);
+	sk_psock_stop(psock);
 	release_sock(sk);
+	cancel_work_sync(&psock->work);
+	sk_psock_put(sk, psock);
 	saved_close(sk, timeout);
 }
 EXPORT_SYMBOL_GPL(sock_map_close);
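
With that, my reading of the resulting close path (same notation as
your analysis above) is roughly:

sock_map_close()
  ACQUIRE sk->sk_lock
    sk_psock_stop()                 // clears SK_PSOCK_TX_ENABLED, no wait
  RELEASE sk->sk_lock
  cancel_work_sync(&psock->work)    // wait for sk_psock_backlog() with sk_lock dropped
  sk_psock_put(sk, psock)
  saved_close(sk, timeout)

so the sk_lock --> (work_completion)(&psock->work) dependency from the
close path should go away.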


* Re: Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86
  2022-10-16 18:11   ` Cong Wang
@ 2022-10-24  9:36     ` Jakub Sitnicki
  0 siblings, 0 replies; 5+ messages in thread
From: Jakub Sitnicki @ 2022-10-24  9:36 UTC (permalink / raw)
  To: Cong Wang; +Cc: sdf, john.fastabend, netdev, bpf, Cong Wang, Martin KaFai Lau

On Sun, Oct 16, 2022 at 11:11 AM -07, Cong Wang wrote:
> On Thu, Oct 13, 2022 at 10:39:08PM +0200, Jakub Sitnicki wrote:

[...]

>> There is no fix I know of. Need to think. Ideas welcome.
>> 
>
> Thanks for the analysis.
>
> I wonder if we can simply move this cancel_work_sync() out of sock
> lock... Something like this:

[...]

> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index a660baedd9e7..81beb16ab1eb 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -1596,7 +1596,7 @@ void sock_map_destroy(struct sock *sk)
>  	saved_destroy = psock->saved_destroy;
>  	sock_map_remove_links(sk, psock);
>  	rcu_read_unlock();
> -	sk_psock_stop(psock, false);
> +	sk_psock_stop(psock);
>  	sk_psock_put(sk, psock);
>  	saved_destroy(sk);
>  }
> @@ -1619,9 +1619,10 @@ void sock_map_close(struct sock *sk, long timeout)
>  	saved_close = psock->saved_close;
>  	sock_map_remove_links(sk, psock);
>  	rcu_read_unlock();
> -	sk_psock_stop(psock, true);
> -	sk_psock_put(sk, psock);
> +	sk_psock_stop(psock);
>  	release_sock(sk);
> +	cancel_work_sync(&psock->work);
> +	sk_psock_put(sk, psock);
>  	saved_close(sk, timeout);
>  }
>  EXPORT_SYMBOL_GPL(sock_map_close);

Sorry for the delay. I've been out.

Great idea. I don't see why not.

