From: Martin KaFai Lau <martin.lau@linux.dev>
To: john.fastabend@gmail.com, jakub@cloudflare.com
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, sdf@google.com
Subject: Re: Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86
Date: Thu, 13 Oct 2022 09:45:56 -0700 [thread overview]
Message-ID: <ac4831b9-44ef-29dc-bf9f-0d9adf503123@linux.dev> (raw)
In-Reply-To: <Y0csu2SwegJ8Tab+@google.com>
On 10/12/22 2:08 PM, sdf@google.com wrote:
> Hi John & Jakub,
>
> Upstream commit c0feea594e05 ("workqueue: don't skip lockdep work
> dependency in cancel_work_sync()") seems to trigger the following
> lockdep warning during test_prog's sockmap_listen:
>
> [ +0.003631] WARNING: possible circular locking dependency detected
> [ +0.003647] 6.0.0-dbx-DEV #10 Not tainted
> [ +0.002402] ------------------------------------------------------
> [ +0.003685] kworker/1:0/23 is trying to acquire lock:
> [ +0.003012] ffff888100b1e3f0 (sk_lock-AF_INET){+.+.}-{0:0}, at:
> tcp_sendpage+0x28/0x80
> [ +0.004655]
> but task is already holding lock:
> [ +0.003434] ffff88810642c360 (&psock->work_mutex){+.+.}-{3:3}, at:
> sk_psock_backlog+0x2e/0x370
> [ +0.005043]
> which lock already depends on the new lock.
>
> [ +0.004792]
> the existing dependency chain (in reverse order) is:
> [ +0.004397]
> -> #2 (&psock->work_mutex){+.+.}-{3:3}:
> [ +0.003732] __mutex_lock_common+0xdf/0xe70
> [ +0.002958] mutex_lock_nested+0x20/0x30
> [ +0.002685] sk_psock_backlog+0x2e/0x370
> [ +0.002689] process_one_work+0x22c/0x3b0
> [ +0.002815] worker_thread+0x21b/0x400
> [ +0.002652] kthread+0xf7/0x110
> [ +0.002406] ret_from_fork+0x1f/0x30
> [ +0.002512]
> -> #1 ((work_completion)(&psock->work)){+.+.}-{0:0}:
> [ +0.004457] __flush_work+0x6b/0xd0
> [ +0.002638] __cancel_work_timer+0x11a/0x1a0
> [ +0.002973] cancel_work_sync+0x10/0x20
> [ +0.002724] sk_psock_stop+0x298/0x2b0
> [ +0.002969] sock_map_close+0xd8/0x140
> [ +0.002739] inet_release+0x57/0x80
> [ +0.002475] sock_close+0x4b/0xe0
> [ +0.002380] __fput+0x101/0x230
> [ +0.002347] ____fput+0xe/0x10
> [ +0.002259] task_work_run+0x5d/0xb0
> [ +0.002535] exit_to_user_mode_loop+0xd6/0xf0
> [ +0.003019] exit_to_user_mode_prepare+0xa6/0x100
> [ +0.003201] syscall_exit_to_user_mode+0x5b/0x160
> [ +0.003145] do_syscall_64+0x49/0x80
> [ +0.002549] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [ +0.003410]
> -> #0 (sk_lock-AF_INET){+.+.}-{0:0}:
> [ +0.003906] __lock_acquire+0x16f4/0x30c0
> [ +0.002837] lock_acquire+0xc5/0x1c0
> [ +0.002599] lock_sock_nested+0x32/0x80
> [ +0.002690] tcp_sendpage+0x28/0x80
> [ +0.002435] inet_sendpage+0x7b/0xe0
> [ +0.002534] kernel_sendpage+0x5d/0xa0
> [ +0.002709] skb_send_sock+0x24b/0x2d0
> [ +0.002662] sk_psock_backlog+0x106/0x370
> [ +0.002908] process_one_work+0x22c/0x3b0
> [ +0.002736] worker_thread+0x21b/0x400
> [ +0.002552] kthread+0xf7/0x110
> [ +0.002252] ret_from_fork+0x1f/0x30
> [ +0.002480]
> other info that might help us debug this:
>
> [ +0.004778] Chain exists of:
> sk_lock-AF_INET --> (work_completion)(&psock->work) -->
> &psock->work_mutex
>
> [ +0.007265] Possible unsafe locking scenario:
>
> [ +0.003496] CPU0 CPU1
> [ +0.002717] ---- ----
> [ +0.002809] lock(&psock->work_mutex);
> [ +0.002335] lock((work_completion)(&psock->work));
> [ +0.004496] lock(&psock->work_mutex);
> [ +0.003766] lock(sk_lock-AF_INET);
> [ +0.002185]
> *** DEADLOCK ***
>
> [ +0.003600] 3 locks held by kworker/1:0/23:
> [ +0.002698] #0: ffff888100055138 ((wq_completion)events){+.+.}-{0:0}, at:
> process_one_work+0x1d6/0x3b0
> [ +0.005552] #1: ffffc900001e7e58
> ((work_completion)(&psock->work)){+.+.}-{0:0}, at: process_one_work+0x1fc/0x3b0
> [ +0.006085] #2: ffff88810642c360 (&psock->work_mutex){+.+.}-{3:3}, at:
> sk_psock_backlog+0x2e/0x370
> [ +0.005424]
> stack backtrace:
> [ +0.002689] CPU: 1 PID: 23 Comm: kworker/1:0 Not tainted 6.0.0-dbx-DEV #10
> [ +0.004086] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [ +0.006806] Workqueue: events sk_psock_backlog
> [ +0.002699] Call Trace:
> [ +0.001577] <TASK>
> [ +0.001350] dump_stack_lvl+0x69/0xaa
> [ +0.002225] dump_stack+0x10/0x12
> [ +0.002051] print_circular_bug+0x289/0x290
> [ +0.002531] check_noncircular+0x12c/0x140
> [ +0.002578] __lock_acquire+0x16f4/0x30c0
> [ +0.002483] ? ret_from_fork+0x1f/0x30
> [ +0.002297] ? __this_cpu_preempt_check+0x13/0x20
> [ +0.002869] ? lock_is_held_type+0xf8/0x160
> [ +0.002511] lock_acquire+0xc5/0x1c0
> [ +0.002165] ? tcp_sendpage+0x28/0x80
> [ +0.002367] lock_sock_nested+0x32/0x80
> [ +0.002401] ? tcp_sendpage+0x28/0x80
> [ +0.002262] tcp_sendpage+0x28/0x80
> [ +0.002148] inet_sendpage+0x7b/0x[ 12.231432] sysrq: Power Off
> e0
> [ +0.002202[ 12.234545] kvm: exiting hardware virtualization
> ] kernel_sendpage+0x5d/0xa0
> [ +0.002277] skb_send_sock+0x24b/0x2d0
> [ +0.002278] ? _raw_spin_unlock_irqrestore+0x35/0x60
> [ +0.003030] ? __this_cpu_preempt_check+0x13/0x20
> [ +0.002861] ? lockdep_hardirqs_on+0x97/0x140
> [ +0.002685] ? trace_hardirqs_on+0x47/0x50
> [ +0.002576] ? _raw_spin_unlock_irqrestore+0x40/0x60
> [ +0.003207] sk_psock_backlog+0x106/0x370
> [ +0.002476] process_one_work+0x22c/0x3b0
> [ +0.002473] worker_thread+0x21b/0x400
> [ +0.002335] kthread+0xf7/0x110
> [ +0.001954] ? rcu_lock_release+0x20/0x20
> [ +0.002444] ? kthread_blkcg+0x30/0x30
> [ +0.002325] ret_from_fork+0x1f/0x30
> [ +0.002221] </TASK>
>
> This is on bpf-next:
>
> commit d31ada3b511141f4b78cae5a05cc2dad887c40b7 (HEAD -> bpf-next,
> bpf-next/master)
> Author: David Vernet <void@manifault.com>
> Date: Tue Oct 11 11:52:55 2022 -0500
>
> selftests/bpf: Alphabetize DENYLISTs
>
> Are you ware? Any idea what's wrong?
> Is there some stable fix I'm missing in bpf-next?
fwiw, CI has been hitting it pretty often also:
https://github.com/kernel-patches/bpf/actions/runs/3238290042/jobs/5306551130#step:6:5522
Unless bpf-next is missing some fixes, this needs to be addressed.
next prev parent reply other threads:[~2022-10-13 16:46 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-12 21:08 Lockdep warning after c0feea594e058223973db94c1c32a830c9807c86 sdf
2022-10-13 16:45 ` Martin KaFai Lau [this message]
2022-10-13 20:39 ` Jakub Sitnicki
2022-10-16 18:11 ` Cong Wang
2022-10-24 9:36 ` Jakub Sitnicki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ac4831b9-44ef-29dc-bf9f-0d9adf503123@linux.dev \
--to=martin.lau@linux.dev \
--cc=bpf@vger.kernel.org \
--cc=jakub@cloudflare.com \
--cc=john.fastabend@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=sdf@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).