From: Maksim Kiselev <bigunclemax@gmail.com>
Cc: Justin Sanders <justin@coraid.com>, Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
Maksim Kiselev <bigunclemax@gmail.com>
Subject: [PATCH v1 0/1] aoe: possible interrupt unsafe locking scenario
Date: Wed, 24 Jan 2024 10:24:35 +0300 [thread overview]
Message-ID: <20240124072436.3745720-1-bigunclemax@gmail.com> (raw)
Hi all,
I'm using an AoE device on board with D1 SoC (riscv arch).
When I enabled CONFIG_PROVE_LOCKING option, I got the warning below.
After some investigations, I made a not very elegant, but working solution.
The patch sent with next message.
I would be glad to know what you think about this.
[ 3039.629033] aoe: 0050b6b92f30 e1.1 v4019 has 264192 sectors
[ 3039.637315]
[ 3039.638831] =====================================================
[ 3039.644921] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
[ 3039.651534] 6.7.0-next-20240119-00003-gc826d21919ac-dirty #7 Not tainted
[ 3039.658233] -----------------------------------------------------
[ 3039.664322] kworker/0:0/8 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[ 3039.670940] ffffffd803082a60 (&bdev->bd_size_lock){+.+.}-{2:2}, at: bdev_set_nr_sectors+0x20/0x44
[ 3039.679854]
[ 3039.679854] and this task is already holding:
[ 3039.685684] ffffffd803b44238 (&d->lock){..-.}-{2:2}, at: aoeblk_gdalloc+0xe8/0x322
[ 3039.693298] which would create a new lock dependency:
[ 3039.698349] (&d->lock){..-.}-{2:2} -> (&bdev->bd_size_lock){+.+.}-{2:2}
[ 3039.705089]
[ 3039.705089] but this new dependency connects a SOFTIRQ-irq-safe lock:
[ 3039.713002] (&d->lock){..-.}-{2:2}
[ 3039.713023]
[ 3039.713023] ... which became SOFTIRQ-irq-safe at:
[ 3039.722682] __lock_acquire+0x730/0x170c
[ 3039.726704] lock_acquire+0xb2/0x21a
[ 3039.730375] _raw_spin_lock_irqsave+0x3a/0x64
[ 3039.734833] aoecmd_cfg_rsp+0xc6/0x43a
[ 3039.738683] aoenet_rcv+0x19a/0x1c2
[ 3039.742267] __netif_receive_skb_list_core+0x13c/0x146
[ 3039.747510] netif_receive_skb_list_internal+0x19e/0x2ae
[ 3039.752920] napi_complete_done+0x4a/0x16a
[ 3039.757115] r8152_poll+0x20c/0x79c
[ 3039.760700] __napi_poll.constprop.0+0x26/0x19a
[ 3039.765328] net_rx_action+0x10e/0x226
[ 3039.769175] __do_softirq+0x180/0x3a4
[ 3039.772937] irq_exit_rcu+0x7e/0xa6
[ 3039.776524] handle_riscv_irq+0x40/0x4e
[ 3039.780466] call_on_irq_stack+0x1c/0x28
[ 3039.784486]
[ 3039.784486] to a SOFTIRQ-irq-unsafe lock:
[ 3039.789970] (&bdev->bd_size_lock){+.+.}-{2:2}
[ 3039.789992]
[ 3039.789992] ... which became SOFTIRQ-irq-unsafe at:
[ 3039.800777] ...
[ 3039.800783] __lock_acquire+0x3a2/0x170c
[ 3039.806548] lock_acquire+0xb2/0x21a
[ 3039.810221] _raw_spin_lock+0x2c/0x40
[ 3039.813980] bdev_set_nr_sectors+0x20/0x44
[ 3039.818173] set_capacity+0xe/0x16
[ 3039.821680] zram_add+0xdc/0x1fa
[ 3039.825006] zram_init+0xf8/0x124
[ 3039.828422] do_one_initcall+0x4e/0x1ca
[ 3039.832356] kernel_init_freeable+0x1be/0x218
[ 3039.836812] kernel_init+0x1e/0x10a
[ 3039.840405] ret_from_fork+0xe/0x1c
[ 3039.843991]
[ 3039.843991] other info that might help us debug this:
[ 3039.843991]
[ 3039.851990] Possible interrupt unsafe locking scenario:
[ 3039.851990]
[ 3039.858773] CPU0 CPU1
[ 3039.863302] ---- ----
[ 3039.867829] lock(&bdev->bd_size_lock);
[ 3039.871762] local_irq_disable();
[ 3039.877678] lock(&d->lock);
[ 3039.883172] lock(&bdev->bd_size_lock);
[ 3039.889619] <Interrupt>
[ 3039.892241] lock(&d->lock);
[ 3039.895393]
[ 3039.895393] *** DEADLOCK ***
[ 3039.895393]
[ 3039.901309] 3 locks held by kworker/0:0/8:
[ 3039.905409] #0: ffffffd802a95530 ((wq_completion)aoe_wq){+.+.}-{0:0}, at: process_one_work+0x110/0x3ce
[ 3039.914849] #1: ffffffc800043db8 ((work_completion)(&d->work)){+.+.}-{0:0}, at: process_one_work+0x110
/0x3ce
[ 3039.924799] #2: ffffffd803b44238 (&d->lock){..-.}-{2:2}, at: aoeblk_gdalloc+0xe8/0x322
[ 3039.932848]
[ 3039.932848] the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
[ 3039.941715] -> (&d->lock){..-.}-{2:2} {
[ 3039.945576] IN-SOFTIRQ-W at:
[ 3039.948724] __lock_acquire+0x730/0x170c
[ 3039.954307] lock_acquire+0xb2/0x21a
[ 3039.959538] _raw_spin_lock_irqsave+0x3a/0x64
[ 3039.965554] aoecmd_cfg_rsp+0xc6/0x43a
[ 3039.970966] aoenet_rcv+0x19a/0x1c2
[ 3039.976112] __netif_receive_skb_list_core+0x13c/0x146
[ 3039.982915] netif_receive_skb_list_internal+0x19e/0x2ae
[ 3039.989885] napi_complete_done+0x4a/0x16a
[ 3039.995641] r8152_poll+0x20c/0x79c
[ 3040.000788] __napi_poll.constprop.0+0x26/0x19a
[ 3040.006977] net_rx_action+0x10e/0x226
[ 3040.012384] __do_softirq+0x180/0x3a4
[ 3040.017706] irq_exit_rcu+0x7e/0xa6
[ 3040.022854] handle_riscv_irq+0x40/0x4e
[ 3040.028355] call_on_irq_stack+0x1c/0x28
[ 3040.033937] INITIAL USE at:
[ 3040.036999] __lock_acquire+0x3ae/0x170c
[ 3040.042492] lock_acquire+0xb2/0x21a
[ 3040.047638] _raw_spin_lock_irqsave+0x3a/0x64
[ 3040.053565] aoecmd_cfg_rsp+0xc6/0x43a
[ 3040.058893] aoenet_rcv+0x19a/0x1c2
[ 3040.063951] __netif_receive_skb_list_core+0x13c/0x146
[ 3040.070663] netif_receive_skb_list_internal+0x19e/0x2ae
[ 3040.077547] napi_complete_done+0x4a/0x16a
[ 3040.083215] r8152_poll+0x20c/0x79c
[ 3040.088273] __napi_poll.constprop.0+0x26/0x19a
[ 3040.094376] net_rx_action+0x10e/0x226
[ 3040.099696] __do_softirq+0x180/0x3a4
[ 3040.104931] irq_exit_rcu+0x7e/0xa6
[ 3040.109992] handle_riscv_irq+0x40/0x4e
[ 3040.115402] call_on_irq_stack+0x1c/0x28
[ 3040.120896] }
[ 3040.122565] ... key at: [<ffffffff822deea8>] __key.2+0x0/0x10
[ 3040.128848]
[ 3040.128848] the dependencies between the lock to be acquired
[ 3040.128856] and SOFTIRQ-irq-unsafe lock:
[ 3040.139994] -> (&bdev->bd_size_lock){+.+.}-{2:2} {
[ 3040.144811] HARDIRQ-ON-W at:
[ 3040.147960] __lock_acquire+0x390/0x170c
[ 3040.153542] lock_acquire+0xb2/0x21a
[ 3040.158774] _raw_spin_lock+0x2c/0x40
[ 3040.164093] bdev_set_nr_sectors+0x20/0x44
[ 3040.169848] set_capacity+0xe/0x16
[ 3040.174912] zram_add+0xdc/0x1fa
[ 3040.179798] zram_init+0xf8/0x124
[ 3040.184775] do_one_initcall+0x4e/0x1ca
[ 3040.190269] kernel_init_freeable+0x1be/0x218
[ 3040.196288] kernel_init+0x1e/0x10a
[ 3040.201439] ret_from_fork+0xe/0x1c
[ 3040.206587] SOFTIRQ-ON-W at:
[ 3040.209735] __lock_acquire+0x3a2/0x170c
[ 3040.215316] lock_acquire+0xb2/0x21a
[ 3040.220546] _raw_spin_lock+0x2c/0x40
[ 3040.225866] bdev_set_nr_sectors+0x20/0x44
[ 3040.231618] set_capacity+0xe/0x16
[ 3040.236678] zram_add+0xdc/0x1fa
[ 3040.241563] zram_init+0xf8/0x124
[ 3040.246536] do_one_initcall+0x4e/0x1ca
[ 3040.252028] kernel_init_freeable+0x1be/0x218
[ 3040.258044] kernel_init+0x1e/0x10a
[ 3040.263193] ret_from_fork+0xe/0x1c
[ 3040.268341] INITIAL USE at:
[ 3040.271402] __lock_acquire+0x3ae/0x170c
[ 3040.276896] lock_acquire+0xb2/0x21a
[ 3040.282041] _raw_spin_lock+0x2c/0x40
[ 3040.287273] bdev_set_nr_sectors+0x20/0x44
[ 3040.292939] set_capacity+0xe/0x16
[ 3040.297913] zram_add+0xdc/0x1fa
[ 3040.302712] zram_init+0xf8/0x124
[ 3040.307600] do_one_initcall+0x4e/0x1ca
[ 3040.313005] kernel_init_freeable+0x1be/0x218
[ 3040.318936] kernel_init+0x1e/0x10a
[ 3040.323998] ret_from_fork+0xe/0x1c
[ 3040.329059] }
[ 3040.330728] ... key at: [<ffffffff822d4d18>] __key.5+0x0/0x10
[ 3040.337010] ... acquired at:
[ 3040.339980] check_prev_add+0xf8/0xb40
[ 3040.343913] __lock_acquire+0xd0e/0x170c
[ 3040.348017] lock_acquire+0xb2/0x21a
[ 3040.351775] _raw_spin_lock+0x2c/0x40
[ 3040.355620] bdev_set_nr_sectors+0x20/0x44
[ 3040.359899] set_capacity+0xe/0x16
[ 3040.363486] +0x180/0x322
[ 3040.367514] aoecmd_sleepwork+0x2e/0x7c
[ 3040.371536] process_one_work+0x176/0x3ce
[ 3040.375740] worker_thread+0x19a/0x348
[ 3040.379674] kthread+0xd0/0xec
[ 3040.382918] ret_from_fork+0xe/0x1c
[ 3040.386596]
[ 3040.388091]
[ 3040.388091] stack backtrace:
[ 3040.392449] CPU: 0 PID: 8 Comm: kworker/0:0 Not tainted 6.7.0-next-20240119-00003-gc826d21919ac-dirty #7
[ 3040.401929] Hardware name: Sipeed Lichee RV Dock (DT)
[ 3040.406983] Workqueue: aoe_wq aoecmd_sleepwork
[ 3040.411450] Call Trace:
[ 3040.413901] [<ffffffff800067b2>] dump_backtrace+0x1c/0x24
[ 3040.419316] [<ffffffff809d176a>] show_stack+0x2c/0x38
[ 3040.424382] [<ffffffff809de7c2>] dump_stack_lvl+0x2e/0x48
[ 3040.429798] [<ffffffff809de7f0>] dump_stack+0x14/0x1c
[ 3040.434862] [<ffffffff800511fc>] check_irq_usage+0x786/0x798
[ 3040.440531] [<ffffffff80051306>] check_prev_add+0xf8/0xb40
[ 3040.446026] [<ffffffff80053b22>] __lock_acquire+0xd0e/0x170c
[ 3040.451693] [<ffffffff80054cc4>] lock_acquire+0xb2/0x21a
[ 3040.457013] [<ffffffff809e91be>] _raw_spin_lock+0x2c/0x40
[ 3040.462424] [<ffffffff8032468c>] bdev_set_nr_sectors+0x20/0x44
[ 3040.468268] [<ffffffff8033ff0c>] set_capacity+0xe/0x16
[ 3040.473422] [<ffffffff805d32e2>] aoeblk_gdalloc+0x180/0x322
[ 3040.479013] [<ffffffff805d5208>] aoecmd_sleepwork+0x2e/0x7c
[ 3040.484599] [<ffffffff8002b612>] process_one_work+0x176/0x3ce
[ 3040.490361] [<ffffffff8002ba04>] worker_thread+0x19a/0x348
[ 3040.495857] [<ffffffff80033d08>] kthread+0xd0/0xec
[ 3040.500661] [<ffffffff809e9ebe>] ret_from_fork+0xe/0x1c
[ 3040.538247] etherd/e1.1: p1
Maksim Kiselev (1):
aoe: avoid potential deadlock at set_capacity
drivers/block/aoe/aoeblk.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--
2.40.1
next reply other threads:[~2024-01-24 7:24 UTC|newest] Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top 2024-01-24 7:24 Maksim Kiselev [this message] 2024-01-24 7:24 ` [PATCH v1 1/1] aoe: avoid potential deadlock at set_capacity Maksim Kiselev 2024-01-24 9:18 ` Christoph Hellwig 2024-01-24 10:23 ` Maxim Kiselev 2024-01-24 15:30 ` [PATCH v1 0/1] aoe: possible interrupt unsafe locking scenario Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240124072436.3745720-1-bigunclemax@gmail.com \
--to=bigunclemax@gmail.com \
--cc=axboe@kernel.dk \
--cc=justin@coraid.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.