* kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
@ 2023-01-16 14:59 Steffen Maier
2023-01-16 16:57 ` Martin Wilck
2023-01-16 17:55 ` Bart Van Assche
0 siblings, 2 replies; 18+ messages in thread
From: Steffen Maier @ 2023-01-16 14:59 UTC (permalink / raw)
To: linux-scsi, Bart Van Assche
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Martin Wilck, Benjamin Block, linux-s390
Hi all,
For the past few days/weeks, we have occasionally seen the ALUA- and sleep-related kernel
BUG and WARNING below (with panic_on_warn) in our CI.
It reminds me of
[PATCH 0/2] Rework how the ALUA driver calls scsi_device_put()
https://lore.kernel.org/linux-scsi/166986602290.2101055.17397734326843853911.b4-ty@oracle.com/
which I thought was the fix and went into 6.2-rc(1?) on 2022-12-14 with
[GIT PULL] first round of SCSI updates for the 6.1+ merge window
https://lore.kernel.org/linux-scsi/b2e824bbd1e40da64d2d01657f2f7a67b98919fb.camel@HansenPartnership.com/T/#u
Due to limited history, I cannot tell exactly when the problems started or whether
they really correlate with the above.
The test workload consists of all kinds of coverage tests for zfcp recovery, including SCSI
device removal and/or rescan.
[ 4569.045992] BUG: sleeping function called from invalid context at drivers/scsi/device_handler/scsi_dh_alua.c:992
[ 4569.046003] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/8
[ 4569.046013] preempt_count: 101, expected: 0
[ 4569.046023] RCU nest depth: 0, expected: 0
[ 4569.046033] no locks held by swapper/8/0.
[ 4569.046042] Preemption disabled at:
[ 4569.046046] [<000000017e27ce4e>] __slab_alloc.constprop.0+0x36/0xb8
[ 4569.046072] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G W 6.2.0-20230114.rc3.git0.46e26dd43df0.300.fc37.s390x+debug #1
[ 4569.046084] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
[ 4569.046094] Call Trace:
[ 4569.046102] [<000000017ed21bcc>] dump_stack_lvl+0xac/0x100
[ 4569.046118] [<000000017df9192c>] __might_resched+0x284/0x2c8
[ 4569.046131] [<000003ff7fb9c874>] alua_rtpg_queue+0x3c/0x98 [scsi_dh_alua]
[ 4569.046146] [<000003ff7fb9cfb2>] alua_check+0x122/0x250 [scsi_dh_alua]
[ 4569.046167] [<000003ff7fb9d562>] alua_check_sense+0x172/0x228 [scsi_dh_alua]
[ 4569.046179] [<000000017e96b3e2>] scsi_check_sense+0x8a/0x2e0
[ 4569.046191] [<000000017e96e4b6>] scsi_decide_disposition+0x286/0x298
[ 4569.046201] [<000000017e972bca>] scsi_complete+0x6a/0x108
[ 4569.046212] [<000000017e746906>] blk_complete_reqs+0x6e/0x88
[ 4569.046227] [<000000017ed3830e>] __do_softirq+0x13e/0x6b8
[ 4569.046238] [<000000017df57902>] __irq_exit_rcu+0x14a/0x170
[ 4569.046264] [<000000017df58472>] irq_exit_rcu+0x22/0x50
[ 4569.046275] [<000000017ed2242a>] do_ext_irq+0x10a/0x1d0
[ 4569.046286] [<000000017ed36156>] ext_int_handler+0xd6/0x110
[ 4569.046296] [<000000017ed362e6>] psw_idle_exit+0x0/0xa
[ 4569.046307] ([<000000017defc5da>] arch_cpu_idle+0x52/0xe0)
[ 4569.046318] [<000000017ed34744>] default_idle_call+0x84/0xd0
[ 4569.046329] [<000000017dfbe4cc>] do_idle+0xfc/0x1b8
[ 4569.046340] [<000000017dfbe80e>] cpu_startup_entry+0x36/0x40
[ 4569.046350] [<000000017df11964>] smp_start_secondary+0x14c/0x160
[ 4569.046371] [<000000017ed3658e>] restart_int_handler+0x6e/0x90
[ 4569.046381] no locks held by swapper/8/0.
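As a reading aid for the `preempt_count: 101` line in the report above: the value appears to be printed in hexadecimal, and the low fields of the counter encode why the context is atomic. The following is a minimal userspace sketch, not kernel code. The field widths (8 bits preempt depth, 8 bits softirq, 4 bits hardirq) are an assumption based on the layout described in include/linux/preempt.h and may differ between kernel versions; the names `preempt_fields`, `decode_preempt_count()` and `model_in_atomic()` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed layout (see include/linux/preempt.h; treat as an assumption):
 *   bits  0..7   preempt_disable() nesting depth
 *   bits  8..15  softirq field
 *   bits 16..19  hardirq nesting
 * All names below are hypothetical and exist only for this sketch.
 */
struct preempt_fields {
	unsigned int preempt;
	unsigned int softirq;
	unsigned int hardirq;
};

static struct preempt_fields decode_preempt_count(uint32_t pc)
{
	struct preempt_fields f;

	f.preempt = pc & 0xffu;
	f.softirq = (pc >> 8) & 0xffu;
	f.hardirq = (pc >> 16) & 0xfu;
	return f;
}

/* Models in_atomic(): any nonzero count means sleeping is not allowed. */
static int model_in_atomic(uint32_t pc)
{
	return pc != 0;
}

/* Convenience check used to keep callers honest about the input range. */
static int fits_modelled_fields(uint32_t pc)
{
	assert((pc >> 20) == 0 || (pc >> 20) != 0); /* upper bits are not modelled */
	return (pc >> 20) == 0;
}
```

Under these assumptions, decoding 0x101 yields a preempt depth of 1 and a softirq field of 1, which is consistent with the `__do_softirq` frame in the call trace and with `in_atomic(): 1`.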
The above occurs a few times until it finally ends with:
[ 4760.865496] device-mapper: multipath: 251:6: Reinstating path 8:176.
[ 4760.867398] sd 4:0:0:1083719810: Power-on or device reset occurred
[ 4760.867445] sd 4:0:0:1083719810: [sde] tag#1224 Done: ADD_TO_MLQUEUE Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 4760.867469] sd 4:0:0:1083719810: [sde] tag#1224 CDB: Test Unit Ready 00 00 00 00 00 00
[ 4760.867493] sd 4:0:0:1083719810: [sde] tag#1224 Sense Key : Unit Attention [current]
[ 4760.867515] sd 4:0:0:1083719810: [sde] tag#1224 Add. Sense: Power on, reset, or bus device reset occurred
[ 4760.878066] sd 4:0:0:1083719813: Power-on or device reset occurred
[ 4760.878096] ------------[ cut here ]------------
[ 4760.878107] do not call blocking ops when !TASK_RUNNING; state=2 set at [<000000017ed2c0fa>] __wait_for_common+0xa2/0x240
[ 4760.878132] WARNING: CPU: 3 PID: 165738 at kernel/sched/core.c:9908 __might_sleep+0x7c/0x98
[ 4760.878147] Modules linked in: af_iucv kvm algif_hash af_alg nft_fib_inet
nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6
nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
nf_defrag_ipv4 ip_set nf_tables nfnetlink sunrpc vfio_ccw mdev vfio_iommu_type1
vfio sch_fq_codel ip6_tables ip_tables x_tables configfs dm_service_time
ghash_s390 prng chacha_s390 libchacha aes_s390 des_s390 libdes sha512_s390
sha256_s390 sha1_s390 sha_common zfcp scsi_transport_fc dm_mirror
dm_region_hash dm_log scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkey zcrypt
rng_core dm_multipath autofs4
[ 4760.878456] CPU: 3 PID: 165738 Comm: kworker/3:0 Tainted: G W 6.2.0-20230114.rc3.git0.46e26dd43df0.300.fc37.s390x+debug #1
[ 4760.878478] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
[ 4760.878489] Workqueue: kaluad alua_rtpg_work [scsi_dh_alua]
[ 4760.878509] Krnl PSW : 0704d00180000000 000000017df919f0 (__might_sleep+0x80/0x98)
[ 4760.878542]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[ 4760.878560] Krnl GPRS: c0000000ffffbfff 0000000080000101 000000000000006d 000000017f198e94
[ 4760.878573]            00000380002739f8 00000380002739f0 0000000000000000 000000017f7bca48
[ 4760.878586]            0000000000000001 0000000000000000 00000000000003e0 000003ff7fb9f1bc
[ 4760.878599]            00000000827eb100 0000000000000101 000000017df919ec 0000038000273b88
[ 4760.878620] Krnl Code: 000000017df919e0: c020008c1da3 larl %r2,000000017f115526
                          000000017df919e6: c0e5006bb91d brasl %r14,000000017ed08c20
                         #000000017df919ec: af000000 mc 0,0
                         >000000017df919f0: a7490000 lghi %r4,0
                          000000017df919f4: b904003a lgr %r3,%r10
                          000000017df919f8: b904002b lgr %r2,%r11
                          000000017df919fc: ebaff0a00004 lmg %r10,%r15,160(%r15)
                          000000017df91a02: c0f4fffffe53 brcl 15,000000017df916a8
[ 4760.878692] Call Trace:
[ 4760.878703] [<000000017df919f0>] __might_sleep+0x80/0x98
[ 4760.878716] ([<000000017df919ec>] __might_sleep+0x7c/0x98)
[ 4760.878728] [<000003ff7fb9c874>] alua_rtpg_queue+0x3c/0x98 [scsi_dh_alua]
[ 4760.878743] [<000003ff7fb9cfb2>] alua_check+0x122/0x250 [scsi_dh_alua]
[ 4760.878761] [<000003ff7fb9d562>] alua_check_sense+0x172/0x228 [scsi_dh_alua]
[ 4760.878775] [<000000017e96b3e2>] scsi_check_sense+0x8a/0x2e0
[ 4760.878788] [<000000017e96e4b6>] scsi_decide_disposition+0x286/0x298
[ 4760.878802] [<000000017e972bca>] scsi_complete+0x6a/0x108
[ 4760.878815] [<000000017e746906>] blk_complete_reqs+0x6e/0x88
[ 4760.878837] [<000000017ed3830e>] __do_softirq+0x13e/0x6b8
[ 4760.878852] [<000000017df57902>] __irq_exit_rcu+0x14a/0x170
[ 4760.878866] [<000000017df58472>] irq_exit_rcu+0x22/0x50
[ 4760.878880] [<000000017ed223da>] do_ext_irq+0xba/0x1d0
[ 4760.878896] [<000000017ed36156>] ext_int_handler+0xd6/0x110
[ 4760.878909] [<000000017ed34fbe>] _raw_spin_unlock_irqrestore+0x86/0xc0
[ 4760.878928] ([<000000017ed34fae>] _raw_spin_unlock_irqrestore+0x76/0xc0)
[ 4760.878941] [<000000017e033e66>] __mod_timer+0x2d6/0x408
[ 4760.878955] [<000000017ed33864>] schedule_timeout+0xc4/0x168
[ 4760.878969] [<000000017ed2ac62>] io_schedule_timeout+0x5a/0x80
[ 4760.878983] [<000000017ed2c12e>] __wait_for_common+0xd6/0x240
[ 4760.878997] [<000000017e7479a6>] blk_execute_rq+0x126/0x1f8
[ 4760.879011] [<000000017e970722>] __scsi_execute+0x112/0x260
[ 4760.879024] [<000003ff7fb9d750>] alua_rtpg+0x138/0xb10 [scsi_dh_alua]
[ 4760.879038] [<000003ff7fb9e3e4>] alua_rtpg_work+0x2bc/0x4e0 [scsi_dh_alua]
[ 4760.879053] [<000000017df78300>] process_one_work+0x310/0x730
[ 4760.879069] [<000000017df78782>] worker_thread+0x62/0x420
[ 4760.879109] [<000000017df83bc4>] kthread+0x13c/0x150
[ 4760.879124] [<000000017defb930>] __ret_from_fork+0x40/0x58
[ 4760.879138] [<000000017ed35eda>] ret_from_fork+0xa/0x40
[ 4760.879152] 2 locks held by kworker/3:0/165738:
[ 4760.879165] #0: 000000008c7b5948 ((wq_completion)kaluad){+.+.}-{0:0}, at: process_one_work+0x232/0x730
[ 4760.879210] #1: 0000038001177dc8 ((work_completion)(&(&pg->rtpg_work)->work)){+.+.}-{0:0}, at: process_one_work+0x232/0x730
[ 4760.879249] Last Breaking-Event-Address:
[ 4760.879266] [<000000017e8c6dd0>] __s390_indirect_jump_r14+0x0/0x10
[ 4760.879283] Kernel panic - not syncing: kernel: panic_on_warn set ...
--
Kind regards
Steffen Maier
Linux on IBM Z and LinuxONE
https://www.ibm.com/privacy/us/en/
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Gregor Pillen
Management: David Faller
Registered office: Boeblingen
Court of registration: Amtsgericht Stuttgart, HRB 243294
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-16 14:59 kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING Steffen Maier
@ 2023-01-16 16:57 ` Martin Wilck
2023-01-16 17:48 ` Bart Van Assche
2023-01-16 17:55 ` Bart Van Assche
1 sibling, 1 reply; 18+ messages in thread
From: Martin Wilck @ 2023-01-16 16:57 UTC (permalink / raw)
To: Steffen Maier, linux-scsi, Bart Van Assche
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
On Mon, 2023-01-16 at 15:59 +0100, Steffen Maier wrote:
> Hi all,
>
> since a few days/weeks, we sometimes see below alua and sleep related
> kernel
> BUG and WARNING (with panic_on_warn) in our CI.
>
> It reminds me of
> [PATCH 0/2] Rework how the ALUA driver calls scsi_device_put()
> https://lore.kernel.org/linux-scsi/166986602290.2101055.17397734326843853911.b4-ty@oracle.com/
>
> which I thought was the fix and went into 6.2-rc(1?) on 2022-12-14
> with
> [GIT PULL] first round of SCSI updates for the 6.1+ merge window
> https://lore.kernel.org/linux-scsi/b2e824bbd1e40da64d2d01657f2f7a67b98919fb.camel@HansenPartnership.com/T/#u
>
That was the fix for the code path alua_check_vpd()->alua_rtpg_queue()
->scsi_device_put(), where alua_rtpg_queue() was called while obviously
holding a lock. The call chain in your case is different.
But AFAICS Bart's original fix for the BUG, "scsi: alua: Fix alua_rtpg_queue()"
(https://lore.kernel.org/linux-scsi/20221115224903.2325529-1-bvanassche@acm.org/)
would also not solve your issue, because it simply moves the scsi_device_put()
to the caller, alua_check(), which can't sleep, either.
[ 4569.046131] [<000003ff7fb9c874>] alua_rtpg_queue+0x3c/0x98 [scsi_dh_alua]
[ 4569.046146] [<000003ff7fb9cfb2>] alua_check+0x122/0x250 [scsi_dh_alua]
[ 4569.046167] [<000003ff7fb9d562>] alua_check_sense+0x172/0x228 [scsi_dh_alua]
[ 4569.046179] [<000000017e96b3e2>] scsi_check_sense+0x8a/0x2e0
[ 4569.046191] [<000000017e96e4b6>] scsi_decide_disposition+0x286/0x298
[ 4569.046201] [<000000017e972bca>] scsi_complete+0x6a/0x108
[ 4569.046212] [<000000017e746906>] blk_complete_reqs+0x6e/0x88
[ 4569.046227] [<000000017ed3830e>] __do_softirq+0x13e/0x6b8
AFAICS, it comes down to the fact that the assertion in the commit message of
f93ed747e2c7 ("scsi: core: Release SCSI devices synchronously"):
"All upstream scsi_device_put() calls happen from thread context."
turns out to be false for the alua code.
Can we simply defer the scsi_device_put() to a workqueue?
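To illustrate what deferring the put would mean, here is a minimal userspace sketch of the pattern, not a patch. The atomic-context caller only records that a put is owed, and a worker running in a context that may sleep performs the real put later. All names (`model_dev`, `model_put_deferred()`, `model_run_deferred()`) are hypothetical, the fixed-size array stands in for a real workqueue, and no locking is modelled. The sketch does not address any ordering against module removal.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 8

/* Hypothetical stand-in for a reference-counted device. */
struct model_dev {
	int refcount;
	int released;
};

static struct model_dev *deferred[MAX_DEFERRED];
static size_t n_deferred;

/* The real put: dropping the last reference runs a release step that,
 * in the scenario discussed here, may sleep. */
static void model_put(struct model_dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		d->released = 1;
}

/* Called from atomic context: only records the put.
 * Returns 0 on success, -1 if the queue is full. */
static int model_put_deferred(struct model_dev *d)
{
	if (n_deferred == MAX_DEFERRED)
		return -1;
	deferred[n_deferred++] = d;
	return 0;
}

/* Called from a worker that may sleep: performs all recorded puts
 * and returns how many were processed. */
static int model_run_deferred(void)
{
	int done = 0;

	for (size_t i = 0; i < n_deferred; i++) {
		model_put(deferred[i]);
		done++;
	}
	n_deferred = 0;
	return done;
}
```

The key property is that the reference count stays unchanged until the worker runs, so the object outlives the atomic section by construction.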
Regards,
Martin
> Due to limited history, I cannot tell exactly when problems started
> and whether
> it really correlates to above.
>
> Test workload are all kinds of coverage tests for zfcp recovery
> including scsi
> device removal and/or rescan.
>
> [ 4569.045992] BUG: sleeping function called from invalid context at
> drivers/scsi/device_handler/scsi_dh_alua.c:992
> [ 4569.046003] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid:
> 0, name:
> swapper/8
> [ 4569.046013] preempt_count: 101, expected: 0
> [ 4569.046023] RCU nest depth: 0, expected: 0
> [ 4569.046033] no locks held by swapper/8/0.
> [ 4569.046042] Preemption disabled at:
> [ 4569.046046] [<000000017e27ce4e>]
> __slab_alloc.constprop.0+0x36/0xb8
> [ 4569.046072] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G W
> 6.2.0-20230114.rc3.git0.46e26dd43df0.300.fc37.s390x+debug #1
> [ 4569.046084] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
> [ 4569.046094] Call Trace:
> [ 4569.046102] [<000000017ed21bcc>] dump_stack_lvl+0xac/0x100
> [ 4569.046118] [<000000017df9192c>] __might_resched+0x284/0x2c8
> [ 4569.046131] [<000003ff7fb9c874>] alua_rtpg_queue+0x3c/0x98
> [scsi_dh_alua]
> [ 4569.046146] [<000003ff7fb9cfb2>] alua_check+0x122/0x250
> [scsi_dh_alua]
> [ 4569.046167] [<000003ff7fb9d562>] alua_check_sense+0x172/0x228
> [scsi_dh_alua]
> [ 4569.046179] [<000000017e96b3e2>] scsi_check_sense+0x8a/0x2e0
> [ 4569.046191] [<000000017e96e4b6>]
> scsi_decide_disposition+0x286/0x298
> [ 4569.046201] [<000000017e972bca>] scsi_complete+0x6a/0x108
> [ 4569.046212] [<000000017e746906>] blk_complete_reqs+0x6e/0x88
> [ 4569.046227] [<000000017ed3830e>] __do_softirq+0x13e/0x6b8
> [ 4569.046238] [<000000017df57902>] __irq_exit_rcu+0x14a/0x170
> [ 4569.046264] [<000000017df58472>] irq_exit_rcu+0x22/0x50
> [ 4569.046275] [<000000017ed2242a>] do_ext_irq+0x10a/0x1d0
> [ 4569.046286] [<000000017ed36156>] ext_int_handler+0xd6/0x110
> [ 4569.046296] [<000000017ed362e6>] psw_idle_exit+0x0/0xa
> [ 4569.046307] ([<000000017defc5da>] arch_cpu_idle+0x52/0xe0)
> [ 4569.046318] [<000000017ed34744>] default_idle_call+0x84/0xd0
> [ 4569.046329] [<000000017dfbe4cc>] do_idle+0xfc/0x1b8
> [ 4569.046340] [<000000017dfbe80e>] cpu_startup_entry+0x36/0x40
> [ 4569.046350] [<000000017df11964>] smp_start_secondary+0x14c/0x160
> [ 4569.046371] [<000000017ed3658e>] restart_int_handler+0x6e/0x90
> [ 4569.046381] no locks held by swapper/8/0.
>
> Above occurs a few times until it finally ends with:
>
> [ 4760.865496] device-mapper: multipath: 251:6: Reinstating path
> 8:176.
> [ 4760.867398] sd 4:0:0:1083719810: Power-on or device reset occurred
> [ 4760.867445] sd 4:0:0:1083719810: [sde] tag#1224 Done:
> ADD_TO_MLQUEUE Result:
> hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
> [ 4760.867469] sd 4:0:0:1083719810: [sde] tag#1224 CDB: Test Unit
> Ready 00 00
> 00 00 00 00
> [ 4760.867493] sd 4:0:0:1083719810: [sde] tag#1224 Sense Key : Unit
> Attention
> [current]
> [ 4760.867515] sd 4:0:0:1083719810: [sde] tag#1224 Add. Sense: Power
> on, reset,
> or bus device reset occurred
> [ 4760.878066] sd 4:0:0:1083719813: Power-on or device reset occurred
> [ 4760.878096] ------------[ cut here ]------------
> [ 4760.878107] do not call blocking ops when !TASK_RUNNING; state=2
> set at
> [<000000017ed2c0fa>] __wait_for_common+0xa2/0x240
> [ 4760.878132] WARNING: CPU: 3 PID: 165738 at
> kernel/sched/core.c:9908
> __might_sleep+0x7c/0x98
> [ 4760.878147] Modules linked in: af_iucv kvm algif_hash af_alg
> nft_fib_inet
> nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> nf_reject_ipv6
> nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
> nf_defrag_ipv4 ip_set nf_tables nfnetlink sunrpc vfio_ccw mdev
> vfio_iommu_type1
> vfio sch_fq_codel ip6_tables ip_tables x_tables configfs
> dm_service_time
> ghash_s390 prng chacha_s390 libchacha aes_s390 des_s390 libdes
> sha512_s390
> sha256_s390 sha1_s390 sha_common zfcp scsi_transport_fc dm_mirror
> dm_region_hash dm_log scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkey
> zcrypt
> rng_core dm_multipath autofs4
> [ 4760.878456] CPU: 3 PID: 165738 Comm: kworker/3:0 Tainted: G
> W
> 6.2.0-20230114.rc3.git0.46e26dd43df0.300.fc37.s390x+debug #1
> [ 4760.878478] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
> [ 4760.878489] Workqueue: kaluad alua_rtpg_work [scsi_dh_alua]
> [ 4760.878509] Krnl PSW : 0704d00180000000 000000017df919f0
> (__might_sleep+0x80/0x98)
> [ 4760.878542] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3
> CC:1 PM:0
> RI:0 EA:3
> [ 4760.878560] Krnl GPRS: c0000000ffffbfff 0000000080000101
> 000000000000006d
> 000000017f198e94
> [ 4760.878573] 00000380002739f8 00000380002739f0
> 0000000000000000
> 000000017f7bca48
> [ 4760.878586] 0000000000000001 0000000000000000
> 00000000000003e0
> 000003ff7fb9f1bc
> [ 4760.878599] 00000000827eb100 0000000000000101
> 000000017df919ec
> 0000038000273b88
> [ 4760.878620] Krnl Code: 000000017df919e0:
> c020008c1da3 larl %r2,000000017f115526
> 000000017df919e6: c0e5006bb91d brasl
> %r14,000000017ed08c20
> #000000017df919ec:
> af000000 mc 0,0
> >000000017df919f0:
> a7490000 lghi %r4,0
> 000000017df919f4:
> b904003a lgr %r3,%r10
> 000000017df919f8:
> b904002b lgr %r2,%r11
> 000000017df919fc:
> ebaff0a00004 lmg %r10,%r15,160(%r15)
> 000000017df91a02:
> c0f4fffffe53 brcl 15,000000017df916a8
> [ 4760.878692] Call Trace:
> [ 4760.878703] [<000000017df919f0>] __might_sleep+0x80/0x98
> [ 4760.878716] ([<000000017df919ec>] __might_sleep+0x7c/0x98)
> [ 4760.878728] [<000003ff7fb9c874>] alua_rtpg_queue+0x3c/0x98
> [scsi_dh_alua]
> [ 4760.878743] [<000003ff7fb9cfb2>] alua_check+0x122/0x250
> [scsi_dh_alua]
> [ 4760.878761] [<000003ff7fb9d562>] alua_check_sense+0x172/0x228
> [scsi_dh_alua]
> [ 4760.878775] [<000000017e96b3e2>] scsi_check_sense+0x8a/0x2e0
> [ 4760.878788] [<000000017e96e4b6>]
> scsi_decide_disposition+0x286/0x298
> [ 4760.878802] [<000000017e972bca>] scsi_complete+0x6a/0x108
> [ 4760.878815] [<000000017e746906>] blk_complete_reqs+0x6e/0x88
> [ 4760.878837] [<000000017ed3830e>] __do_softirq+0x13e/0x6b8
> [ 4760.878852] [<000000017df57902>] __irq_exit_rcu+0x14a/0x170
> [ 4760.878866] [<000000017df58472>] irq_exit_rcu+0x22/0x50
> [ 4760.878880] [<000000017ed223da>] do_ext_irq+0xba/0x1d0
> [ 4760.878896] [<000000017ed36156>] ext_int_handler+0xd6/0x110
> [ 4760.878909] [<000000017ed34fbe>]
> _raw_spin_unlock_irqrestore+0x86/0xc0
> [ 4760.878928] ([<000000017ed34fae>]
> _raw_spin_unlock_irqrestore+0x76/0xc0)
> [ 4760.878941] [<000000017e033e66>] __mod_timer+0x2d6/0x408
> [ 4760.878955] [<000000017ed33864>] schedule_timeout+0xc4/0x168
> [ 4760.878969] [<000000017ed2ac62>] io_schedule_timeout+0x5a/0x80
> [ 4760.878983] [<000000017ed2c12e>] __wait_for_common+0xd6/0x240
> [ 4760.878997] [<000000017e7479a6>] blk_execute_rq+0x126/0x1f8
> [ 4760.879011] [<000000017e970722>] __scsi_execute+0x112/0x260
> [ 4760.879024] [<000003ff7fb9d750>] alua_rtpg+0x138/0xb10
> [scsi_dh_alua]
> [ 4760.879038] [<000003ff7fb9e3e4>] alua_rtpg_work+0x2bc/0x4e0
> [scsi_dh_alua]
> [ 4760.879053] [<000000017df78300>] process_one_work+0x310/0x730
> [ 4760.879069] [<000000017df78782>] worker_thread+0x62/0x420
> [ 4760.879109] [<000000017df83bc4>] kthread+0x13c/0x150
> [ 4760.879124] [<000000017defb930>] __ret_from_fork+0x40/0x58
> [ 4760.879138] [<000000017ed35eda>] ret_from_fork+0xa/0x40
> [ 4760.879152] 2 locks held by kworker/3:0/165738:
> [ 4760.879165] #0: 000000008c7b5948 ((wq_completion)kaluad){+.+.}-
> {0:0}, at:
> process_one_work+0x232/0x730
> [ 4760.879210] #1: 0000038001177dc8
> ((work_completion)(&(&pg->rtpg_work)->work)){+.+.}-{0:0}, at:
> process_one_work+0x232/0x730
> [ 4760.879249] Last Breaking-Event-Address:
> [ 4760.879266] [<000000017e8c6dd0>]
> __s390_indirect_jump_r14+0x0/0x10
> [ 4760.879283] Kernel panic - not syncing: kernel: panic_on_warn set
> ...
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-16 16:57 ` Martin Wilck
@ 2023-01-16 17:48 ` Bart Van Assche
2023-01-16 17:58 ` Martin Wilck
2023-01-17 9:28 ` Martin Wilck
0 siblings, 2 replies; 18+ messages in thread
From: Bart Van Assche @ 2023-01-16 17:48 UTC (permalink / raw)
To: Martin Wilck, Steffen Maier, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
On 1/16/23 08:57, Martin Wilck wrote:
> Can we simply defer the scsi_device_put() to a workqueue?
I'm concerned that would reintroduce a race condition when LLD kernel
modules are removed.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-16 14:59 kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING Steffen Maier
2023-01-16 16:57 ` Martin Wilck
@ 2023-01-16 17:55 ` Bart Van Assche
2023-01-16 18:12 ` Steffen Maier
2023-01-17 7:46 ` Martin Wilck
1 sibling, 2 replies; 18+ messages in thread
From: Bart Van Assche @ 2023-01-16 17:55 UTC (permalink / raw)
To: Steffen Maier, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Martin Wilck, Benjamin Block, linux-s390
On 1/16/23 06:59, Steffen Maier wrote:
> Hi all,
>
> since a few days/weeks, we sometimes see below alua and sleep related
> kernel BUG and WARNING (with panic_on_warn) in our CI.
>
> It reminds me of
> [PATCH 0/2] Rework how the ALUA driver calls scsi_device_put()
> https://lore.kernel.org/linux-scsi/166986602290.2101055.17397734326843853911.b4-ty@oracle.com/
>
> which I thought was the fix and went into 6.2-rc(1?) on 2022-12-14 with
> [GIT PULL] first round of SCSI updates for the 6.1+ merge window
> https://lore.kernel.org/linux-scsi/b2e824bbd1e40da64d2d01657f2f7a67b98919fb.camel@HansenPartnership.com/T/#u
>
> Due to limited history, I cannot tell exactly when problems started and
> whether it really correlates to above.
>
> Test workload are all kinds of coverage tests for zfcp recovery
> including scsi device removal and/or rescan.
>
> [ 4569.045992] BUG: sleeping function called from invalid context at
> drivers/scsi/device_handler/scsi_dh_alua.c:992
> [ 4569.046003] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0,
> name: swapper/8
> [ 4569.046013] preempt_count: 101, expected: 0
> [ 4569.046023] RCU nest depth: 0, expected: 0
> [ 4569.046033] no locks held by swapper/8/0.
> [ 4569.046042] Preemption disabled at:
> [ 4569.046046] [<000000017e27ce4e>] __slab_alloc.constprop.0+0x36/0xb8
> [ 4569.046072] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G W
> 6.2.0-20230114.rc3.git0.46e26dd43df0.300.fc37.s390x+debug #1
> [ 4569.046084] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
> [ 4569.046094] Call Trace:
> [ 4569.046102] [<000000017ed21bcc>] dump_stack_lvl+0xac/0x100
> [ 4569.046118] [<000000017df9192c>] __might_resched+0x284/0x2c8
> [ 4569.046131] [<000003ff7fb9c874>] alua_rtpg_queue+0x3c/0x98
> [scsi_dh_alua]
> [ 4569.046146] [<000003ff7fb9cfb2>] alua_check+0x122/0x250 [scsi_dh_alua]
> [ 4569.046167] [<000003ff7fb9d562>] alua_check_sense+0x172/0x228
> [scsi_dh_alua]
> [ 4569.046179] [<000000017e96b3e2>] scsi_check_sense+0x8a/0x2e0
> [ 4569.046191] [<000000017e96e4b6>] scsi_decide_disposition+0x286/0x298
> [ 4569.046201] [<000000017e972bca>] scsi_complete+0x6a/0x108
> [ 4569.046212] [<000000017e746906>] blk_complete_reqs+0x6e/0x88
> [ 4569.046227] [<000000017ed3830e>] __do_softirq+0x13e/0x6b8
> [ 4569.046238] [<000000017df57902>] __irq_exit_rcu+0x14a/0x170
> [ 4569.046264] [<000000017df58472>] irq_exit_rcu+0x22/0x50
> [ 4569.046275] [<000000017ed2242a>] do_ext_irq+0x10a/0x1d0
> [ 4569.046286] [<000000017ed36156>] ext_int_handler+0xd6/0x110
> [ 4569.046296] [<000000017ed362e6>] psw_idle_exit+0x0/0xa
> [ 4569.046307] ([<000000017defc5da>] arch_cpu_idle+0x52/0xe0)
> [ 4569.046318] [<000000017ed34744>] default_idle_call+0x84/0xd0
> [ 4569.046329] [<000000017dfbe4cc>] do_idle+0xfc/0x1b8
> [ 4569.046340] [<000000017dfbe80e>] cpu_startup_entry+0x36/0x40
> [ 4569.046350] [<000000017df11964>] smp_start_secondary+0x14c/0x160
> [ 4569.046371] [<000000017ed3658e>] restart_int_handler+0x6e/0x90
> [ 4569.046381] no locks held by swapper/8/0.
Hi Steffen,
Thanks for your report and also for having included this call trace. Is
my understanding correct that alua_rtpg_queue+0x3c refers to the
might_sleep() near the start of alua_rtpg_queue()? If so, please help
with testing the following patch:
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index 49cc18a87473..79afa7acdfbc 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -989,8 +989,6 @@ static bool alua_rtpg_queue(struct alua_port_group
 	int start_queue = 0;
 	unsigned long flags;
 
-	might_sleep();
-
 	if (WARN_ON_ONCE(!pg) || scsi_device_get(sdev))
 		return false;
 
I'm proposing this change because the context from which a request is
queued should hold a reference on 'sdev' while a request is in progress
so alua_check_sense() should not trigger the scsi_device_put() call in
alua_rtpg_queue().
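The reference-counting argument can be pictured with a small userspace model, given here only as a sketch under the assumption stated above (that the in-flight request already holds a reference on the device). The names `model_sdev`, `model_get()`, `model_put()` and `model_queue_while_held()` are hypothetical and do not correspond to kernel functions.

```c
#include <assert.h>

/* Hypothetical stand-in for a reference-counted SCSI device. */
struct model_sdev {
	int refcount;
	int release_calls; /* how often the (possibly sleeping) release ran */
};

static void model_get(struct model_sdev *s)
{
	s->refcount++;
}

/* Returns 1 if this put dropped the last reference and ran the release. */
static int model_put(struct model_sdev *s)
{
	assert(s->refcount > 0);
	if (--s->refcount == 0) {
		s->release_calls++;
		return 1;
	}
	return 0;
}

/* Models a get/put pair executed while the caller already holds its
 * own reference, as assumed for the completion path. */
static int model_queue_while_held(struct model_sdev *s)
{
	model_get(s);
	return model_put(s);
}
```

As long as the caller's own reference exists, the put inside the pair can never be the final one, so the release step that may sleep is not reached from that path.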
Thanks,
Bart.
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-16 17:48 ` Bart Van Assche
@ 2023-01-16 17:58 ` Martin Wilck
2023-01-17 9:28 ` Martin Wilck
1 sibling, 0 replies; 18+ messages in thread
From: Martin Wilck @ 2023-01-16 17:58 UTC (permalink / raw)
To: Bart Van Assche, Steffen Maier, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
On Mon, 2023-01-16 at 09:48 -0800, Bart Van Assche wrote:
> On 1/16/23 08:57, Martin Wilck wrote:
> > Can we simply defer the scsi_device_put() to a workqueue?
>
> I'm concerned that would reintroduce a race condition when LLD kernel
> modules are removed.
So what else do you suggest?
Thanks,
Martin
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-16 17:55 ` Bart Van Assche
@ 2023-01-16 18:12 ` Steffen Maier
2023-01-16 18:31 ` Bart Van Assche
2023-01-17 7:46 ` Martin Wilck
1 sibling, 1 reply; 18+ messages in thread
From: Steffen Maier @ 2023-01-16 18:12 UTC (permalink / raw)
To: Bart Van Assche, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Martin Wilck, Benjamin Block, linux-s390
Hi Bart,
On 1/16/23 18:55, Bart Van Assche wrote:
> On 1/16/23 06:59, Steffen Maier wrote:
>> since a few days/weeks, we sometimes see below alua and sleep related kernel
>> BUG and WARNING (with panic_on_warn) in our CI.
>>
>> It reminds me of
>> [PATCH 0/2] Rework how the ALUA driver calls scsi_device_put()
>> https://lore.kernel.org/linux-scsi/166986602290.2101055.17397734326843853911.b4-ty@oracle.com/
>>
>> which I thought was the fix and went into 6.2-rc(1?) on 2022-12-14 with
>> [GIT PULL] first round of SCSI updates for the 6.1+ merge window
>> https://lore.kernel.org/linux-scsi/b2e824bbd1e40da64d2d01657f2f7a67b98919fb.camel@HansenPartnership.com/T/#u
>>
>> Due to limited history, I cannot tell exactly when problems started and
>> whether it really correlates to above.
>>
>> Test workload are all kinds of coverage tests for zfcp recovery including
>> scsi device removal and/or rescan.
>>
>> [ 4569.045992] BUG: sleeping function called from invalid context at
>> drivers/scsi/device_handler/scsi_dh_alua.c:992
> Thanks for your report and also for having included this call trace. Is my
> understanding correct that alua_rtpg_queue+0x3c refers to the might_sleep()
> near the start of alua_rtpg_queue()? If so, please help with testing the
> following patch:
>
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c
> b/drivers/scsi/device_handler/scsi_dh_alua.c
> index 49cc18a87473..79afa7acdfbc 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> @@ -989,8 +989,6 @@ static bool alua_rtpg_queue(struct alua_port_group
> int start_queue = 0;
> unsigned long flags;
>
> - might_sleep();
> -
> if (WARN_ON_ONCE(!pg) || scsi_device_get(sdev))
> return false;
>
>
> I'm proposing this change because the context from which a request is queued
> should hold a reference on 'sdev' while a request is in progress so
> alua_check_sense() should not trigger the scsi_device_put() call in
> alua_rtpg_queue().
How would removing this check solve the other and seemingly more fatal (even
without panic_on_warn) WARNING?:
[ 4760.878107] do not call blocking ops when !TASK_RUNNING; state=2 set at
[<000000017ed2c0fa>] __wait_for_common+0xa2/0x240
FWIW, we only seem to get such reports for debug kernel builds (not
sure which kconfig options are relevant) but not for production / performance
builds.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-16 18:12 ` Steffen Maier
@ 2023-01-16 18:31 ` Bart Van Assche
0 siblings, 0 replies; 18+ messages in thread
From: Bart Van Assche @ 2023-01-16 18:31 UTC (permalink / raw)
To: Steffen Maier, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Martin Wilck, Benjamin Block, linux-s390
On 1/16/23 10:12, Steffen Maier wrote:
> How would removing this check solve the other and seemingly more fatal
> (even without panic_on_warn) WARNING?:
>
> [ 4760.878107] do not call blocking ops when !TASK_RUNNING; state=2 set
> at [<000000017ed2c0fa>] __wait_for_common+0xa2/0x240
Isn't the warning address the same for both reports, namely
alua_rtpg_queue+0x3c?
> FWIW, it seems we only seem to get such reports for debug kernel builds
> (not sure which kconfig options are relevant) but not for production /
> performance builds.
Sleep-in-atomic warnings are only reported with kernel debugging
enabled. With kernel debugging disabled, sleeping in atomic context
results in different behavior (kernel hangs).
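A minimal userspace sketch of why the report only shows up on debug builds: the check compiles to nothing unless a debug option is enabled. In the kernel the relevant option is, to my understanding, CONFIG_DEBUG_ATOMIC_SLEEP; the macro `MODEL_DEBUG_ATOMIC_SLEEP` and all other names below are hypothetical stand-ins, and the message text is only modelled on the report above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical switch standing in for the kernel debug option. */
#ifndef MODEL_DEBUG_ATOMIC_SLEEP
#define MODEL_DEBUG_ATOMIC_SLEEP 1
#endif

static unsigned int model_preempt_count; /* nonzero means atomic context */
static int model_warnings;               /* number of reports emitted */

/* With the debug switch on, report a call from atomic context.
 * With it off, the function does nothing, so the bug stays silent. */
static void model_might_sleep(const char *file, int line)
{
#if MODEL_DEBUG_ATOMIC_SLEEP
	if (model_preempt_count != 0) {
		model_warnings++;
		fprintf(stderr,
			"BUG: sleeping function called from invalid context at %s:%d\n",
			file, line);
	}
#else
	(void)file;
	(void)line;
#endif
}

/* Small helper so callers can reset the model between scenarios. */
static void model_reset(void)
{
	assert(model_warnings >= 0);
	model_preempt_count = 0;
	model_warnings = 0;
}
```

Building the same file with `-DMODEL_DEBUG_ATOMIC_SLEEP=0` removes the check entirely, which mirrors the production-build behavior described above.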
Thanks,
Bart.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-16 17:55 ` Bart Van Assche
2023-01-16 18:12 ` Steffen Maier
@ 2023-01-17 7:46 ` Martin Wilck
1 sibling, 0 replies; 18+ messages in thread
From: Martin Wilck @ 2023-01-17 7:46 UTC (permalink / raw)
To: Bart Van Assche, Steffen Maier, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
Hello Bart,
On Mon, 2023-01-16 at 09:55 -0800, Bart Van Assche wrote:
> On 1/16/23 06:59, Steffen Maier wrote:
> > Hi all,
> >
> > since a few days/weeks, we sometimes see below alua and sleep
> > related
> > kernel BUG and WARNING (with panic_on_warn) in our CI.
> >
> > It reminds me of
> > [PATCH 0/2] Rework how the ALUA driver calls scsi_device_put()
> > https://lore.kernel.org/linux-scsi/166986602290.2101055.17397734326843853911.b4-ty@oracle.com/
> >
> > which I thought was the fix and went into 6.2-rc(1?) on 2022-12-14
> > with
> > [GIT PULL] first round of SCSI updates for the 6.1+ merge window
> > https://lore.kernel.org/linux-scsi/b2e824bbd1e40da64d2d01657f2f7a67b98919fb.camel@HansenPartnership.com/T/#u
> >
> > Due to limited history, I cannot tell exactly when problems started
> > and
> > whether it really correlates to above.
> >
> > Test workload are all kinds of coverage tests for zfcp recovery
> > including scsi device removal and/or rescan.
> >
> > [ 4569.045992] BUG: sleeping function called from invalid context
> > at
> > drivers/scsi/device_handler/scsi_dh_alua.c:992
> > [ 4569.046003] in_atomic(): 1, irqs_disabled(): 0, non_block: 0,
> > pid: 0,
> > name: swapper/8
> > [ 4569.046013] preempt_count: 101, expected: 0
> > [ 4569.046023] RCU nest depth: 0, expected: 0
> > [ 4569.046033] no locks held by swapper/8/0.
> > [ 4569.046042] Preemption disabled at:
> > [ 4569.046046] [<000000017e27ce4e>]
> > __slab_alloc.constprop.0+0x36/0xb8
> > [ 4569.046072] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G W
> > 6.2.0-20230114.rc3.git0.46e26dd43df0.300.fc37.s390x+debug #1
> > [ 4569.046084] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
> > [ 4569.046094] Call Trace:
> > [ 4569.046102] [<000000017ed21bcc>] dump_stack_lvl+0xac/0x100
> > [ 4569.046118] [<000000017df9192c>] __might_resched+0x284/0x2c8
> > [ 4569.046131] [<000003ff7fb9c874>] alua_rtpg_queue+0x3c/0x98
> > [scsi_dh_alua]
> > [ 4569.046146] [<000003ff7fb9cfb2>] alua_check+0x122/0x250
> > [scsi_dh_alua]
> > [ 4569.046167] [<000003ff7fb9d562>] alua_check_sense+0x172/0x228
> > [scsi_dh_alua]
> > [ 4569.046179] [<000000017e96b3e2>] scsi_check_sense+0x8a/0x2e0
> > [ 4569.046191] [<000000017e96e4b6>]
> > scsi_decide_disposition+0x286/0x298
> > [ 4569.046201] [<000000017e972bca>] scsi_complete+0x6a/0x108
> > [ 4569.046212] [<000000017e746906>] blk_complete_reqs+0x6e/0x88
> > [ 4569.046227] [<000000017ed3830e>] __do_softirq+0x13e/0x6b8
> > [ 4569.046238] [<000000017df57902>] __irq_exit_rcu+0x14a/0x170
> > [ 4569.046264] [<000000017df58472>] irq_exit_rcu+0x22/0x50
> > [ 4569.046275] [<000000017ed2242a>] do_ext_irq+0x10a/0x1d0
> > [ 4569.046286] [<000000017ed36156>] ext_int_handler+0xd6/0x110
> > [ 4569.046296] [<000000017ed362e6>] psw_idle_exit+0x0/0xa
> > [ 4569.046307] ([<000000017defc5da>] arch_cpu_idle+0x52/0xe0)
> > [ 4569.046318] [<000000017ed34744>] default_idle_call+0x84/0xd0
> > [ 4569.046329] [<000000017dfbe4cc>] do_idle+0xfc/0x1b8
> > [ 4569.046340] [<000000017dfbe80e>] cpu_startup_entry+0x36/0x40
> > [ 4569.046350] [<000000017df11964>]
> > smp_start_secondary+0x14c/0x160
> > [ 4569.046371] [<000000017ed3658e>] restart_int_handler+0x6e/0x90
> > [ 4569.046381] no locks held by swapper/8/0.
> Hi Steffen,
>
> Thanks for your report and also for having included this call trace.
> Is
> my understanding correct that alua_rtpg_queue+0x3c refers to the
> might_sleep() near the start of alua_rtpg_queue()? If so, please help
> with testing the following patch:
>
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c
> b/drivers/scsi/device_handler/scsi_dh_alua.c
> index 49cc18a87473..79afa7acdfbc 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> @@ -989,8 +989,6 @@ static bool alua_rtpg_queue(struct
> alua_port_group
> int start_queue = 0;
> unsigned long flags;
>
> - might_sleep();
> -
> if (WARN_ON_ONCE(!pg) || scsi_device_get(sdev))
> return false;
>
>
> I'm proposing this change because the context from which a request is
> queued should hold a reference on 'sdev' while a request is in
> progress
> so alua_check_sense() should not trigger the scsi_device_put() call
> in
> alua_rtpg_queue().
alua_rtpg_queue() must take an additional reference in order to make
sure that the ref survives until the workqueue is started. A possible
reference held by the caller doesn't help because the caller might have
dropped the ref before the workqueue runs.
Please explain. Am I overlooking something?
Regards
Martin
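
To illustrate the lifetime argument above, here is a minimal userspace
sketch. It is not kernel code: struct model_dev, model_get(), model_put(),
model_queue() and model_run_work() are hypothetical stand-ins for the
sdev, scsi_device_get(), scsi_device_put(), alua_rtpg_queue() and the
queued work. It only shows that the reference taken at queueing time keeps
the object alive after the caller has dropped its own reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scsi_device: a counter and a flag. */
struct model_dev {
	int refs;
	bool released;
};

static void model_get(struct model_dev *d)
{
	d->refs++;
}

static void model_put(struct model_dev *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0)
		d->released = true;	/* models the release callback */
}

/* Hypothetical one-slot "workqueue": remembers a device for later. */
static struct model_dev *pending;

/* Models the queueing side: take an own reference before queueing. */
static void model_queue(struct model_dev *d)
{
	model_get(d);
	pending = d;
}

/* Models the work function: use the device, then drop the queue's ref. */
static void model_run_work(void)
{
	struct model_dev *d = pending;

	pending = NULL;
	assert(!d->released);	/* still valid thanks to the queue's own ref */
	model_put(d);
}
```

In this model the caller can call model_put() right after model_queue()
and the device is still not released until model_run_work() has run.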
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-16 17:48 ` Bart Van Assche
2023-01-16 17:58 ` Martin Wilck
@ 2023-01-17 9:28 ` Martin Wilck
2023-01-17 18:50 ` Bart Van Assche
1 sibling, 1 reply; 18+ messages in thread
From: Martin Wilck @ 2023-01-17 9:28 UTC (permalink / raw)
To: Bart Van Assche, Steffen Maier, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
Hello Bart,
On Mon, 2023-01-16 at 09:48 -0800, Bart Van Assche wrote:
> On 1/16/23 08:57, Martin Wilck wrote:
> > Can we simply defer the scsi_device_put() to a workqueue?
>
> I'm concerned that would reintroduce a race condition when LLD kernel
> modules are removed.
I don't follow. Normally, alua_rtpg_queue() queues rtpg_work, and
alua_rtpg_work() will be called from the work queue and will eventually
call scsi_device_put() when the RTPG is finished.
alua_rtpg_queue() only calls scsi_device_put() if queueing rtpg_work
fails[*]. If we deferred this scsi_device_put() call to a work queue,
what would be the difference (wrt a module_put() race condition)
compared to the case where queue_delayed_work() succeeds?
In both cases, scsi_device_put() would be called from a work queue.
Given that alua_rtpg_queue() must take a reference to the scsi device
for the case that queueing succeeds, and that alua_rtpg_queue() is
sometimes called in atomic context, I think deferring the
scsi_device_put() call is the only option we have.
Thanks,
Martin
[*] or if queueing turns out to be unnecessary, in which case we could
optimize away the scsi_device_get() call.
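
A rough userspace sketch of the deferral idea follows. It is only a model
under stated assumptions, not a proposed patch: sketch_in_atomic stands in
for in_atomic(), sketch_put() for a put that may sleep, and the fixed-size
deferred[] table for whatever list a real work item would drain. All of
these names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 8

struct sketch_dev {
	int refs;
};

static bool sketch_in_atomic;	/* models in_atomic() */
static struct sketch_dev *deferred[MAX_DEFERRED];
static size_t n_deferred;

/* Models a put that may sleep: only legal outside atomic context. */
static void sketch_put(struct sketch_dev *d)
{
	assert(!sketch_in_atomic);
	assert(d->refs > 0);
	d->refs--;
}

/*
 * Callable from any context: only records the device, the reference is
 * dropped later. Returns false if the table is full.
 */
static bool sketch_defer_put(struct sketch_dev *d)
{
	if (n_deferred == MAX_DEFERRED)
		return false;
	deferred[n_deferred++] = d;
	return true;
}

/* Models the work function: runs in process context and drops the refs. */
static void sketch_drain(void)
{
	assert(!sketch_in_atomic);
	while (n_deferred > 0)
		sketch_put(deferred[--n_deferred]);
}
```

The point of the model is that the atomic path never calls sketch_put()
itself; the reference count only changes once sketch_drain() runs.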
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-17 9:28 ` Martin Wilck
@ 2023-01-17 18:50 ` Bart Van Assche
2023-01-17 21:48 ` Martin Wilck
0 siblings, 1 reply; 18+ messages in thread
From: Bart Van Assche @ 2023-01-17 18:50 UTC (permalink / raw)
To: Martin Wilck, Steffen Maier, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
On 1/17/23 01:28, Martin Wilck wrote:
> On Mon, 2023-01-16 at 09:48 -0800, Bart Van Assche wrote:
>> On 1/16/23 08:57, Martin Wilck wrote:
>>> Can we simply defer the scsi_device_put() to a workqueue?
>>
>> I'm concerned that would reintroduce a race condition when LLD kernel
>> modules are removed.
>
> I don't follow. Normally, alua_rtpg_queue() queues rtpg_work, and
> alua_rtpg_work() will be called from the work queue and will eventually
> call scsi_device_put() when the RTPG is finished.
>
> alua_rtpg_queue() only calls scsi_device_put() if queueing rtpg_work
> fails[*]. If we deferred this scsi_device_put() call to a work queue,
> what would be the difference (wrt a module_put() race condition)
> compared to the case where queue_delayed_work() succeeds?
> In both cases, scsi_device_put() would be called from a work queue.
>
> Given that alua_rtpg_queue() must take a reference to the scsi device
> for the case that queueing succeeds, and that alua_rtpg_queue() is
> sometimes called in atomic context, I think deferring the
> scsi_device_put() call is the only option we have.
Hi Martin,
Before commit f93ed747e2c7 ("scsi: core: Release SCSI devices
synchronously") the SCSI device release code could continue running
asynchronously after the last module_put() call of the LLD associated
with the SCSI device.
Since commit f93ed747e2c7 it is guaranteed that freeing device memory
(scsi_device_dev_release()) has finished before the last LLD
module_put() call happens.
Do you perhaps plan to defer the scsi_device_put() calls in the ALUA
device handler to a workqueue?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-17 18:50 ` Bart Van Assche
@ 2023-01-17 21:48 ` Martin Wilck
2023-01-17 21:52 ` Bart Van Assche
0 siblings, 1 reply; 18+ messages in thread
From: Martin Wilck @ 2023-01-17 21:48 UTC (permalink / raw)
To: Bart Van Assche, Steffen Maier, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
On Tue, 2023-01-17 at 10:50 -0800, Bart Van Assche wrote:
> On 1/17/23 01:28, Martin Wilck wrote:
> > On Mon, 2023-01-16 at 09:48 -0800, Bart Van Assche wrote:
> > > On 1/16/23 08:57, Martin Wilck wrote:
> > > > Can we simply defer the scsi_device_put() to a workqueue?
> > >
> > > I'm concerned that would reintroduce a race condition when LLD
> > > kernel
> > > modules are removed.
> >
> > I don't follow. Normally, alua_rtpg_queue() queues rtpg_work, and
> > alua_rtpg_work() will be called from the work queue and will
> > eventually
> > call scsi_device_put() when the RTPG is finished.
> >
> > alua_rtpg_queue() only calls scsi_device_put() if queueing
> > rtpg_work
> > fails[*]. If we deferred this scsi_device_put() call to a work
> > queue,
> > what would be the difference (wrt a module_put() race condition)
> > compared to the case where queue_delayed_work() succeeds?
> > In both cases, scsi_device_put() would be called from a work queue.
> >
> > Given that alua_rtpg_queue() must take a reference to the scsi
> > device
> > for the case that queueing succeeds, and that alua_rtpg_queue() is
> > sometimes called in atomic context, I think deferring the
> > scsi_device_put() call is the only option we have.
>
> Hi Martin,
>
> Before commit f93ed747e2c7 ("scsi: core: Release SCSI devices
> synchronously") the SCSI device release code could continue running
> asynchronously after the last module_put() call of the LLD associated
> with the SCSI device.
>
> Since commit f93ed747e2c7 it is guaranteed that freeing device memory
> (scsi_device_dev_release()) has finished before the last LLD
> module_put() call happens.
>
> Do you perhaps plan to defer the scsi_device_put() calls in the ALUA
> device handler to a workqueue?
Yes, that was my suggestion. Just defer the scsi_device_put() call in
alua_rtpg_queue() in the case where the actual RTPG handler is not
queued. I won't have time for that before next week though.
Martin
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-17 21:48 ` Martin Wilck
@ 2023-01-17 21:52 ` Bart Van Assche
2023-01-17 22:03 ` Martin Wilck
0 siblings, 1 reply; 18+ messages in thread
From: Bart Van Assche @ 2023-01-17 21:52 UTC (permalink / raw)
To: Martin Wilck, Steffen Maier, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
On 1/17/23 13:48, Martin Wilck wrote:
> Yes, that was my suggestion. Just defer the scsi_device_put() call in
> alua_rtpg_queue() in the case where the actual RTPG handler is not
> queued. I won't have time for that before next week though.
Hi Martin,
Do you agree that the call trace shared by Steffen is not sufficient to
conclude that this change is necessary?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-17 21:52 ` Bart Van Assche
@ 2023-01-17 22:03 ` Martin Wilck
2023-01-18 0:29 ` Bart Van Assche
0 siblings, 1 reply; 18+ messages in thread
From: Martin Wilck @ 2023-01-17 22:03 UTC (permalink / raw)
To: Bart Van Assche, Steffen Maier, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
On Tue, 2023-01-17 at 13:52 -0800, Bart Van Assche wrote:
> On 1/17/23 13:48, Martin Wilck wrote:
> > Yes, that was my suggestion. Just defer the scsi_device_put() call
> > in
> > alua_rtpg_queue() in the case where the actual RTPG handler is not
> > queued. I won't have time for that before next week though.
>
> Hi Martin,
>
> Do you agree that the call trace shared by Steffen is not sufficient
> to
> conclude that this change is necessary?
Hmm, I suppose I missed your point... to re-iterate my thinking:
1 alua_rtpg_queue() must take a ref to the sdev before queueing work,
whether or not the caller already has one
2 queue_delayed_work() can fail
3 if queue_delayed_work() fails, alua_rtpg_queue() must drop the ref
it just took
4 BUT (and this is what I guess I missed) this ref can't be the last
one dropped, because the caller of alua_rtpg_queue() must still hold
a reference. And scsi_device_put() only sleeps if the last ref is
dropped. Therefore the issue in Steffen's call stack should
indeed be fixed just by removing the might_sleep(). If all
callers of alua_rtpg_queue() must hold an sdev reference (I believe
they do), we can indeed remove the might_sleep() entirely.
Is this correct reasoning, and what you meant previously? If yes, I
agree, and I apologize for not realizing it in the first place.
But I think this is subtle enough to deserve a comment in the code.
Thanks
Martin
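
The four steps above can be modelled in a few lines of userspace C. This
is a sketch under assumptions, not the driver code: struct ref_dev,
ref_get(), ref_put() and try_queue() are hypothetical names, and the
sleeps counter merely stands in for the release path that may sleep.

```c
#include <assert.h>
#include <stdbool.h>

struct ref_dev {
	int refs;
	int sleeps;	/* how often the (sleeping) release path ran */
};

static void ref_get(struct ref_dev *d)
{
	d->refs++;
}

static void ref_put(struct ref_dev *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0)
		d->sleeps++;	/* only the final put reaches the release path */
}

/* Models steps 1-3: take a ref, fail to queue, drop the ref again. */
static bool try_queue(struct ref_dev *d, bool queue_ok)
{
	ref_get(d);			/* step 1 */
	if (!queue_ok) {		/* step 2 */
		ref_put(d);		/* step 3: not the last ref (step 4) */
		return false;
	}
	return true;	/* the work function would drop the ref later */
}
```

As long as the caller enters try_queue() with its own reference, the put
in the failure path leaves the count above zero and the release path is
not reached.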
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-17 22:03 ` Martin Wilck
@ 2023-01-18 0:29 ` Bart Van Assche
2023-01-18 8:45 ` Martin Wilck
2023-01-18 16:17 ` Steffen Maier
0 siblings, 2 replies; 18+ messages in thread
From: Bart Van Assche @ 2023-01-18 0:29 UTC (permalink / raw)
To: Martin Wilck, Steffen Maier, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
On 1/17/23 14:03, Martin Wilck wrote:
> On Tue, 2023-01-17 at 13:52 -0800, Bart Van Assche wrote:
>> On 1/17/23 13:48, Martin Wilck wrote:
>>> Yes, that was my suggestion. Just defer the scsi_device_put() call
>>> in
>>> alua_rtpg_queue() in the case where the actual RTPG handler is not
>>> queued. I won't have time for that before next week though.
>>
>> Hi Martin,
>>
>> Do you agree that the call trace shared by Steffen is not sufficient
>> to
>> conclude that this change is necessary?
>
> Hmm, I suppose I missed your point... to re-iterate my thinking:
>
> 1 alua_rtpg_queue() must take a ref to the sdev before queueing work,
> whether or not the caller already has one
> 2 queue_delayed_work() can fail
> 3 if queue_delayed_work() fails, alua_rtpg_queue() must drop the ref
> it just took
> 4 BUT (and this is what I guess I missed) this ref can't be the last
> one dropped, because the caller of alua_rtpg_queue() must still hold
> a reference. And scsi_device_put() only sleeps if the last ref is
> dropped. Therefore the issue in Steffen's call stack should
> indeed be fixed just by removing the might_sleep(). If all
> callers of alua_rtpg_queue() must hold an sdev reference (I believe
> they do), we can indeed remove the might_sleep() entirely.
>
> Is this correct reasoning, and what you meant previously? If yes, I
> agree, and I apologize for not realizing it in the first place.
> But I think this is subtle enough to deserve a comment in the code.
Yes, that's what I'm thinking.
How about the patch below?
Thanks,
Bart.
[PATCH] scsi: device_handler: alua: Remove a might_sleep() annotation
The might_sleep() annotation in alua_rtpg_queue() is not correct since the
command completion code may call this function from atomic context.
Calling alua_rtpg_queue() from atomic context in the command completion
path is fine since request submitters must hold an sdev reference until
command execution has completed. This patch fixes the following kernel
warning:
BUG: sleeping function called from invalid context at drivers/scsi/device_handler/scsi_dh_alua.c:992
Call Trace:
dump_stack_lvl+0xac/0x100
__might_resched+0x284/0x2c8
alua_rtpg_queue+0x3c/0x98 [scsi_dh_alua]
alua_check+0x122/0x250 [scsi_dh_alua]
alua_check_sense+0x172/0x228 [scsi_dh_alua]
scsi_check_sense+0x8a/0x2e0
scsi_decide_disposition+0x286/0x298
scsi_complete+0x6a/0x108
blk_complete_reqs+0x6e/0x88
__do_softirq+0x13e/0x6b8
__irq_exit_rcu+0x14a/0x170
irq_exit_rcu+0x22/0x50
do_ext_irq+0x10a/0x1d0
Reported-by: Steffen Maier <maier@linux.ibm.com>
Cc: Steffen Maier <maier@linux.ibm.com>
Cc: Martin Wilck <mwilck@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/scsi/device_handler/scsi_dh_alua.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index 55a5073248f8..362fa631f39b 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -987,6 +987,9 @@ static void alua_rtpg_work(struct work_struct *work)
*
* Returns true if and only if alua_rtpg_work() will be called asynchronously.
* That function is responsible for calling @qdata->fn().
+ *
+ * Context: may be called from atomic context (alua_check()) only if the caller
+ * holds an sdev reference.
*/
static bool alua_rtpg_queue(struct alua_port_group *pg,
struct scsi_device *sdev,
@@ -995,8 +998,6 @@ static bool alua_rtpg_queue(struct alua_port_group *pg,
int start_queue = 0;
unsigned long flags;
- might_sleep();
-
if (WARN_ON_ONCE(!pg) || scsi_device_get(sdev))
return false;
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-18 0:29 ` Bart Van Assche
@ 2023-01-18 8:45 ` Martin Wilck
2023-01-18 16:17 ` Steffen Maier
1 sibling, 0 replies; 18+ messages in thread
From: Martin Wilck @ 2023-01-18 8:45 UTC (permalink / raw)
To: Bart Van Assche, Steffen Maier, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
On Tue, 2023-01-17 at 16:29 -0800, Bart Van Assche wrote:
> On 1/17/23 14:03, Martin Wilck wrote:
> > On Tue, 2023-01-17 at 13:52 -0800, Bart Van Assche wrote:
> > > On 1/17/23 13:48, Martin Wilck wrote:
> > > > Yes, that was my suggestion. Just defer the scsi_device_put()
> > > > call
> > > > in
> > > > alua_rtpg_queue() in the case where the actual RTPG handler is
> > > > not
> > > > queued. I won't have time for that before next week though.
> > >
> > > Hi Martin,
> > >
> > > Do you agree that the call trace shared by Steffen is not
> > > sufficient
> > > to
> > > conclude that this change is necessary?
> >
> > Hmm, I suppose I missed your point... to re-iterate my thinking:
> >
> > 1 alua_rtpg_queue() must take a ref to the sdev before queueing
> > work,
> > whether or not the caller already has one
> > 2 queue_delayed_work() can fail
> > 3 if queue_delayed_work() fails, alua_rtpg_queue() must drop the
> > ref
> > it just took
> > 4 BUT (and this is what I guess I missed) this ref can't be the
> > last
> > one dropped, because the caller of alua_rtpg_queue() must still
> > hold
> > a reference. And scsi_device_put() only sleeps if the last ref
> > is
> > dropped. Therefore the issue in Steffen's call stack should
> > indeed be fixed just by removing the might_sleep(). If all
> > callers of alua_rtpg_queue() must hold an sdev reference (I
> > believe
> > they do), we can indeed remove the might_sleep() entirely.
> >
> > Is this correct reasoning, and what you meant previously? If yes, I
> > agree, and I apologize for not realizing it in the first place.
> > But I think this is subtle enough to deserve a comment in the code.
>
> Yes, that's what I'm thinking.
>
> How about the patch below?
>
> Thanks,
>
> Bart.
>
> [PATCH] scsi: device_handler: alua: Remove a might_sleep() annotation
>
> The might_sleep() annotation in alua_rtpg_queue() is not correct
> since the
> command completion code may call this function from atomic context.
> Calling alua_rtpg_queue() from atomic context in the command
> completion
> path is fine since request submitters must hold an sdev reference
> until
> command execution has completed. This patch fixes the following
> kernel
> warning:
>
> BUG: sleeping function called from invalid context at
> drivers/scsi/device_handler/scsi_dh_alua.c:992
> Call Trace:
> dump_stack_lvl+0xac/0x100
> __might_resched+0x284/0x2c8
> alua_rtpg_queue+0x3c/0x98 [scsi_dh_alua]
> alua_check+0x122/0x250 [scsi_dh_alua]
> alua_check_sense+0x172/0x228 [scsi_dh_alua]
> scsi_check_sense+0x8a/0x2e0
> scsi_decide_disposition+0x286/0x298
> scsi_complete+0x6a/0x108
> blk_complete_reqs+0x6e/0x88
> __do_softirq+0x13e/0x6b8
> __irq_exit_rcu+0x14a/0x170
> irq_exit_rcu+0x22/0x50
> do_ext_irq+0x10a/0x1d0
>
> Reported-by: Steffen Maier <maier@linux.ibm.com>
> Cc: Steffen Maier <maier@linux.ibm.com>
> Cc: Martin Wilck <mwilck@suse.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Martin Wilck <mwilck@suse.com>
> ---
> drivers/scsi/device_handler/scsi_dh_alua.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c
> b/drivers/scsi/device_handler/scsi_dh_alua.c
> index 55a5073248f8..362fa631f39b 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> @@ -987,6 +987,9 @@ static void alua_rtpg_work(struct work_struct
> *work)
> *
> * Returns true if and only if alua_rtpg_work() will be called
> asynchronously.
> * That function is responsible for calling @qdata->fn().
> + *
> + * Context: may be called from atomic context (alua_check()) only if
> the caller
> + * holds an sdev reference.
> */
> static bool alua_rtpg_queue(struct alua_port_group *pg,
> struct scsi_device *sdev,
> @@ -995,8 +998,6 @@ static bool alua_rtpg_queue(struct
> alua_port_group *pg,
> int start_queue = 0;
> unsigned long flags;
>
> - might_sleep();
> -
> if (WARN_ON_ONCE(!pg) || scsi_device_get(sdev))
> return false;
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-18 0:29 ` Bart Van Assche
2023-01-18 8:45 ` Martin Wilck
@ 2023-01-18 16:17 ` Steffen Maier
2023-01-24 11:16 ` Steffen Maier
1 sibling, 1 reply; 18+ messages in thread
From: Steffen Maier @ 2023-01-18 16:17 UTC (permalink / raw)
To: Bart Van Assche, Martin Wilck, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
On 1/18/23 01:29, Bart Van Assche wrote:
> On 1/17/23 14:03, Martin Wilck wrote:
>> On Tue, 2023-01-17 at 13:52 -0800, Bart Van Assche wrote:
>>> On 1/17/23 13:48, Martin Wilck wrote:
>>>> Yes, that was my suggestion. Just defer the scsi_device_put() call
>>>> in
>>>> alua_rtpg_queue() in the case where the actual RTPG handler is not
>>>> queued. I won't have time for that before next week though.
>>> Do you agree that the call trace shared by Steffen is not sufficient
>>> to
>>> conclude that this change is necessary?
>>
>> Hmm, I suppose I missed your point... to re-iterate my thinking:
>>
>> 1 alua_rtpg_queue() must take a ref to the sdev before queueing work,
>> whether or not the caller already has one
>> 2 queue_delayed_work() can fail
>> 3 if queue_delayed_work() fails, alua_rtpg_queue() must drop the ref
>> it just took
>> 4 BUT (and this is what I guess I missed) this ref can't be the last
>> one dropped, because the caller of alua_rtpg_queue() must still hold
>> a reference. And scsi_device_put() only sleeps if the last ref is
>> dropped. Therefore the issue in Steffen's call stack should
>> indeed be fixed just by removing the might_sleep(). If all
>> callers of alua_rtpg_queue() must hold an sdev reference (I believe
>> they do), we can indeed remove the might_sleep() entirely.
>>
>> Is this correct reasoning, and what you meant previously? If yes, I
>> agree, and I apologize for not realizing it in the first place.
>> But I think this is subtle enough to deserve a comment in the code.
>
> Yes, that's what I'm thinking.
>
> How about the patch below?
>
> Thanks,
>
> Bart.
>
> [PATCH] scsi: device_handler: alua: Remove a might_sleep() annotation
>
> The might_sleep() annotation in alua_rtpg_queue() is not correct since the
> command completion code may call this function from atomic context.
> Calling alua_rtpg_queue() from atomic context in the command completion
> path is fine since request submitters must hold an sdev reference until
> command execution has completed. This patch fixes the following kernel
> warning:
>
> BUG: sleeping function called from invalid context at
> drivers/scsi/device_handler/scsi_dh_alua.c:992
> Call Trace:
> dump_stack_lvl+0xac/0x100
> __might_resched+0x284/0x2c8
> alua_rtpg_queue+0x3c/0x98 [scsi_dh_alua]
> alua_check+0x122/0x250 [scsi_dh_alua]
> alua_check_sense+0x172/0x228 [scsi_dh_alua]
> scsi_check_sense+0x8a/0x2e0
> scsi_decide_disposition+0x286/0x298
> scsi_complete+0x6a/0x108
> blk_complete_reqs+0x6e/0x88
> __do_softirq+0x13e/0x6b8
> __irq_exit_rcu+0x14a/0x170
> irq_exit_rcu+0x22/0x50
> do_ext_irq+0x10a/0x1d0
>
> Reported-by: Steffen Maier <maier@linux.ibm.com>
> Cc: Steffen Maier <maier@linux.ibm.com>
> Cc: Martin Wilck <mwilck@suse.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> drivers/scsi/device_handler/scsi_dh_alua.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c
> b/drivers/scsi/device_handler/scsi_dh_alua.c
> index 55a5073248f8..362fa631f39b 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> @@ -987,6 +987,9 @@ static void alua_rtpg_work(struct work_struct *work)
> *
> * Returns true if and only if alua_rtpg_work() will be called asynchronously.
> * That function is responsible for calling @qdata->fn().
> + *
> + * Context: may be called from atomic context (alua_check()) only if the caller
> + * holds an sdev reference.
> */
> static bool alua_rtpg_queue(struct alua_port_group *pg,
> struct scsi_device *sdev,
> @@ -995,8 +998,6 @@ static bool alua_rtpg_queue(struct alua_port_group *pg,
> int start_queue = 0;
> unsigned long flags;
>
> - might_sleep();
> -
I had removed those two lines yesterday for our CI kernel build.
Tonight's run obviously no longer had any related BUG or WARNING.
I checked all dumps from that run to see if anything stalled and whether it was
related to ALUA, but I think we're good.
Tested-by: Steffen Maier <maier@linux.ibm.com>
> if (WARN_ON_ONCE(!pg) || scsi_device_get(sdev))
> return false;
>
>
--
Mit freundlichen Gruessen / Kind regards
Steffen Maier
Linux on IBM Z and LinuxONE
https://www.ibm.com/privacy/us/en/
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschaeftsfuehrung: David Faller
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-18 16:17 ` Steffen Maier
@ 2023-01-24 11:16 ` Steffen Maier
2023-01-24 11:36 ` Martin Wilck
0 siblings, 1 reply; 18+ messages in thread
From: Steffen Maier @ 2023-01-24 11:16 UTC (permalink / raw)
To: Bart Van Assche, Martin Wilck, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
On 1/18/23 17:17, Steffen Maier wrote:
> On 1/18/23 01:29, Bart Van Assche wrote:
>> On 1/17/23 14:03, Martin Wilck wrote:
>>> On Tue, 2023-01-17 at 13:52 -0800, Bart Van Assche wrote:
>>>> On 1/17/23 13:48, Martin Wilck wrote:
>>>>> Yes, that was my suggestion. Just defer the scsi_device_put() call
>>>>> in
>>>>> alua_rtpg_queue() in the case where the actual RTPG handler is not
>>>>> queued. I won't have time for that before next week though.
>> [PATCH] scsi: device_handler: alua: Remove a might_sleep() annotation
>>
>> The might_sleep() annotation in alua_rtpg_queue() is not correct since the
>> command completion code may call this function from atomic context.
>> Calling alua_rtpg_queue() from atomic context in the command completion
>> path is fine since request submitters must hold an sdev reference until
>> command execution has completed. This patch fixes the following kernel
>> warning:
>>
>> BUG: sleeping function called from invalid context at
>> drivers/scsi/device_handler/scsi_dh_alua.c:992
>> Call Trace:
>> dump_stack_lvl+0xac/0x100
>> __might_resched+0x284/0x2c8
>> alua_rtpg_queue+0x3c/0x98 [scsi_dh_alua]
>> alua_check+0x122/0x250 [scsi_dh_alua]
>> alua_check_sense+0x172/0x228 [scsi_dh_alua]
>> scsi_check_sense+0x8a/0x2e0
>> scsi_decide_disposition+0x286/0x298
>> scsi_complete+0x6a/0x108
>> blk_complete_reqs+0x6e/0x88
>> __do_softirq+0x13e/0x6b8
>> __irq_exit_rcu+0x14a/0x170
>> irq_exit_rcu+0x22/0x50
>> do_ext_irq+0x10a/0x1d0
>>
>> Reported-by: Steffen Maier <maier@linux.ibm.com>
>> Cc: Steffen Maier <maier@linux.ibm.com>
>> Cc: Martin Wilck <mwilck@suse.com>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>> drivers/scsi/device_handler/scsi_dh_alua.c | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c
>> b/drivers/scsi/device_handler/scsi_dh_alua.c
>> index 55a5073248f8..362fa631f39b 100644
>> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
>> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
>> @@ -987,6 +987,9 @@ static void alua_rtpg_work(struct work_struct *work)
>> *
>> * Returns true if and only if alua_rtpg_work() will be called asynchronously.
>> * That function is responsible for calling @qdata->fn().
>> + *
>> + * Context: may be called from atomic context (alua_check()) only if the caller
>> + * holds an sdev reference.
>> */
>> static bool alua_rtpg_queue(struct alua_port_group *pg,
>> struct scsi_device *sdev,
>> @@ -995,8 +998,6 @@ static bool alua_rtpg_queue(struct alua_port_group *pg,
>> int start_queue = 0;
>> unsigned long flags;
>>
>> - might_sleep();
>> -
>
> I had removed those two lines yesterday for our CI kernel build.
> Tonight's run obviously no longer had any related BUG or WARNING.
> I checked all dumps from that run to see if anything stalled and whether it was
> related to ALUA, but I think we're good.
>
> Tested-by: Steffen Maier <maier@linux.ibm.com>
I'm afraid that might have been premature.
Today I got a BUG/WARNING with a slightly different stack trace, in which
alua_rtpg_queue() calls scsi_device_put(), which itself contains a
might_sleep() but is apparently called in atomic context:
> [ 2517.231562] sd 13:0:0:1073823768: Power-on or device reset occurred
> [ 2517.231582] sd 13:0:0:1073823768: [sdax] tag#2787 Done: ADD_TO_MLQUEUE Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
> [ 2517.231590] sd 13:0:0:1073823768: [sdax] tag#2787 CDB: Test Unit Ready 00 00 00 00 00 00
> [ 2517.231598] sd 13:0:0:1073823768: [sdax] tag#2787 Sense Key : Unit Attention [current]
> [ 2517.231605] sd 13:0:0:1073823768: [sdax] tag#2787 Add. Sense: Power on, reset, or bus device reset occurred
> [ 2517.236104] sd 13:0:0:1074348056: Power-on or device reset occurred
> [ 2517.236124] BUG: sleeping function called from invalid context at drivers/scsi/scsi.c:591
> [ 2517.236130] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 166768, name: systemd-udevd
> [ 2517.236137] preempt_count: 100, expected: 0
> [ 2517.236143] RCU nest depth: 0, expected: 0
> [ 2517.236148] no locks held by systemd-udevd/166768.
> [ 2517.236154] Preemption disabled at:
> [ 2517.236157] [<000000019704d22e>] __do_softirq+0x5e/0x6b8
> [ 2517.236177] CPU: 2 PID: 166768 Comm: systemd-udevd Tainted: G K 6.2.0-20230123.rc5.git2.9dea08313ff5.300.fc37.s390x+debug #1
> [ 2517.236185] Hardware name: IBM 8561 T01 703 (z/VM 7.3.0)
> [ 2517.236190] Call Trace:
> [ 2517.236195] [<00000001970367cc>] dump_stack_lvl+0xac/0x100
> [ 2517.236203] [<00000001962a590c>] __might_resched+0x284/0x2c8
> [ 2517.236213] [<0000000196c7b34a>] scsi_device_put+0x42/0x60
> [ 2517.236224] [<000003ff7fb9c57e>] alua_rtpg_queue.part.0+0xce/0x348 [scsi_dh_alua]
> [ 2517.236234] [<000003ff7fb9d20a>] alua_check+0x132/0x260 [scsi_dh_alua]
> [ 2517.236241] [<000003ff7fb9d4aa>] alua_check_sense+0x172/0x228 [scsi_dh_alua]
> [ 2517.236248] [<0000000196c7fd0e>] scsi_check_sense+0x86/0x2e0
> [ 2517.236256] [<0000000196c82cc6>] scsi_decide_disposition+0x286/0x298
> [ 2517.236262] [<0000000196c873da>] scsi_complete+0x6a/0x108
> [ 2517.236269] [<0000000196a5aeea>] blk_complete_reqs+0x6a/0x88
> [ 2517.236281] [<000000019704d30a>] __do_softirq+0x13a/0x6b8
> [ 2517.236287] [<000000019626b802>] __irq_exit_rcu+0x14a/0x170
> [ 2517.236297] [<000000019626c372>] irq_exit_rcu+0x22/0x50
> [ 2517.236303] [<0000000197036fda>] do_ext_irq+0xba/0x1d0
> [ 2517.236309] [<000000019704ad06>] ext_int_handler+0xd6/0x110
> [ 2517.236315] [<00000001963accd2>] seccomp_run_filters+0x9a/0x198
> [ 2517.236328] [<00000001963ad5bc>] __seccomp_filter+0x4c/0x3b8
> [ 2517.236334] [<0000000196335f1a>] syscall_trace_enter.constprop.0+0xda/0x310
> [ 2517.236345] [<0000000197036bf0>] __do_syscall+0xf0/0x208
> [ 2517.236350] [<000000019704aa52>] system_call+0x82/0xb0
> [ 2517.236356] no locks held by systemd-udevd/166768.
The same can also happen outside of process context, where it happened to run
alua_rtpg() before an IRQ happened for :
> [ 2517.249685] ------------[ cut here ]------------
> [ 2517.249691] do not call blocking ops when !TASK_RUNNING; state=2 set at [<0000000197040cb2>] __wait_for_common+0xa2/0x240
> [ 2517.249710] WARNING: CPU: 0 PID: 121221 at kernel/sched/core.c:9959 __might_sleep+0x7c/0x98
> [ 2517.249719] Modules linked in: kvm af_iucv algif_hash af_alg nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink dm_service_time sunrpc zfcp scsi_transport_fc s390_trng vfio_ccw mdev vfio_iommu_type1 vfio sch_fq_codel ip6_tables ip_tables x_tables configfs ghash_s390 prng chacha_s390 libchacha aes_s390 des_s390 libdes sha3_512_s390 sha3_256_s390 nvme sha512_s390 sha256_s390 sha1_s390 sha_common nvme_core scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkey zcrypt rng_core dm_multipath autofs4
> [ 2517.249869] Unloaded tainted modules: test_klp_state3(K):1 test_klp_state2(K):4 test_klp_state(K):3 test_klp_callbacks_demo2(K):2 test_klp_callbacks_demo(K):12 test_klp_atomic_replace(K):2 test_klp_livepatch(K):6 [last unloaded: test_klp_callbacks_demo(K)]
> [ 2517.249907] CPU: 0 PID: 121221 Comm: kworker/0:1 Tainted: G W K 6.2.0-20230123.rc5.git2.9dea08313ff5.300.fc37.s390x+debug #1
> [ 2517.249915] Hardware name: IBM 8561 T01 703 (z/VM 7.3.0)
> [ 2517.249921] Workqueue: kaluad alua_rtpg_work [scsi_dh_alua]
> [ 2517.249931] Krnl PSW : 0704d00180000000 00000001962a59d0 (__might_sleep+0x80/0x98)
> [ 2517.249944] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
> [ 2517.249953] Krnl GPRS: c0000000ffffbfff 0000000080000101 000000000000006d 00000001974ae114
> [ 2517.249960] 0000037ffff339a0 0000037ffff33998 0000000000000000 0000000000000001
> [ 2517.249966] 0700037ffff33b50 00000000be69c000 000000000000024f 00000001974cb458
> [ 2517.249973] 00000000a4080100 00000000a5344220 00000001962a59cc 0000037ffff33b30
> [ 2517.249985] Krnl Code: 00000001962a59c0: c020008c269f larl %r2,000000019742a6fe
> 00000001962a59c6: c0e5006bbf19 brasl %r14,000000019701d7f8
> #00000001962a59cc: af000000 mc 0,0
> >00000001962a59d0: a7490000 lghi %r4,0
> 00000001962a59d4: b904003a lgr %r3,%r10
> 00000001962a59d8: b904002b lgr %r2,%r11
> 00000001962a59dc: ebaff0a00004 lmg %r10,%r15,160(%r15)
> 00000001962a59e2: c0f4fffffe53 brcl 15,00000001962a5688
> [ 2517.250023] Call Trace:
> [ 2517.250028] [<00000001962a59d0>] __might_sleep+0x80/0x98
> [ 2517.250036] ([<00000001962a59cc>] __might_sleep+0x7c/0x98)
> [ 2517.250043] [<0000000196c7b34a>] scsi_device_put+0x42/0x60
> [ 2517.250050] [<000003ff7fb9c57e>] alua_rtpg_queue.part.0+0xce/0x348 [scsi_dh_alua]
> [ 2517.250058] [<000003ff7fb9d20a>] alua_check+0x132/0x260 [scsi_dh_alua]
> [ 2517.250066] [<000003ff7fb9d4aa>] alua_check_sense+0x172/0x228 [scsi_dh_alua]
> [ 2517.250073] [<0000000196c7fd0e>] scsi_check_sense+0x86/0x2e0
> [ 2517.250080] [<0000000196c82cc6>] scsi_decide_disposition+0x286/0x298
> [ 2517.250087] [<0000000196c873da>] scsi_complete+0x6a/0x108
> [ 2517.250095] [<0000000196a5aeea>] blk_complete_reqs+0x6a/0x88
> [ 2517.250102] [<000000019704d30a>] __do_softirq+0x13a/0x6b8
> [ 2517.250109] [<000000019626b802>] __irq_exit_rcu+0x14a/0x170
> [ 2517.250116] [<000000019626c372>] irq_exit_rcu+0x22/0x50
> [ 2517.250123] [<0000000197036fda>] do_ext_irq+0xba/0x1d0
> [ 2517.250130] [<000000019704ad06>] ext_int_handler+0xd6/0x110
> [ 2517.250136] [<0000000197049ac2>] _raw_spin_unlock_irq+0x42/0x70
> [ 2517.250143] ([<0000000197049abe>] _raw_spin_unlock_irq+0x3e/0x70)
> [ 2517.250150] [<0000000197040cdc>] __wait_for_common+0xcc/0x240
> [ 2517.250157] [<0000000196a5bf8e>] blk_execute_rq+0x126/0x1f8
> [ 2517.250165] [<0000000196c84f32>] __scsi_execute+0x112/0x260
> [ 2517.250172] [<000003ff7fb9d698>] alua_rtpg+0x138/0xb10 [scsi_dh_alua]
> [ 2517.250179] [<000003ff7fb9e32c>] alua_rtpg_work+0x2bc/0x4e0 [scsi_dh_alua]
> [ 2517.250186] [<000000019628c244>] process_one_work+0x30c/0x730
> [ 2517.250197] [<000000019628c6ca>] worker_thread+0x62/0x420
> [ 2517.250205] [<0000000196297b08>] kthread+0x138/0x150
> [ 2517.250214] [<000000019620f92c>] __ret_from_fork+0x3c/0x58
> [ 2517.250222] [<000000019704aa8a>] ret_from_fork+0xa/0x40
> [ 2517.250229] 2 locks held by kworker/0:1/121221:
> [ 2517.250235] #0: 000000008ba79148 ((wq_completion)kaluad){+.+.}-{0:0}, at: process_one_work+0x232/0x730
> [ 2517.250256] #1: 000003800695fdc8 ((work_completion)(&(&pg->rtpg_work)->work)){+.+.}-{0:0}, at: process_one_work+0x232/0x730
> [ 2517.250276] Last Breaking-Event-Address:
> [ 2517.250281] [<000000019701d85e>] __warn_printk+0x66/0x70
> [ 2517.250291] Kernel panic - not syncing: kernel: panic_on_warn set ...
--
Mit freundlichen Gruessen / Kind regards
Steffen Maier
Linux on IBM Z and LinuxONE
https://www.ibm.com/privacy/us/en/
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschaeftsfuehrung: David Faller
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294
* Re: kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING
2023-01-24 11:16 ` Steffen Maier
@ 2023-01-24 11:36 ` Martin Wilck
0 siblings, 0 replies; 18+ messages in thread
From: Martin Wilck @ 2023-01-24 11:36 UTC (permalink / raw)
To: Steffen Maier, Bart Van Assche, linux-scsi
Cc: Martin K. Petersen, James E . J . Bottomley, Sachin Sant,
Hannes Reinecke, Benjamin Block, linux-s390
On Tue, 2023-01-24 at 12:16 +0100, Steffen Maier wrote:
> On 1/18/23 17:17, Steffen Maier wrote:
>
> >
> > I had removed those two lines yesterday for our CI kernel build.
> > Tonight's run obviously no longer had any related BUG or WARNING.
> > I checked all dumps from that run to see if anything stalled and
> > whether it was
> > related to ALUA, but I think we're good.
> >
> > Tested-by: Steffen Maier <maier@linux.ibm.com>
>
> I'm afraid, that might have been too early.
> Today, I got BUG/WARNING with a slightly different stack trace where
> alua_rtpg_queue calls scsi_device_put(), which in turn contains a
> might_sleep
> but seems called in atomic context:
>
> > [ 2517.231562] sd 13:0:0:1073823768: Power-on or device reset
> > occurred
> > [ 2517.231582] sd 13:0:0:1073823768: [sdax] tag#2787 Done:
> > ADD_TO_MLQUEUE Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> > cmd_age=0s
> > [ 2517.231590] sd 13:0:0:1073823768: [sdax] tag#2787 CDB: Test Unit
> > Ready 00 00 00 00 00 00
> > [ 2517.231598] sd 13:0:0:1073823768: [sdax] tag#2787 Sense Key :
> > Unit Attention [current]
> > [ 2517.231605] sd 13:0:0:1073823768: [sdax] tag#2787 Add. Sense:
> > Power on, reset, or bus device reset occurred
> > [ 2517.236104] sd 13:0:0:1074348056: Power-on or device reset
> > occurred
> > [ 2517.236124] BUG: sleeping function called from invalid context
> > at drivers/scsi/scsi.c:591
> > [ 2517.236130] in_atomic(): 1, irqs_disabled(): 0, non_block: 0,
> > pid: 166768, name: systemd-udevd
> > [ 2517.236137] preempt_count: 100, expected: 0
> > [ 2517.236143] RCU nest depth: 0, expected: 0
> > [ 2517.236148] no locks held by systemd-udevd/166768.
> > [ 2517.236154] Preemption disabled at:
> > [ 2517.236157] [<000000019704d22e>] __do_softirq+0x5e/0x6b8
> > [ 2517.236177] CPU: 2 PID: 166768 Comm: systemd-udevd Tainted:
> > G K 6.2.0-
> > 20230123.rc5.git2.9dea08313ff5.300.fc37.s390x+debug #1
> > [ 2517.236185] Hardware name: IBM 8561 T01 703 (z/VM 7.3.0)
> > [ 2517.236190] Call Trace:
> > [ 2517.236195] [<00000001970367cc>] dump_stack_lvl+0xac/0x100
> > [ 2517.236203] [<00000001962a590c>] __might_resched+0x284/0x2c8
> > [ 2517.236213] [<0000000196c7b34a>] scsi_device_put+0x42/0x60
> > [ 2517.236224] [<000003ff7fb9c57e>]
> > alua_rtpg_queue.part.0+0xce/0x348 [scsi_dh_alua]
> > [ 2517.236234] [<000003ff7fb9d20a>] alua_check+0x132/0x260
> > [scsi_dh_alua]
> > [ 2517.236241] [<000003ff7fb9d4aa>] alua_check_sense+0x172/0x228
> > [scsi_dh_alua]
> > [ 2517.236248] [<0000000196c7fd0e>] scsi_check_sense+0x86/0x2e0
> > [ 2517.236256] [<0000000196c82cc6>]
> > scsi_decide_disposition+0x286/0x298
> > [ 2517.236262] [<0000000196c873da>] scsi_complete+0x6a/0x108
> > [ 2517.236269] [<0000000196a5aeea>] blk_complete_reqs+0x6a/0x88
> > [ 2517.236281] [<000000019704d30a>] __do_softirq+0x13a/0x6b8
> > [ 2517.236287] [<000000019626b802>] __irq_exit_rcu+0x14a/0x170
> > [ 2517.236297] [<000000019626c372>] irq_exit_rcu+0x22/0x50
> > [ 2517.236303] [<0000000197036fda>] do_ext_irq+0xba/0x1d0
> > [ 2517.236309] [<000000019704ad06>] ext_int_handler+0xd6/0x110
> > [ 2517.236315] [<00000001963accd2>] seccomp_run_filters+0x9a/0x198
> > [ 2517.236328] [<00000001963ad5bc>] __seccomp_filter+0x4c/0x3b8
> > [ 2517.236334] [<0000000196335f1a>]
> > syscall_trace_enter.constprop.0+0xda/0x310
> > [ 2517.236345] [<0000000197036bf0>] __do_syscall+0xf0/0x208
> > [ 2517.236350] [<000000019704aa52>] system_call+0x82/0xb0
> > [ 2517.236356] no locks held by systemd-udevd/166768.
>
> The same can also happen outside of process context, where it
> happened to run
> alua_rtpg() before an IRQ happened for :
>
> > [ 2517.249685] ------------[ cut here ]------------
> > [ 2517.249691] do not call blocking ops when !TASK_RUNNING; state=2
> > set at [<0000000197040cb2>] __wait_for_common+0xa2/0x240
> > [ 2517.249710] WARNING: CPU: 0 PID: 121221 at
> > kernel/sched/core.c:9959 __might_sleep+0x7c/0x98
> > [ 2517.249719] Modules linked in: kvm af_iucv algif_hash af_alg
> > nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
> > nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat
> > nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables
> > nfnetlink dm_service_time sunrpc zfcp scsi_transport_fc s390_trng
> > vfio_ccw mdev vfio_iommu_type1 vfio sch_fq_codel ip6_tables
> > ip_tables x_tables configfs ghash_s390 prng chacha_s390 libchacha
> > aes_s390 des_s390 libdes sha3_512_s390 sha3_256_s390 nvme
> > sha512_s390 sha256_s390 sha1_s390 sha_common nvme_core scsi_dh_rdac
> > scsi_dh_emc scsi_dh_alua pkey zcrypt rng_core dm_multipath autofs4
> > [ 2517.249869] Unloaded tainted modules: test_klp_state3(K):1
> > test_klp_state2(K):4 test_klp_state(K):3
> > test_klp_callbacks_demo2(K):2 test_klp_callbacks_demo(K):12
> > test_klp_atomic_replace(K):2 test_klp_livepatch(K):6 [last
> > unloaded: test_klp_callbacks_demo(K)]
> > [ 2517.249907] CPU: 0 PID: 121221 Comm: kworker/0:1 Tainted:
> > G W K 6.2.0-
> > 20230123.rc5.git2.9dea08313ff5.300.fc37.s390x+debug #1
> > [ 2517.249915] Hardware name: IBM 8561 T01 703 (z/VM 7.3.0)
> > [ 2517.249921] Workqueue: kaluad alua_rtpg_work [scsi_dh_alua]
> > [ 2517.249931] Krnl PSW : 0704d00180000000 00000001962a59d0
> > (__might_sleep+0x80/0x98)
> > [ 2517.249944] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3
> > CC:1 PM:0 RI:0 EA:3
> > [ 2517.249953] Krnl GPRS: c0000000ffffbfff 0000000080000101
> > 000000000000006d 00000001974ae114
> > [ 2517.249960] 0000037ffff339a0 0000037ffff33998
> > 0000000000000000 0000000000000001
> > [ 2517.249966] 0700037ffff33b50 00000000be69c000
> > 000000000000024f 00000001974cb458
> > [ 2517.249973] 00000000a4080100 00000000a5344220
> > 00000001962a59cc 0000037ffff33b30
> > [ 2517.249985] Krnl Code: 00000001962a59c0:
> > c020008c269f larl %r2,000000019742a6fe
> > 00000001962a59c6:
> > c0e5006bbf19 brasl %r14,000000019701d7f8
> > #00000001962a59cc:
> > af000000 mc 0,0
> > >00000001962a59d0:
> > a7490000 lghi %r4,0
> > 00000001962a59d4:
> > b904003a lgr %r3,%r10
> > 00000001962a59d8:
> > b904002b lgr %r2,%r11
> > 00000001962a59dc:
> > ebaff0a00004 lmg %r10,%r15,160(%r15)
> > 00000001962a59e2:
> > c0f4fffffe53 brcl 15,00000001962a5688
> > [ 2517.250023] Call Trace:
> > [ 2517.250028] [<00000001962a59d0>] __might_sleep+0x80/0x98
> > [ 2517.250036] ([<00000001962a59cc>] __might_sleep+0x7c/0x98)
> > [ 2517.250043] [<0000000196c7b34a>] scsi_device_put+0x42/0x60
> > [ 2517.250050] [<000003ff7fb9c57e>]
> > alua_rtpg_queue.part.0+0xce/0x348 [scsi_dh_alua]
> > [ 2517.250058] [<000003ff7fb9d20a>] alua_check+0x132/0x260
> > [scsi_dh_alua]
> > [ 2517.250066] [<000003ff7fb9d4aa>] alua_check_sense+0x172/0x228
> > [scsi_dh_alua]
> > [ 2517.250073] [<0000000196c7fd0e>] scsi_check_sense+0x86/0x2e0
> > [ 2517.250080] [<0000000196c82cc6>]
> > scsi_decide_disposition+0x286/0x298
> > [ 2517.250087] [<0000000196c873da>] scsi_complete+0x6a/0x108
> > [ 2517.250095] [<0000000196a5aeea>] blk_complete_reqs+0x6a/0x88
> > [ 2517.250102] [<000000019704d30a>] __do_softirq+0x13a/0x6b8
> > [ 2517.250109] [<000000019626b802>] __irq_exit_rcu+0x14a/0x170
> > [ 2517.250116] [<000000019626c372>] irq_exit_rcu+0x22/0x50
> > [ 2517.250123] [<0000000197036fda>] do_ext_irq+0xba/0x1d0
> > [ 2517.250130] [<000000019704ad06>] ext_int_handler+0xd6/0x110
> > [ 2517.250136] [<0000000197049ac2>] _raw_spin_unlock_irq+0x42/0x70
> > [ 2517.250143] ([<0000000197049abe>]
> > _raw_spin_unlock_irq+0x3e/0x70)
> > [ 2517.250150] [<0000000197040cdc>] __wait_for_common+0xcc/0x240
> > [ 2517.250157] [<0000000196a5bf8e>] blk_execute_rq+0x126/0x1f8
> > [ 2517.250165] [<0000000196c84f32>] __scsi_execute+0x112/0x260
> > [ 2517.250172] [<000003ff7fb9d698>] alua_rtpg+0x138/0xb10
> > [scsi_dh_alua]
> > [ 2517.250179] [<000003ff7fb9e32c>] alua_rtpg_work+0x2bc/0x4e0
> > [scsi_dh_alua]
> > [ 2517.250186] [<000000019628c244>] process_one_work+0x30c/0x730
> > [ 2517.250197] [<000000019628c6ca>] worker_thread+0x62/0x420
> > [ 2517.250205] [<0000000196297b08>] kthread+0x138/0x150
> > [ 2517.250214] [<000000019620f92c>] __ret_from_fork+0x3c/0x58
> > [ 2517.250222] [<000000019704aa8a>] ret_from_fork+0xa/0x40
> > [ 2517.250229] 2 locks held by kworker/0:1/121221:
> > [ 2517.250235] #0: 000000008ba79148 ((wq_completion)kaluad){+.+.}-
> > {0:0}, at: process_one_work+0x232/0x730
> > [ 2517.250256] #1: 000003800695fdc8 ((work_completion)(&(&pg-
> > >rtpg_work)->work)){+.+.}-{0:0}, at: process_one_work+0x232/0x730
> > [ 2517.250276] Last Breaking-Event-Address:
> > [ 2517.250281] [<000000019701d85e>] __warn_printk+0x66/0x70
> > [ 2517.250291] Kernel panic - not syncing: kernel: panic_on_warn
> > set ...
>
I assume that Bart's previous reasoning applies here, too.
scsi_device_put() sleeps only if it releases the last reference to the
device. The calling stack, working on an I/O of the device in question,
must hold another reference to the scsi_device, so the ref being put
by alua_check->alua_rtpg_queue() can't be the last one.
Consequently, following this line of reasoning, we could remove the
might_sleep() in scsi_device_put(), too, eliminating this issue. But
that would mean that we could no longer detect other, actually broken
callers of scsi_device_put(), now or in the future.
Perhaps we should introduce something like scsi_device_put_safe(),
to be called only from contexts where we are certain that another
reference must exist? It's the only possibility I see, but it doesn't
feel quite right.
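To make the trade-off concrete, here is a rough userspace sketch of the idea.
It is not kernel code and not a proposed patch: every name in it
(model_dev, model_dev_put(), model_dev_put_not_last(), model_in_atomic) is
hypothetical, and the "atomic context" is only a flag. It models a put that
is annotated as possibly sleeping for every caller, next to a variant that
never does the sleeping work and refuses to drop the last reference.

```c
/* Userspace model only. All names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>

struct model_dev {
	int refcount;   /* stands in for the reference count on the device */
	bool released;  /* set once the last reference has been dropped */
};

/* Stand-in for "we are in softirq / atomic context". */
static bool model_in_atomic;

/* Number of times a might-sleep check fired while in atomic context. */
static int model_atomic_violations;

/* Models the might_sleep() annotation: complains regardless of refcount. */
static void model_might_sleep(void)
{
	if (model_in_atomic)
		model_atomic_violations++;
}

static void model_dev_get(struct model_dev *d)
{
	d->refcount++;
}

/*
 * Models a put that is annotated for all callers, although only the
 * final put performs the release work that could really sleep.
 */
static void model_dev_put(struct model_dev *d)
{
	model_might_sleep();
	if (--d->refcount == 0)
		d->released = true;
}

/*
 * Hypothetical variant for callers that know another reference exists.
 * It never releases the object; if the caller's assumption is wrong it
 * returns false and leaves the count untouched instead of sleeping.
 */
static bool model_dev_put_not_last(struct model_dev *d)
{
	if (d->refcount <= 1)
		return false;
	d->refcount--;
	return true;
}
```

In this model the annotated put still flags every atomic caller, so broken
callers stay detectable, while the second variant is usable from atomic
context but has to report failure when its assumption does not hold. What a
real caller would do on that failure is left open here.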
Regards
Martin
Thread overview: 18+ messages
2023-01-16 14:59 kernel BUG scsi_dh_alua sleeping from invalid context && kernel WARNING do not call blocking ops when !TASK_RUNNING Steffen Maier
2023-01-16 16:57 ` Martin Wilck
2023-01-16 17:48 ` Bart Van Assche
2023-01-16 17:58 ` Martin Wilck
2023-01-17 9:28 ` Martin Wilck
2023-01-17 18:50 ` Bart Van Assche
2023-01-17 21:48 ` Martin Wilck
2023-01-17 21:52 ` Bart Van Assche
2023-01-17 22:03 ` Martin Wilck
2023-01-18 0:29 ` Bart Van Assche
2023-01-18 8:45 ` Martin Wilck
2023-01-18 16:17 ` Steffen Maier
2023-01-24 11:16 ` Steffen Maier
2023-01-24 11:36 ` Martin Wilck
2023-01-16 17:55 ` Bart Van Assche
2023-01-16 18:12 ` Steffen Maier
2023-01-16 18:31 ` Bart Van Assche
2023-01-17 7:46 ` Martin Wilck