* [BUG] gfs2: sleeping lock in gfs2_quota_init() with preempt disabled on PREEMPT_RT
@ 2025-08-12 5:39 Yunseong Kim
From: Yunseong Kim @ 2025-08-12 5:39 UTC (permalink / raw)
To: Andreas Gruenbacher
Cc: Christian Brauner, gfs2, linux-fsdevel, linux-rt-devel, syzkaller,
Austin Kim
While testing with PREEMPT_RT enabled, I encountered a potential issue in
gfs2_quota_init() where preemption is disabled via bit_spin_lock(), but a
subsequent call eventually reaches a sleepable lock.
Below is the simplified call flow:
[PREEMPT_RT]

gfs2_quota_init()
        |
        v
spin_lock_bucket()
        |-- bit_spin_lock()
        |        |
        |        +--> preempt_disable()  [Preemption disabled]
        v
gfs2_qd_search_bucket()
        |
        v
lockref_get_not_dead()
        |
        v
spin_lock(&lockref->lock)  [Sleeping lock taken with preemption disabled]
[NON-PREEMPT_RT]

gfs2_quota_init()
        |
        v
spin_lock_bucket()
        |-- bit_spin_lock()
        |        |
        |        +--> preempt_disable()  [Preemption disabled]
        v
gfs2_qd_search_bucket()
        |
        v
lockref_get_not_dead()
        |
        v
spin_lock(&lockref->lock)  [Non-sleeping spinlock]
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 15311, name: syz.6.947
preempt_count: 1, expected: 0
RCU nest depth: 1, expected: 1
Preemption disabled at:
[<ffff800081fa916c>] bit_spin_lock include/linux/bit_spinlock.h:25 [inline]
[<ffff800081fa916c>] hlist_bl_lock include/linux/list_bl.h:148 [inline]
[<ffff800081fa916c>] spin_lock_bucket+0x60/0x190 fs/gfs2/quota.c:98
CPU: 2 UID: 0 PID: 15311 Comm: syz.6.947 Not tainted 6.16.0-rc1-rt1-dirty #12 PREEMPT_RT
Hardware name: QEMU KVM Virtual Machine, BIOS 2025.02-8 05/13/2025
Call trace:
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:466 (C)
__dump_stack+0x30/0x40 lib/dump_stack.c:94
dump_stack_lvl+0x148/0x1d8 lib/dump_stack.c:120
dump_stack+0x1c/0x3c lib/dump_stack.c:129
__might_resched+0x2e4/0x52c kernel/sched/core.c:8800
__rt_spin_lock kernel/locking/spinlock_rt.c:48 [inline]
rt_spin_lock+0xa8/0x1bc kernel/locking/spinlock_rt.c:57
spin_lock include/linux/spinlock_rt.h:44 [inline]
lockref_get_not_dead+0x30/0xd0 lib/lockref.c:155
gfs2_qd_search_bucket fs/gfs2/quota.c:269 [inline]
gfs2_quota_init+0x778/0x1114 fs/gfs2/quota.c:1460
gfs2_make_fs_rw+0x144/0x26c fs/gfs2/super.c:149
gfs2_fill_super+0x1448/0x1a38 fs/gfs2/ops_fstype.c:1278
get_tree_bdev_flags+0x360/0x414 fs/super.c:1679
get_tree_bdev+0x2c/0x3c fs/super.c:1702
gfs2_get_tree+0x54/0x1b4 fs/gfs2/ops_fstype.c:1335
vfs_get_tree+0x90/0x28c fs/super.c:1802
do_new_mount+0x228/0x814 fs/namespace.c:3885
path_mount+0x574/0xd64 fs/namespace.c:4209
do_mount fs/namespace.c:4222 [inline]
__do_sys_mount fs/namespace.c:4433 [inline]
__se_sys_mount fs/namespace.c:4410 [inline]
__arm64_sys_mount+0x3e8/0x468 fs/namespace.c:4410
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x40/0x13c arch/arm64/kernel/entry-common.c:767
el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:786
el0t_64_sync+0x1ac/0x1b0 arch/arm64/kernel/entry.S:600
On non-RT kernels this is fine because spin_lock() does not sleep, but on
PREEMPT_RT most spinlocks are substituted with sleeping locks, which
results in sleeping while preemption is disabled.
At the moment, two possible approaches come to mind:

1. Avoid using the bit-spinlock in this path entirely
   - Rework the bucket protection so we do not call into code that may sleep
     while preemption is disabled.
   - Guarantees no sleeping while preemption is disabled.
   - May require reworking the bucket protection/race re-check logic.

2. Replace the inner lock with a non-sleeping primitive
   - e.g., use a raw_spinlock_t for lockref->lock so it can be taken while
     preemption is disabled.
   - Since lockref does not currently support raw_spinlock_t, this would
     require additional rework of the lockref structure.
   - Minimal structural change; avoids sleeping in the problematic context.
I can help test patches for this on both PREEMPT_RT and NON-RT kernels.
Thanks,
Yunseong Kim
* Re: [BUG] gfs2: sleeping lock in gfs2_quota_init() with preempt disabled on PREEMPT_RT
From: Sebastian Andrzej Siewior @ 2025-08-12 10:38 UTC (permalink / raw)
To: Yunseong Kim
Cc: Andreas Gruenbacher, Christian Brauner, gfs2, linux-fsdevel,
linux-rt-devel, syzkaller, Austin Kim
On 2025-08-12 14:39:29 [+0900], Yunseong Kim wrote:
> While testing with PREEMPT_RT enabled, I encountered a potential issue in
> gfs2_quota_init() where preemption is disabled via bit_spin_lock(), but a
> subsequent call eventually reaches a sleepable lock.
>
> Below is the simplified call flow:
> [PREEMPT_RT]
…
> At the moment, two possible approaches come to mind:
>
> 1. Avoid using the bit-spinlock in this path entirely
>    - Rework the bucket protection so we do not call into code that may sleep
>      while preemption is disabled.
>    - Guarantees no sleeping while preemption is disabled.
>    - May require reworking the bucket protection/race re-check logic.
So this would add 4096 spinlocks for RT. Given that those are a bit bigger
than the other ones, it will use a bit more than 16KiB of memory.
Maybe fewer locks could be used, assuming lock sharing is fine here, but I
haven't checked the code.
> 2. Replace the inner lock with a non-sleeping primitive
>    - e.g., use a raw_spinlock_t for lockref->lock so it can be taken while
>      preemption is disabled.
>    - Since lockref does not currently support raw_spinlock_t, this would
>      require additional rework of the lockref structure.
>    - Minimal structural change; avoids sleeping in the problematic context.
This is something.
Looking at the layout of struct lockref, USE_CMPXCHG_LOCKREF can only
work if the spinlock is a raw spinlock without lockdep. So it expects
the lock to be unlocked and does a 64-bit cmpxchg() to inc/dec the
counter that follows.
I was confused why lockref_mark_dead() works for dcache given that the
lock is not released outside of lockref.c except for
lockref_put_or_lock(). But then I noticed d_lock.
So making the lock a raw_spinlock_t would break the few
lockref_put_or_lock() users. The impact on dcache is huge, and I noticed
some waitqueue handling which would break. So there is that.
Based on this I think it should be handled somehow within gfs2.
> I can help test patches for this on both PREEMPT_RT and NON-RT kernels.
>
>
> Thanks,
> Yunseong Kim
Sebastian