* [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
@ 2025-10-02 5:22 Kuniyuki Iwashima
2025-10-02 5:49 ` Sebastian Andrzej Siewior
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Kuniyuki Iwashima @ 2025-10-02 5:22 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
Cc: Tiffany Yang, Kuniyuki Iwashima, Kuniyuki Iwashima, cgroups,
linux-rt-devel, syzbot+27a2519eb4dad86d0156
syzbot reported the splat below. [0]
Commit afa3701c0e45 ("cgroup: cgroup.stat.local time accounting")
introduced cgrp->freezer.freeze_seq.
The writer side is under spin_lock_irq(), but the section is still
preemptible with CONFIG_PREEMPT_RT=y.
Let's wrap the section with preempt_{disable,enable}_nested().
[0]:
WARNING: CPU: 0 PID: 6076 at ./include/linux/seqlock.h:221 __seqprop_assert include/linux/seqlock.h:221 [inline]
WARNING: CPU: 0 PID: 6076 at ./include/linux/seqlock.h:221 cgroup_do_freeze kernel/cgroup/freezer.c:182 [inline]
WARNING: CPU: 0 PID: 6076 at ./include/linux/seqlock.h:221 cgroup_freeze+0x80a/0xf90 kernel/cgroup/freezer.c:309
Modules linked in:
CPU: 0 UID: 0 PID: 6076 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT_{RT,(full)}
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025
RIP: 0010:__seqprop_assert include/linux/seqlock.h:221 [inline]
RIP: 0010:cgroup_do_freeze kernel/cgroup/freezer.c:182 [inline]
RIP: 0010:cgroup_freeze+0x80a/0xf90 kernel/cgroup/freezer.c:309
Code: 90 e9 9e fb ff ff 44 89 f1 80 e1 07 80 c1 03 38 c1 0f 8c e7 f9 ff ff 4c 89 f7 e8 e1 43 67 00 e9 da f9 ff ff e8 17 68 06 00 90 <0f> 0b 90 e9 10 fc ff ff 44 89 f9 80 e1 07 38 c1 48 8b 0c 24 0f 8c
RSP: 0018:ffffc90003b178e0 EFLAGS: 00010293
RAX: ffffffff81b6c6a9 RBX: 0000000000000000 RCX: ffff88803671bc00
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc90003b17a70 R08: 0000000000000000 R09: 0000000000000000
R10: dffffc0000000000 R11: fffffbfff1d6d2a7 R12: dffffc0000000000
R13: 0000000000000000 R14: 0000000000000001 R15: ffff88803623a791
FS: 00005555915ae500(0000) GS:ffff888127017000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b33363fff CR3: 0000000023b98000 CR4: 00000000003526f0
Call Trace:
<TASK>
cgroup_freeze_write+0x156/0x1c0 kernel/cgroup/cgroup.c:4174
cgroup_file_write+0x39b/0x740 kernel/cgroup/cgroup.c:4312
kernfs_fop_write_iter+0x3b0/0x540 fs/kernfs/file.c:352
new_sync_write fs/read_write.c:593 [inline]
vfs_write+0x5d5/0xb40 fs/read_write.c:686
ksys_write+0x14b/0x260 fs/read_write.c:738
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f0dc3e9eec9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd9b7b6198 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f0dc40f5fa0 RCX: 00007f0dc3e9eec9
RDX: 0000000000000012 RSI: 0000200000000200 RDI: 0000000000000004
RBP: 00007f0dc3f21f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f0dc40f5fa0 R14: 00007f0dc40f5fa0 R15: 0000000000000003
</TASK>
Fixes: afa3701c0e45 ("cgroup: cgroup.stat.local time accounting")
Reported-by: syzbot+27a2519eb4dad86d0156@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68de0b21.050a0220.25d7ab.077d.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
kernel/cgroup/freezer.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/cgroup/freezer.c b/kernel/cgroup/freezer.c
index 6c18854bff34..7e779c8a6f89 100644
--- a/kernel/cgroup/freezer.c
+++ b/kernel/cgroup/freezer.c
@@ -179,6 +179,7 @@ static void cgroup_do_freeze(struct cgroup *cgrp, bool freeze, u64 ts_nsec)
lockdep_assert_held(&cgroup_mutex);
spin_lock_irq(&css_set_lock);
+ preempt_disable_nested();
write_seqcount_begin(&cgrp->freezer.freeze_seq);
if (freeze) {
set_bit(CGRP_FREEZE, &cgrp->flags);
@@ -189,6 +190,7 @@ static void cgroup_do_freeze(struct cgroup *cgrp, bool freeze, u64 ts_nsec)
cgrp->freezer.freeze_start_nsec);
}
write_seqcount_end(&cgrp->freezer.freeze_seq);
+ preempt_enable_nested();
spin_unlock_irq(&css_set_lock);
if (freeze)
--
2.51.0.618.g983fd99d29-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
2025-10-02 5:22 [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y Kuniyuki Iwashima
@ 2025-10-02 5:49 ` Sebastian Andrzej Siewior
2025-10-02 8:28 ` Michal Koutný
` (2 subsequent siblings)
3 siblings, 0 replies; 14+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-10-02 5:49 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: Tejun Heo, Johannes Weiner, Michal Koutný, Clark Williams,
Steven Rostedt, Tiffany Yang, Kuniyuki Iwashima, cgroups,
linux-rt-devel, syzbot+27a2519eb4dad86d0156
On 2025-10-02 05:22:07 [+0000], Kuniyuki Iwashima wrote:
> syzbot reported the splat below. [0]
>
> Commit afa3701c0e45 ("cgroup: cgroup.stat.local time accounting")
> introduced cgrp->freezer.freeze_seq.
>
> The writer side is under spin_lock_irq(), but the section is still
> preemptible with CONFIG_PREEMPT_RT=y.
>
> Let's wrap the section with preempt_{disable,enable}_nested().
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Sebastian
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
2025-10-02 5:22 [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y Kuniyuki Iwashima
2025-10-02 5:49 ` Sebastian Andrzej Siewior
@ 2025-10-02 8:28 ` Michal Koutný
2025-10-02 16:22 ` Kuniyuki Iwashima
2025-10-02 18:11 ` Waiman Long
2025-10-02 16:45 ` Tejun Heo
2025-10-04 8:35 ` Tiffany Yang
3 siblings, 2 replies; 14+ messages in thread
From: Michal Koutný @ 2025-10-02 8:28 UTC (permalink / raw)
To: Kuniyuki Iwashima, Sebastian Andrzej Siewior
Cc: Tejun Heo, Johannes Weiner, Clark Williams, Steven Rostedt,
Tiffany Yang, Kuniyuki Iwashima, cgroups, linux-rt-devel,
syzbot+27a2519eb4dad86d0156
[-- Attachment #1: Type: text/plain, Size: 668 bytes --]
Hello.
Thanks for looking into this Kuniyuki.
On Thu, Oct 02, 2025 at 05:22:07AM +0000, Kuniyuki Iwashima <kuniyu@google.com> wrote:
> The writer side is under spin_lock_irq(), but the section is still
> preemptible with CONFIG_PREEMPT_RT=y.
I see similar construction in other places, e.g.
mems_allowed_seq in set_mems_allowed
period_seqcount in ioc_start_period
pidmap_lock_seq in alloc_pid/pidfs_add_pid
(where their outer lock becomes preemptible on PREEMPT_RT.)
> Let's wrap the section with preempt_{disable,enable}_nested().
Is it better to wrap them all (for CONFIG_PREEMPT_RT=y) or should they
become seqlock_t on CONFIG_PREEMPT_RT=y?
Thanks,
Michal
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
2025-10-02 8:28 ` Michal Koutný
@ 2025-10-02 16:22 ` Kuniyuki Iwashima
2025-10-02 16:55 ` Sebastian Andrzej Siewior
2025-10-02 18:11 ` Waiman Long
1 sibling, 1 reply; 14+ messages in thread
From: Kuniyuki Iwashima @ 2025-10-02 16:22 UTC (permalink / raw)
To: Michal Koutný
Cc: Sebastian Andrzej Siewior, Tejun Heo, Johannes Weiner,
Clark Williams, Steven Rostedt, Tiffany Yang, Kuniyuki Iwashima,
cgroups, linux-rt-devel, syzbot+27a2519eb4dad86d0156
On Thu, Oct 2, 2025 at 1:28 AM Michal Koutný <mkoutny@suse.com> wrote:
>
> Hello.
>
> Thanks for looking into this Kuniyuki.
>
> On Thu, Oct 02, 2025 at 05:22:07AM +0000, Kuniyuki Iwashima <kuniyu@google.com> wrote:
> > The writer side is under spin_lock_irq(), but the section is still
> > preemptible with CONFIG_PREEMPT_RT=y.
>
> I see similar construction in other places, e.g.
> mems_allowed_seq in set_mems_allowed
IIUC, local_irq_save() or local_irq_disable() is used
for the writer of mems_allowed_seq, so there should
be not preemptible.
> period_seqcount in ioc_start_period
> pidmap_lock_seq in alloc_pid/pidfs_add_pid
These two seem to have the same problem.
> (where their outer lock becomes preemptible on PREEMPT_RT.)
>
> > Let's wrap the section with preempt_{disable,enable}_nested().
>
> Is it better to wrap them all (for CONFIG_PREEMPT_RT=y) or should they
> become seqlock_t on CONFIG_PREEMPT_RT=y?
I think wrapping them would be better as the wrapper is just
an lockdep assertion when PREEMPT_RT=n
Thanks!
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
2025-10-02 5:22 [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y Kuniyuki Iwashima
2025-10-02 5:49 ` Sebastian Andrzej Siewior
2025-10-02 8:28 ` Michal Koutný
@ 2025-10-02 16:45 ` Tejun Heo
2025-10-02 16:55 ` Sebastian Andrzej Siewior
2025-10-03 14:42 ` Tejun Heo
2025-10-04 8:35 ` Tiffany Yang
3 siblings, 2 replies; 14+ messages in thread
From: Tejun Heo @ 2025-10-02 16:45 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: Johannes Weiner, Michal Koutný, Sebastian Andrzej Siewior,
Clark Williams, Steven Rostedt, Tiffany Yang, cgroups,
linux-kernel
Applied to cgroup/for-6.18-fixes.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
2025-10-02 16:22 ` Kuniyuki Iwashima
@ 2025-10-02 16:55 ` Sebastian Andrzej Siewior
2025-10-02 17:14 ` Kuniyuki Iwashima
2025-10-02 17:17 ` Michal Koutný
0 siblings, 2 replies; 14+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-10-02 16:55 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: Michal Koutný, Tejun Heo, Johannes Weiner, Clark Williams,
Steven Rostedt, Tiffany Yang, Kuniyuki Iwashima, cgroups,
linux-rt-devel, syzbot+27a2519eb4dad86d0156
On 2025-10-02 09:22:25 [-0700], Kuniyuki Iwashima wrote:
> On Thu, Oct 2, 2025 at 1:28 AM Michal Koutný <mkoutny@suse.com> wrote:
> >
> > Hello.
> >
> > Thanks for looking into this Kuniyuki.
> >
> > On Thu, Oct 02, 2025 at 05:22:07AM +0000, Kuniyuki Iwashima <kuniyu@google.com> wrote:
> > > The writer side is under spin_lock_irq(), but the section is still
> > > preemptible with CONFIG_PREEMPT_RT=y.
> >
> > I see similar construction in other places, e.g.
> > mems_allowed_seq in set_mems_allowed
>
> IIUC, local_irq_save() or local_irq_disable() is used
> for the writer of mems_allowed_seq, so there should
> be not preemptible.
That local_irq_disable() looks odd.
mems_allowed_seq is different, it is associated with
task_struct::alloc_lock. This lock is acquired by set_mems_allowed() so
it is enough. That local_irq_disable() is there because seqcount
read side can be used from softirq.
>
> > period_seqcount in ioc_start_period
> > pidmap_lock_seq in alloc_pid/pidfs_add_pid
>
> These two seem to have the same problem.
Nope. period_seqcount is a seqcount_spinlock_t. So is pidmap_lock_seq.
> > (where their outer lock becomes preemptible on PREEMPT_RT.)
> >
> > > Let's wrap the section with preempt_{disable,enable}_nested().
> >
> > Is it better to wrap them all (for CONFIG_PREEMPT_RT=y) or should they
> > become seqlock_t on CONFIG_PREEMPT_RT=y?
>
> I think wrapping them would be better as the wrapper is just
> an lockdep assertion when PREEMPT_RT=n
Now that I swap in in everything.
If you have a "naked" seqcount_t then you need manually ensure that
there can be only one writer. And then need to disable preemption on top
of it in order to ensure that the writer makes always progress.
In the freezer case, may I suggest the following instead:
diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 539c64eeef38f..933c4487a8462 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -435,7 +435,7 @@ struct cgroup_freezer_state {
int nr_frozen_tasks;
/* Freeze time data consistency protection */
- seqcount_t freeze_seq;
+ seqcount_spinlock_t freeze_seq;
/*
* Most recent time the cgroup was requested to freeze.
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index eb9fc7ae65b08..c0215e7de3666 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5813,7 +5813,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
* if the parent has to be frozen, the child has too.
*/
cgrp->freezer.e_freeze = parent->freezer.e_freeze;
- seqcount_init(&cgrp->freezer.freeze_seq);
+ seqcount_spinlock_init(&cgrp->freezer.freeze_seq, &css_set_lock);
if (cgrp->freezer.e_freeze) {
/*
* Set the CGRP_FREEZE flag, so when a process will be
While former works, too this is way nicer. Not sure if it compiles but
it should.
> Thanks!
Sebastian
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
2025-10-02 16:45 ` Tejun Heo
@ 2025-10-02 16:55 ` Sebastian Andrzej Siewior
2025-10-02 17:04 ` Tejun Heo
2025-10-03 14:42 ` Tejun Heo
1 sibling, 1 reply; 14+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-10-02 16:55 UTC (permalink / raw)
To: Tejun Heo
Cc: Kuniyuki Iwashima, Johannes Weiner, Michal Koutný,
Clark Williams, Steven Rostedt, Tiffany Yang, cgroups,
linux-kernel
On 2025-10-02 06:45:01 [-1000], Tejun Heo wrote:
> Applied to cgroup/for-6.18-fixes.
10 minutes too slow then.
> Thanks.
>
Sebastian
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
2025-10-02 16:55 ` Sebastian Andrzej Siewior
@ 2025-10-02 17:04 ` Tejun Heo
2025-10-02 20:51 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2025-10-02 17:04 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Kuniyuki Iwashima, Johannes Weiner, Michal Koutný,
Clark Williams, Steven Rostedt, Tiffany Yang, cgroups,
linux-kernel
On Thu, Oct 02, 2025 at 06:55:49PM +0200, Sebastian Andrzej Siewior wrote:
> On 2025-10-02 06:45:01 [-1000], Tejun Heo wrote:
> > Applied to cgroup/for-6.18-fixes.
>
> 10 minutes too slow then.
Oh, I can easily revert. Just let me know what you want to happen.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
2025-10-02 16:55 ` Sebastian Andrzej Siewior
@ 2025-10-02 17:14 ` Kuniyuki Iwashima
2025-10-02 17:17 ` Michal Koutný
1 sibling, 0 replies; 14+ messages in thread
From: Kuniyuki Iwashima @ 2025-10-02 17:14 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Michal Koutný, Tejun Heo, Johannes Weiner, Clark Williams,
Steven Rostedt, Tiffany Yang, Kuniyuki Iwashima, cgroups,
linux-rt-devel, syzbot+27a2519eb4dad86d0156
On Thu, Oct 2, 2025 at 9:55 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2025-10-02 09:22:25 [-0700], Kuniyuki Iwashima wrote:
> > On Thu, Oct 2, 2025 at 1:28 AM Michal Koutný <mkoutny@suse.com> wrote:
> > >
> > > Hello.
> > >
> > > Thanks for looking into this Kuniyuki.
> > >
> > > On Thu, Oct 02, 2025 at 05:22:07AM +0000, Kuniyuki Iwashima <kuniyu@google.com> wrote:
> > > > The writer side is under spin_lock_irq(), but the section is still
> > > > preemptible with CONFIG_PREEMPT_RT=y.
> > >
> > > I see similar construction in other places, e.g.
> > > mems_allowed_seq in set_mems_allowed
> >
> > IIUC, local_irq_save() or local_irq_disable() is used
> > for the writer of mems_allowed_seq, so there should
> > be not preemptible.
>
> That local_irq_disable() looks odd.
> mems_allowed_seq is different, it is associated with
> task_struct::alloc_lock. This lock is acquired by set_mems_allowed() so
> it is enough. That local_irq_disable() is there because seqcount
> read side can be used from softirq.
>
> >
> > > period_seqcount in ioc_start_period
> > > pidmap_lock_seq in alloc_pid/pidfs_add_pid
> >
> > These two seem to have the same problem.
>
> Nope. period_seqcount is a seqcount_spinlock_t. So is pidmap_lock_seq.
Oh thanks, somehow I assumed all of them are seqcount,
-ENOCOFEE :p
>
> > > (where their outer lock becomes preemptible on PREEMPT_RT.)
> > >
> > > > Let's wrap the section with preempt_{disable,enable}_nested().
> > >
> > > Is it better to wrap them all (for CONFIG_PREEMPT_RT=y) or should they
> > > become seqlock_t on CONFIG_PREEMPT_RT=y?
> >
> > I think wrapping them would be better as the wrapper is just
> > an lockdep assertion when PREEMPT_RT=n
>
> Now that I swap in in everything.
>
> If you have a "naked" seqcount_t then you need manually ensure that
> there can be only one writer. And then need to disable preemption on top
> of it in order to ensure that the writer makes always progress.
>
> In the freezer case, may I suggest the following instead:
Isn't it a bit redundant when PREEMPT_RT=n as we take
spin_lock_irq() ?
>
> diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
> index 539c64eeef38f..933c4487a8462 100644
> --- a/include/linux/cgroup-defs.h
> +++ b/include/linux/cgroup-defs.h
> @@ -435,7 +435,7 @@ struct cgroup_freezer_state {
> int nr_frozen_tasks;
>
> /* Freeze time data consistency protection */
> - seqcount_t freeze_seq;
> + seqcount_spinlock_t freeze_seq;
>
> /*
> * Most recent time the cgroup was requested to freeze.
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index eb9fc7ae65b08..c0215e7de3666 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -5813,7 +5813,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
> * if the parent has to be frozen, the child has too.
> */
> cgrp->freezer.e_freeze = parent->freezer.e_freeze;
> - seqcount_init(&cgrp->freezer.freeze_seq);
> + seqcount_spinlock_init(&cgrp->freezer.freeze_seq, &css_set_lock);
> if (cgrp->freezer.e_freeze) {
> /*
> * Set the CGRP_FREEZE flag, so when a process will be
>
> While former works, too this is way nicer. Not sure if it compiles but
> it should.
>
> > Thanks!
>
> Sebastian
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
2025-10-02 16:55 ` Sebastian Andrzej Siewior
2025-10-02 17:14 ` Kuniyuki Iwashima
@ 2025-10-02 17:17 ` Michal Koutný
1 sibling, 0 replies; 14+ messages in thread
From: Michal Koutný @ 2025-10-02 17:17 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Kuniyuki Iwashima, Tejun Heo, Johannes Weiner, Clark Williams,
Steven Rostedt, Tiffany Yang, Kuniyuki Iwashima, cgroups,
linux-rt-devel, syzbot+27a2519eb4dad86d0156
[-- Attachment #1: Type: text/plain, Size: 593 bytes --]
On Thu, Oct 02, 2025 at 06:55:10PM +0200, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> Nope. period_seqcount is a seqcount_spinlock_t. So is pidmap_lock_seq.
Oops, I didn't check that write_seqcount_{being,end} are macros that may
accept different types and mistakenly assumed seqcount_t too.
> /* Freeze time data consistency protection */
> - seqcount_t freeze_seq;
> + seqcount_spinlock_t freeze_seq;
...
> While former works, too this is way nicer. Not sure if it compiles but
> it should.
Thanks! (It's therefore also better aligned with the other occurences.)
Michal
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
2025-10-02 8:28 ` Michal Koutný
2025-10-02 16:22 ` Kuniyuki Iwashima
@ 2025-10-02 18:11 ` Waiman Long
1 sibling, 0 replies; 14+ messages in thread
From: Waiman Long @ 2025-10-02 18:11 UTC (permalink / raw)
To: Michal Koutný, Kuniyuki Iwashima, Sebastian Andrzej Siewior
Cc: Tejun Heo, Johannes Weiner, Clark Williams, Steven Rostedt,
Tiffany Yang, Kuniyuki Iwashima, cgroups, linux-rt-devel,
syzbot+27a2519eb4dad86d0156
On 10/2/25 4:28 AM, Michal Koutný wrote:
> Hello.
>
> Thanks for looking into this Kuniyuki.
>
> On Thu, Oct 02, 2025 at 05:22:07AM +0000, Kuniyuki Iwashima <kuniyu@google.com> wrote:
>> The writer side is under spin_lock_irq(), but the section is still
>> preemptible with CONFIG_PREEMPT_RT=y.
> I see similar construction in other places, e.g.
> mems_allowed_seq in set_mems_allowed
> period_seqcount in ioc_start_period
> pidmap_lock_seq in alloc_pid/pidfs_add_pid
> (where their outer lock becomes preemptible on PREEMPT_RT.)
>
>> Let's wrap the section with preempt_{disable,enable}_nested().
> Is it better to wrap them all (for CONFIG_PREEMPT_RT=y) or should they
> become seqlock_t on CONFIG_PREEMPT_RT=y?
Changing it to seqlock_t will introduce another lock that need to be
handled. Alternatively, we can use seqcount_spinlock_t for freeze_seq
for the current case and associate css_set_lock with the sequence count
in seqcount_spinlock_init(). That should properly handle the case for
PREEMPT_RT kernel.
Cheers,
Longman
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
2025-10-02 17:04 ` Tejun Heo
@ 2025-10-02 20:51 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 14+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-10-02 20:51 UTC (permalink / raw)
To: Tejun Heo
Cc: Kuniyuki Iwashima, Johannes Weiner, Michal Koutný,
Clark Williams, Steven Rostedt, Tiffany Yang, cgroups,
linux-kernel
On 2025-10-02 07:04:39 [-1000], Tejun Heo wrote:
> On Thu, Oct 02, 2025 at 06:55:49PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2025-10-02 06:45:01 [-1000], Tejun Heo wrote:
> > > Applied to cgroup/for-6.18-fixes.
> >
> > 10 minutes too slow then.
>
> Oh, I can easily revert. Just let me know what you want to happen.
Oh, thank you. I will prepare than the suggest patch. I don't mind if
someone beats me to it ;)
> Thanks.
Sebastian
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
2025-10-02 16:45 ` Tejun Heo
2025-10-02 16:55 ` Sebastian Andrzej Siewior
@ 2025-10-03 14:42 ` Tejun Heo
1 sibling, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2025-10-03 14:42 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: Johannes Weiner, Michal Koutný, Sebastian Andrzej Siewior,
Clark Williams, Steven Rostedt, Tiffany Yang, cgroups,
linux-kernel
On Thu, Oct 02, 2025 at 06:45:01AM -1000, Tejun Heo wrote:
> Applied to cgroup/for-6.18-fixes.
I applied the following patch and dropped this one per the discussion on
this thread:
http://lkml.kernel.org/r/20251003114555.413804-1-nirbhay.lkd@gmail.com
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y.
2025-10-02 5:22 [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y Kuniyuki Iwashima
` (2 preceding siblings ...)
2025-10-02 16:45 ` Tejun Heo
@ 2025-10-04 8:35 ` Tiffany Yang
3 siblings, 0 replies; 14+ messages in thread
From: Tiffany Yang @ 2025-10-04 8:35 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: Tejun Heo, Johannes Weiner, Michal Koutný,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
Kuniyuki Iwashima, cgroups, linux-rt-devel,
syzbot+27a2519eb4dad86d0156
Hi Kuniyuki,
Kuniyuki Iwashima <kuniyu@google.com> writes:
> syzbot reported the splat below. [0]
> Commit afa3701c0e45 ("cgroup: cgroup.stat.local time accounting")
> introduced cgrp->freezer.freeze_seq.
> The writer side is under spin_lock_irq(), but the section is still
> preemptible with CONFIG_PREEMPT_RT=y.
> Let's wrap the section with preempt_{disable,enable}_nested().
> [0]:
> WARNING: CPU: 0 PID: 6076 at ./include/linux/seqlock.h:221
> __seqprop_assert include/linux/seqlock.h:221 [inline]
> WARNING: CPU: 0 PID: 6076 at ./include/linux/seqlock.h:221
> cgroup_do_freeze kernel/cgroup/freezer.c:182 [inline]
> WARNING: CPU: 0 PID: 6076 at ./include/linux/seqlock.h:221
> cgroup_freeze+0x80a/0xf90 kernel/cgroup/freezer.c:309
> Modules linked in:
> CPU: 0 UID: 0 PID: 6076 Comm: syz.0.17 Not tainted syzkaller #0
> PREEMPT_{RT,(full)}
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 08/18/2025
> RIP: 0010:__seqprop_assert include/linux/seqlock.h:221 [inline]
> RIP: 0010:cgroup_do_freeze kernel/cgroup/freezer.c:182 [inline]
> RIP: 0010:cgroup_freeze+0x80a/0xf90 kernel/cgroup/freezer.c:309
> Code: 90 e9 9e fb ff ff 44 89 f1 80 e1 07 80 c1 03 38 c1 0f 8c e7 f9 ff
> ff 4c 89 f7 e8 e1 43 67 00 e9 da f9 ff ff e8 17 68 06 00 90 <0f> 0b 90 e9
> 10 fc ff ff 44 89 f9 80 e1 07 38 c1 48 8b 0c 24 0f 8c
> RSP: 0018:ffffc90003b178e0 EFLAGS: 00010293
> RAX: ffffffff81b6c6a9 RBX: 0000000000000000 RCX: ffff88803671bc00
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffffc90003b17a70 R08: 0000000000000000 R09: 0000000000000000
> R10: dffffc0000000000 R11: fffffbfff1d6d2a7 R12: dffffc0000000000
> R13: 0000000000000000 R14: 0000000000000001 R15: ffff88803623a791
> FS: 00005555915ae500(0000) GS:ffff888127017000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000001b33363fff CR3: 0000000023b98000 CR4: 00000000003526f0
> Call Trace:
> <TASK>
> cgroup_freeze_write+0x156/0x1c0 kernel/cgroup/cgroup.c:4174
> cgroup_file_write+0x39b/0x740 kernel/cgroup/cgroup.c:4312
> kernfs_fop_write_iter+0x3b0/0x540 fs/kernfs/file.c:352
> new_sync_write fs/read_write.c:593 [inline]
> vfs_write+0x5d5/0xb40 fs/read_write.c:686
> ksys_write+0x14b/0x260 fs/read_write.c:738
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f0dc3e9eec9
> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89
> f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0
> ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007ffd9b7b6198 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> RAX: ffffffffffffffda RBX: 00007f0dc40f5fa0 RCX: 00007f0dc3e9eec9
> RDX: 0000000000000012 RSI: 0000200000000200 RDI: 0000000000000004
> RBP: 00007f0dc3f21f91 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f0dc40f5fa0 R14: 00007f0dc40f5fa0 R15: 0000000000000003
> </TASK>
> Fixes: afa3701c0e45 ("cgroup: cgroup.stat.local time accounting")
> Reported-by: syzbot+27a2519eb4dad86d0156@syzkaller.appspotmail.com
> Closes:
> https://lore.kernel.org/all/68de0b21.050a0220.25d7ab.077d.GAE@google.com/
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> ---
> kernel/cgroup/freezer.c | 2 ++
> 1 file changed, 2 insertions(+)
> diff --git a/kernel/cgroup/freezer.c b/kernel/cgroup/freezer.c
> index 6c18854bff34..7e779c8a6f89 100644
> --- a/kernel/cgroup/freezer.c
> +++ b/kernel/cgroup/freezer.c
> @@ -179,6 +179,7 @@ static void cgroup_do_freeze(struct cgroup *cgrp,
> bool freeze, u64 ts_nsec)
> lockdep_assert_held(&cgroup_mutex);
> spin_lock_irq(&css_set_lock);
> + preempt_disable_nested();
> write_seqcount_begin(&cgrp->freezer.freeze_seq);
> if (freeze) {
> set_bit(CGRP_FREEZE, &cgrp->flags);
> @@ -189,6 +190,7 @@ static void cgroup_do_freeze(struct cgroup *cgrp,
> bool freeze, u64 ts_nsec)
> cgrp->freezer.freeze_start_nsec);
> }
> write_seqcount_end(&cgrp->freezer.freeze_seq);
> + preempt_enable_nested();
> spin_unlock_irq(&css_set_lock);
> if (freeze)
Thank you for looking into this! I really appreciate it!
--
Tiffany Y. Yang
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2025-10-04 8:35 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-02 5:22 [PATCH] cgroup: Disable preemption for cgrp->freezer.freeze_seq when CONFIG_PREEMPT_RT=y Kuniyuki Iwashima
2025-10-02 5:49 ` Sebastian Andrzej Siewior
2025-10-02 8:28 ` Michal Koutný
2025-10-02 16:22 ` Kuniyuki Iwashima
2025-10-02 16:55 ` Sebastian Andrzej Siewior
2025-10-02 17:14 ` Kuniyuki Iwashima
2025-10-02 17:17 ` Michal Koutný
2025-10-02 18:11 ` Waiman Long
2025-10-02 16:45 ` Tejun Heo
2025-10-02 16:55 ` Sebastian Andrzej Siewior
2025-10-02 17:04 ` Tejun Heo
2025-10-02 20:51 ` Sebastian Andrzej Siewior
2025-10-03 14:42 ` Tejun Heo
2025-10-04 8:35 ` Tiffany Yang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox